Article

Amalgamation of Drainage Area Ratio and Nearest Neighbors Methods for Predicting Stream Flows in British Columbia, Canada

by Muhammad Uzair Qamar *, Courtney Turner and Cameron Stooshnoff
Ministry of Water, Land and Resource Stewardship, Surrey, BC V3R 1E1, Canada
* Author to whom correspondence should be addressed.
Water 2025, 17(10), 1502; https://doi.org/10.3390/w17101502
Submission received: 3 April 2025 / Revised: 12 May 2025 / Accepted: 14 May 2025 / Published: 16 May 2025

Abstract

British Columbia, Canada, is recognized for its abundant natural resources, including agricultural and aquaculture products, sustained by its diverse climate and geography. Water resource allocation in BC is governed by the Water Sustainability Act, enacted on 29 February 2016, replacing the historic Water Act. However, limited gauging of streams across the province poses challenges for ensuring water allocation while meeting Environmental Flow Needs. Overallocated watersheds and data-scarce watersheds in need of licensing highlight the need for robust streamflow prediction methods. To address these challenges, we developed a methodology that integrates the Drainage Area Ratio and Nearest Neighbors techniques to predict streamflows efficiently, without incurring additional financial costs. We utilized Digital Elevation Models and flow data from provincially and municipally managed hydrometric stations, as well as from the Water Survey of Canada, to normalize streamflows based on area, slope, and elevation. This approach ensures hydrological predictions that account for variability in hydrological processes resulting from differences in lumped-scale watershed characteristics. The method was validated using streamflow data from hydrometric stations maintained by the aforementioned entities. For validation, each station was iteratively treated as ungauged by temporarily removing it from the dataset and then predicting its streamflow using the proposed methodologies. The results demonstrated that the amalgamated Drainage Area Ratio–Nearest Neighbors approach outperformed the traditional Drainage Area Ratio method, offering reliable predictions for diverse watersheds. This study provides an adaptable and cost-effective framework for enhancing water resource management across BC.

1. Introduction

British Columbia (BC), Canada, is renowned for its natural resources and is a major producer of fruits, vegetables, wine, and seafood, thanks to its diverse climate and geography. The province regulates access to its water resources by issuing water licenses under the Water Sustainability Act (WSA), which came into effect on 29 February 2016 [1], replacing the previous Water Act. This groundbreaking legislation integrates economic and environmental considerations by requiring statutory decision-makers to assess the Environmental Flow Needs (EFNs) of streams during the adjudication of water license applications. However, with only a limited number of streams being gauged across the province, managing water allocation while accounting for EFNs has become challenging for provincial water teams responsible for technical analyses and licensing under the WSA. As a result, some watersheds, such as the Nicomekl and Serpentine, have been over-allocated, with more licenses issued than the watersheds can support [2]. At the same time, other watersheds, which still have available capacity for licensing, lack sufficient data to fully understand the EFN requirements. Therefore, the effective management of BC’s water resources under the WSA requires accurate streamflow predictions, especially in regions with limited or no hydrological data.
Every descriptive method in hydrology fundamentally relies on the transfer of hydrological information from a gauged watershed to an ungauged (ug) watershed, utilizing known relationships and patterns to infer streamflow characteristics in areas lacking direct measurements [3,4]. However, traditional predictive methods are often complex, require extensive data, or are suited primarily to large-scale watersheds such as those of rivers and lakes [5]. Water rights authorization, by contrast, needs methods that are easy to implement, require minimal data, and perform effectively in small watersheds. These attributes are crucial to ensure that the methodology can be efficiently integrated into decision-making processes. Such methods are particularly well-suited for small streams or creeks, where applicants are often small-scale farmers or businesses.
In ug watersheds, where direct streamflow measurements are unavailable, hydrologists often rely on surrogate methods to estimate flow regimes. Among these methods, empirical techniques like the Drainage Area Ratio (DAR) and data-driven methods such as Nearest Neighbor (NN) are widely used in hydrology due to their simplicity and effectiveness, particularly in data-scarce environments [6,7,8]. In particular, the DAR approach, also referred to as the watershed area ratio, has been extensively applied to estimate hydrological data for ug watersheds worldwide [9,10], including in Canada [11]. It is a well-established empirical approach that estimates streamflow in ug watersheds based on the ratio of drainage areas between gauged and ug watersheds. While this lumped predictive method is easy to implement and data-efficient, it assumes that streamflow is directly proportional to the drainage area, without considering other hydrological factors that may influence flow.
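As a minimal sketch of the proportionality assumption behind the conventional DAR method, the following Python snippet scales a donor station's flow by the ratio of drainage areas. The function name and the numbers are illustrative assumptions and are not taken from the study.

```python
def dar_estimate(q_gauged, area_gauged, area_ungauged):
    """Conventional Drainage Area Ratio estimate.

    Assumes flow is directly proportional to drainage area, so the
    ungauged flow is the gauged flow scaled by the ratio of areas.
    Both areas must share one unit; the result has the unit of q_gauged.
    """
    if area_gauged <= 0 or area_ungauged <= 0:
        raise ValueError("drainage areas must be positive")
    return q_gauged * (area_ungauged / area_gauged)


# Hypothetical numbers: a 50 km^2 gauged basin with a mean monthly flow
# of 2.0 m^3/s, transferred to a 20 km^2 ungauged basin.
q_ungauged = dar_estimate(2.0, 50.0, 20.0)
print(round(q_ungauged, 3))
```

Because area is the only scaling term, two basins of equal area receive identical estimates regardless of slope or elevation, which is the limitation the following paragraphs address.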
While the DAR method offers a straightforward approach to streamflow estimation, particularly for water licensing applications, its lumped framework and the underlying assumption that streamflow scales solely with watershed area can introduce substantial uncertainties, potentially compromising the robustness of the resulting predictions. To ensure the proposed methodology works effectively for British Columbia, adjustments must be made to DAR, as streamflow in BC fluctuates across a range of time scales. On shorter time scales, ranging from less than a day to a few days, it is primarily influenced by weather events, including rainfall, snowmelt, ice melt, and evapotranspiration [12]. On longer time scales, such as over several years or decades, large-scale climatic phenomena like El Niño become significant factors [13]. Despite these variations, streamflow often follows predictable seasonal patterns, which are generally linked to the main sources of water flow: rainfall, snowmelt, and glacier melt [14].
The influence of other watershed characteristics, such as slope and elevation, becomes evident in how these seasonal patterns manifest, as these factors directly affect the timing, volume, and distribution of runoff throughout the year [4,15]. These key characteristics influence streamflow generation and, therefore, can serve as valuable indicators for predicting streamflow, particularly in ug watersheds [4,8]. The slope of a watershed affects the velocity of surface runoff, with steeper slopes leading to more concentrated runoff during precipitation events or snowmelt, resulting in higher peak flows [16]. Elevation also plays a significant role, as higher elevations are more likely to experience snowfall, which contributes to streamflow during warmer temperatures when snowmelt occurs [17]. This delayed contribution from snowmelt is particularly important in regions with substantial snowpacks. Additionally, elevation is often correlated with temperature and atmospheric conditions, influencing precipitation type and timing, which, in turn, affects runoff patterns. Collectively, watershed slope and elevation govern hydrological processes, offering essential insights into streamflow dynamics and overall watershed behavior. When combined with empirical methods such as DAR, these physiographic factors can enhance streamflow predictions, even in regions with limited hydrological data.
Integrating slope and elevation into the DAR method enhances its ability to normalize flow estimates beyond the conventional use of watershed area alone. Instead of relying solely on watershed area as in the traditional DAR approach, these supplementary watershed characteristics offer a more comprehensive framework for flow estimation in the ug watershed. The selection of normalization parameters (watershed area alone, or in conjunction with elevation or slope) should be determined based on the combination that yields the most statistically robust results in neighboring watersheds. While the integration of NNs with DAR is not novel, having been previously explored [7], prior methodologies have used area as the sole scaling parameter. The novelty of the proposed approach lies in introducing two additional normalization parameters alongside area, with the optimal parameter selected according to which yields the most statistically robust results when the nearest-neighbor predictions are tested on watersheds with known streamflow data that are treated as ug.
The proposed methodology is grounded in the principles of traditional hydrological predictive models, where hydrological data from gauged watersheds are transferred to ug watersheds under specific assumptions, in accordance with established hydrological laws [3]. To ensure that the hydrological characteristics of an ug watershed (namely, the timing, volume, and distribution of runoff throughout the year) correspond with those of gauged watersheds, we selected its NNs with similar flow-generating parameters, particularly watershed area, slope, and elevation. In contrast to DAR, the NN method utilizes statistical learning to identify similarities between basins based on physical and climatic attributes, thereby providing more refined predictions by leveraging a broader set of basin characteristics. While both methods (DAR and NN) have distinct strengths and limitations, their combination (represented by NNs + DAR hereafter) offers a promising approach to improving streamflow predictions in ug watersheds. The DAR method provides a quick, data-efficient estimate based on the basic physical characteristics of the drainage area, while the NN method can complement this by incorporating additional environmental variables to refine predictions. By integrating these methods, we aim to balance simplicity with accuracy, offering a more robust tool for hydrological forecasting in regions with limited data availability.
Building on the aforementioned observations, we hypothesize that normalizing watersheds located in the same hydrological zones based on their area alone (for homogeneous watersheds), or combined with mean elevation (for snow-dominant watersheds) or slope (for rainfall-dominant watersheds), should, with reasonable efficiency, predict hydrological data for ug watersheds. To test this hypothesis, the study enhances the traditional DAR method by incorporating an NN approach for donor site selection, thereby accounting for hydrological and physiographic similarity. Accordingly, the methodological comparison is intentionally limited to the conventional DAR approach in order to clearly demonstrate the specific improvements introduced by the proposed modification.

2. Study Area and Data

The proposed approach was demonstrated in the Central and South regions of BC (Figure 1). Both climatic and geomorphological factors significantly influence flow generation in this region. The study area includes snow-fed and rain-fed watersheds, ensuring that the proposed method could be effectively validated for various flow generation mechanisms. Watershed descriptors from 15 gauge stations were retrieved using Digital Elevation Models (DEMs) sourced from the Base Map Online Store, a platform that allows users to search, display, and order orthophoto imagery, raster, and vector data within the QGIS framework.
Figure 1. Hydrometric Stations in British Columbia: (Top Right) regional context with the study region highlighted; (Top Left) zoomed-in view of hydrometric station locations; (Bottom) detailed elevation map showing terrain variations and hydrometric station distribution. This figure should be interpreted in conjunction with Table 1.
The hydrometric stations selected for the proposed analysis include the following:
1. Real-time monitoring stations managed by the provincial Surface Water team: McLennan Creek (08MH0058), Howes Creek (08MH0055), Little Campbell River (08MH0041), Elk Creek (08MF0004), and Clayburn Creek (08MH0051). The flow data for these hydrometric stations are available at a high temporal resolution, with measurements recorded at 15-minute intervals. Initially, these high-frequency data were averaged to derive daily and, subsequently, monthly streamflow values.
While the high-resolution modeling of streamflow dynamics, such as daily assessments, is essential for applications like flood forecasting, this study focuses on monthly streamflow data for the following reasons: (a) Water Resource Assessment and Management: Evaluating both the quantity and quality of water availability within a basin, and subsequently managing it for efficient supply and maintaining EFNs, typically involves analyses over extended periods, such as months or years. This approach aligns with the need for data aggregation to capture seasonal and annual variations in streamflow; (b) Correlation with Long-Term Climate Patterns: Monthly and annual streamflow series exhibit stronger correlations with long-term climatic factors compared to daily series. This characteristic facilitates the establishment of more robust connections between climate variables and streamflow dynamics, enhancing the understanding of hydrological responses to climatic influences.
These considerations underscore the appropriateness of utilizing monthly streamflow data in this research, balancing the need for temporal resolution with the practicalities of data availability and the study’s objectives. Moreover, analyzing monthly-averaged streamflows is advantageous due to their stable nature, which enhances the reliability of predictive methodologies. This approach facilitates a more robust assessment of model efficiency, as monthly data tend to exhibit reduced variability and are less susceptible to fluctuations [18].
2. The flow data for various hydrometric stations were obtained from the Water Survey of Canada’s Real-Time Data Index (https://wateroffice.ec.gc.ca/mainmenu/real_time_data_index_e.html; accessed on 21 January 2025), a vital resource for real-time hydrometric data from rivers, lakes, and streams across Canada. The daily flow data, spanning from 1956 to 2025, exhibit record lengths ranging from 5 to 74 years across various stations. The selected stations include:
  • McLennan Creek near Mount Lehman (08MH082)
  • West Creek near Fort Langley (08MH098)
  • Bertrand Creek at International Boundary (08MH152)
  • Pepin Creek at International Boundary (08MH156)
  • Fishtrap Creek at International Boundary (08MH153)
  • Chilliwack River below Slesse Creek (08MH055)
  • Nathan Creek near Glen Valley (08MH084)
  • Liumchen Creek near the Mouth (08MH57)
  • Anderson Creek at the Mouth (08MH104)
  • Nicomekl River at 203rd Street in Langley (08MH155)
This dataset provides valuable historical and recent streamflow records essential for hydrological analysis and research. Managed by Environment and Climate Change Canada, this platform provides comprehensive information on water levels and flows from an extensive network of monitoring stations. It supports applications in water resource management, flood forecasting, and environmental protection. Users can search for data by station, region, or watershed and explore historical datasets for trend analysis and research.
3. Hydrometric stations managed by the municipal government: Atchelitz Creek (PD200737), Gibsons Creek (HYD-GIBS-R1), and Wilfred Creek (PD189470).
The flow data for these hydrometric stations were obtained directly from the municipal government in digital format. The data from these stations are collected using two different methods. In some cases, such as PD200737 and HYD-GIBS-R1, continuous measurements are recorded using installed hydrometric instruments. In other cases, such as PD189470, data are obtained through spot measurements, where flow observations are manually recorded on specific dates rather than continuously monitored.
4. Observation posts, such as Gamelin Creek (PD43584), where local knowledge was leveraged to determine the periods of maximum and minimum streamflow throughout the year. Instead of continuous monitoring, these observations rely on insights from local residents or field assessments to identify seasonal flow patterns.
The subsequent table delineates the temporal coverage and mean monthly discharge metrics for each hydrometric monitoring station under consideration. The temporal coverage specifies the duration over which continuous hydrological data were systematically collected at each station, while the mean monthly discharge represents the average volumetric flow rate computed across the entire observational period. These metrics are instrumental in assessing hydrological characteristics and variability within the monitored watersheds.
The table presents hydrometric data from various stations, including both continuously monitored sites and observation posts. Record lengths vary significantly, with long-term data available for stations like 08MH098 (1960–2023) and recent records for HYD-GIBS-R1 (2023–2025). Watershed areas range widely, from 1.69 million m² (08MH0058) to 426 million m² (08MH055), reflecting different catchment sizes. Mean monthly flow varies substantially, with the highest recorded at 47.7 m³/s (08MH055) and much lower values at smaller stations, such as HYD-GIBS-R1 (0.081 m³/s). Elevation and slope data indicate a diverse range of terrains, from low-lying areas (PD200737: 20.727 m elevation, 0.842 slope) to steep, high-altitude regions (PD189470: 1167.926 m elevation, 27.944 slope). Notably, PD189470 is an observation post, meaning it relies on local knowledge and periodic measurements rather than continuous hydrometric monitoring.

3. Methodology

The rationale for adopting this methodology (NNs + DAR) to predict streamflows in BC is based on the approach proposed by Ahmed [19]. The study highlights that the most effective means of estimating streamflow characteristics at ug sites is through regional procedures that utilize hydrologic zones, which are areas with homogeneous runoff characteristics where available data can be extrapolated with reasonable accuracy. These zones are typically delineated using physiographic features or statistical analysis of hydrologic data to ensure reliable streamflow estimations. Given the highly heterogeneous nature of British Columbia’s hydrology and the limited availability of gauged data, the study employed a physical mapping procedure to define hydrologic zones. This approach follows the methodology outlined in the BCSI report [20], which is publicly accessible at: https://catalogue.data.gov.bc.ca/dataset/329fd234-8835-4d44-9aaa-97c37bfc8d92; accessed on 14 January 2025.
The delineated hydrological zones used in this study are presented in Figure 2.
However, the same study also confirmed that there are cases where an NN approach to selecting stations for prediction in ug watersheds may be more appropriate.
The methodology was applied to multiple locations across BC where the published literature showed that mean elevation and runoff magnitudes are directly related, i.e., runoff magnitude increases with an increase in basin elevation [19]. The slope of a watershed, in turn, plays a pivotal role in determining variations in surface hydrology, particularly in pluvial watersheds, where it affects the time between precipitation and maximum discharge within the watershed [21]. In the context of BC, Sharma and Dery [22] also identified a significant positive correlation between the slope of watersheds and Atmospheric River-related Annual Maxima runoff percentage across the province. Therefore, three basin characteristics (area, mean elevation, and slope) were selected based on (1) the published literature, which showed how they impact runoff magnitudes; (2) ease of availability of these parameters; and (3) ease of implementation and reimplementation of the methodology. It is also important to note that while slope is widely recognized as a key factor influencing streamflow generation within a watershed, where steeper gradients promote faster runoff and higher flow velocities, the role of elevation is more complex and may not always exhibit a direct correlation with streamflow [23]. This distinction becomes particularly relevant in lumped hydrological modeling, where streamflow predictions are derived from limited input data. In such modeling techniques, assuming a fixed relationship, whether direct or inverse, between elevation and streamflow without empirical validation can introduce significant uncertainties, leading to errors in the prediction methodology. Therefore, a more nuanced approach that accounts for the variability in elevation effects is crucial for enhancing the accuracy and reliability of hydrological models.
Beyond the scientific foundation of the methodology, its implementation necessitates hydrological data from multiple locations across the target watersheds where streamflow estimations are required. The province of BC has several abandoned and real-time hydrometric stations across its area. Since real-time hydrometric stations are only located on large water bodies (rivers or large creeks) and authorization is not limited to the larger water bodies, we used abandoned hydrometric station data in our analysis as well.
The methodology for predicting monthly flows in a ug watershed is outlined in the following steps, followed by the mathematical algorithms used to generate hydrological data for the ug watershed.
1. Clustering Watersheds by Hydrological Zone
Watersheds within the same hydrological zone were clustered together. To identify the most relevant watersheds for analysis, we calculated their Euclidean distances from the ug watershed. This clustering approach ensured that the selected watersheds were hydrologically similar and spatially relevant.
2. Watershed Delineation
Using DEMs, we delineated the watershed boundaries for each gauged and ug watershed. The delineation process was carried out to accurately define the contributing areas and ensure consistency in the dataset. The watershed delineation at the outlet for Gamelin Creek (represented by PD43584 in Figure 3) is provided as an example from our analysis.
3. Area, Elevation, and Slope Estimation
The QGIS SAGA plugin was employed to calculate the watershed area, mean elevation, and mean slope for the NNs of the ug watershed. This provided a quantitative basis for comparing the physical characteristics of the ug watershed with its neighboring watersheds.
While watershed area and elevation are the primary factors used to identify NNs and drive the streamflow prediction process, slope has also been incorporated due to its significant role in runoff generation, particularly in low-lying areas. As noted in the Introduction section, in regions where rainfall is the dominant driver of streamflow and elevation has minimal influence, slope becomes a crucial factor in determining streamflow magnitudes. This relationship has been well-documented in BC by Sharma and Dery [22], reinforcing the importance of integrating slope alongside area and elevation to enhance the accuracy of hydrological modeling.
4. Sorting by Elevation and Slope Difference
The NNs were further sorted based on the absolute difference in mean elevation and slope between each NN and the ug watershed. This secondary sorting step ensured that the watersheds with the most similar topographical characteristics were prioritized for subsequent analysis.
5. Flow Data Extraction
Flow magnitude data for the selected NNs were obtained from multiple sources to ensure a comprehensive hydrological assessment. These sources included: (1) the Water Survey of Canada’s database (https://wateroffice.ec.gc.ca/; accessed on 14 January 2025), (2) Surface Water-managed hydrometric stations, (3) municipal government records, and (4) public contributions. The dataset comprised mean monthly flow values, spot measurements, and observations of maximum and minimum flow occurrences, all of which were essential for characterizing the hydrological patterns of the selected NNs and improving the reliability of streamflow predictions.
6. Selection of NNs for Data Coverage
To ensure a comprehensive and accurate representation of flow data, the NNs were chosen in a specific sequence. The process began by selecting the NN with the smallest elevation or slope difference compared to the ug watershed. This was carried out to ensure that the first NN was hydrologically similar, as elevation or slope can significantly influence streamflow. Once the first NN was selected, additional NNs were added one by one in iterative steps. The goal was to ensure that for each month of the year, there were at least two mean monthly flow values available from different NNs. This selection strategy was informed by observations during the cross-validation phase, where it became evident that the most accurate predictions occurred when each month was represented by flow data from at least two NNs. Using fewer than two NNs often failed to capture the variability in flow patterns, resulting in oversimplified outputs. Conversely, incorporating more than two NNs tended to introduce unnecessary complexity and led to overly smoothed predictions. These findings are consistent with prior studies by Samaniego et al. [24] and Qamar et al. [8], which also emphasized the importance of balancing representativeness and model parsimony in NN selection.
7. Prediction of Flow Data for ug Watersheds
Once the NNs were selected, each NN in the cluster was alternately treated as ug. Using the remaining NNs with known hydrological data, the flow at the ug watershed was predicted with the proposed approach, broadly summed up in the following steps:
  • The flow data (obtained from the multiple sources mentioned above) for each watershed with known flow were normalized by dividing the monthly flow values by the watershed area, producing a unit-area discharge value (m³/s per m²).
  • The normalized flow data were then scaled up to match the watershed area of the ug watershed, providing an initial prediction of the flow.
  • To further refine the prediction, the scaled-up flow data were normalized based on the elevation and slope of each watershed, accounting for topographical influences.
  • The adjusted flow values for each NN were averaged, yielding a representative predictive monthly flow dataset (expressed as two column vectors) for the ug watershed. Watershed area, elevation, and slope were used as mathematical operators in this process.
8. Validation Using Mean Absolute Error ($\delta$)
The predicted flow data were compared with observed flow data at the gauged stations to calculate $\delta$. This process was repeated iteratively, with each basin being removed once to simulate an ug scenario. The error metric, representing $\delta$ for each basin $k$, was calculated as follows:
$$\delta_k = \frac{1}{n} \sum_{i=1}^{n} \left| P_i^k - O_i^k \right| \tag{1}$$
where $P_i^k$ and $O_i^k$ are the predicted and observed flow values, respectively, and $n$ is the number of observations.
By systematically iterating through all basins, the methodology ensured robustness and consistency in the predictive flow estimates for ug watersheds.
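The donor-selection logic in steps 1, 4, and 6 above can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the authors' implementation: the dictionary fields, the `k_nearest` cut-off, and the use of centroid coordinates for the Euclidean distance are all hypothetical choices made for the example.

```python
import math


def select_neighbors(target, candidates, key="elevation",
                     k_nearest=5, min_per_month=2):
    """Pick donor watersheds (NNs) for an ungauged target.

    target: dict with "zone", "x", "y" and the physiographic `key`.
    candidates: dicts with the same fields plus "name" and "flows", a
        {month (1-12): mean flow} mapping that may cover only some months.
    """
    # Step 1: keep watersheds in the same hydrological zone and rank
    # them by Euclidean distance (assumed here to be between centroids).
    same_zone = [c for c in candidates if c["zone"] == target["zone"]]

    def distance(c):
        return math.hypot(c["x"] - target["x"], c["y"] - target["y"])

    nearest = sorted(same_zone, key=distance)[:k_nearest]

    # Step 4: sort by absolute difference in mean elevation (or slope).
    ordered = sorted(nearest, key=lambda c: abs(c[key] - target[key]))

    # Step 6: add NNs one by one until every month has enough values.
    selected = []
    coverage = {month: 0 for month in range(1, 13)}
    for cand in ordered:
        if min(coverage.values()) >= min_per_month:
            break
        selected.append(cand["name"])
        for month in cand["flows"]:
            coverage[month] += 1
    return selected, coverage


# Hypothetical candidates; "B" has only six months of record.
target = {"zone": 1, "x": 0.0, "y": 0.0, "elevation": 100.0}
full_year = {m: 1.0 for m in range(1, 13)}
half_year = {m: 1.0 for m in range(1, 7)}
candidates = [
    {"name": "A", "zone": 1, "x": 1.0, "y": 0.0, "elevation": 120.0, "flows": full_year},
    {"name": "B", "zone": 1, "x": 2.0, "y": 0.0, "elevation": 105.0, "flows": half_year},
    {"name": "C", "zone": 1, "x": 3.0, "y": 0.0, "elevation": 300.0, "flows": full_year},
    {"name": "D", "zone": 2, "x": 0.5, "y": 0.0, "elevation": 100.0, "flows": full_year},
]
selected, coverage = select_neighbors(target, candidates)
print(selected, min(coverage.values()))
```

In this example the partial record at "B" forces a third donor to be added so that every month is covered by at least two values, while "D" is excluded because it lies in a different zone.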
Mathematically, the modeling process can be defined as the following steps.
Definitions:
$Q_u(m)$ = Predicted flow for the original ug watershed $u$ for month $m$.
$Q_i(m)$ = Measured flow for the gauged watershed $i$ for month $m$.
$n$ = Total number of gauged NNs used for prediction.
$A_u$, $A_i$ = Watershed area of the ug and gauged watershed $i$, respectively.
$E_u$, $E_i$ = Mean elevation of the ug and gauged watershed $i$, respectively.
$S_u$, $S_i$ = Mean slope of the ug and gauged watershed $i$, respectively.
$w_i$ = Weight assigned to each gauged NN (optional, default is equal weighting).
  • Step 1: Normalizing Flows by Watershed Area
The monthly flow $Q_i(m)$ of each gauged station $i$ for the current month $m$ is normalized by its watershed area:
$$Q_{i,A}(m) = \frac{Q_i(m)}{A_i} \tag{2}$$
  • Step 2: Normalizing by Elevation and Slope
For each gauged station $i$, the area-normalized flow $Q_{i,A}(m)$ is further normalized using elevation and slope:
Elevation Normalization
$$Q_{i,E}(m) = \frac{Q_{i,A}(m)}{E_i} = \frac{Q_i(m)}{A_i \cdot E_i} \tag{3}$$
Slope Normalization
$$Q_{i,S}(m) = \frac{Q_{i,A}(m)}{S_i} = \frac{Q_i(m)}{A_i \cdot S_i} \tag{4}$$
  • Step 3: Treating Each Gauged Station as ug
Each NN $i$ is treated as if it were ungauged ($ug^{[i]}$). Its flow is predicted using the remaining NNs by scaling the normalized flows back (or up) to the characteristics of $ug^{[i]}$.
  • Predicting Flow for Each NN using Area:
$$Q_{u,A}^{[i]}(m) = A_u^{[i]} \cdot \frac{1}{n-1} \sum_{j \neq i} Q_{j,A}(m) \tag{5}$$
where $j$ indexes the NNs that remain after $i$ is assumed to be ug and removed from the dataset.
  • Predicting Flow for Each NN using Elevation:
$$Q_{u,E}^{[i]}(m) = A_u^{[i]} \cdot E_u^{[i]} \cdot \frac{1}{n-1} \sum_{j \neq i} Q_{j,E}(m) \quad \text{and}$$
$$Q_{u,E}^{[i]}(m) = \frac{A_u^{[i]}}{E_u^{[i]}} \cdot \frac{1}{n-1} \sum_{j \neq i} Q_{j,E}(m) \tag{6}$$
In the second form, elevation appears in the denominator, representing the case in which streamflow is inversely related to elevation.
  • Predicting Flow for Each NN using Slope:
$$Q_{u,S}^{[i]}(m) = A_u^{[i]} \cdot S_u^{[i]} \cdot \frac{1}{n-1} \sum_{j \neq i} Q_{j,S}(m) \tag{7}$$
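Steps 1 to 3 can be illustrated with a short Python sketch. It is a simplified reading of the equations above under stated assumptions: the data layout and function name are hypothetical, every station is assumed to have all 12 monthly means, and only the area-only and direct-relationship variants are covered (the inverse-elevation form is omitted).

```python
def leave_one_out(stations, param=None):
    """Predict each station's monthly flows from the other stations.

    stations: {name: {"area": A, "elevation": E, "slope": S,
                      "flows": [12 monthly means]}}
    param: None for area-only scaling, or "elevation" / "slope" for the
        direct-relationship variants.
    """
    if len(stations) < 2:
        raise ValueError("need at least two stations")

    def factor(st):
        # Area alone, or area multiplied by the extra parameter.
        return st["area"] * (st[param] if param else 1.0)

    # Steps 1-2: normalized monthly flows for every station.
    normalized = {
        name: [q / factor(st) for q in st["flows"]]
        for name, st in stations.items()
    }

    predictions = {}
    for name, st in stations.items():
        others = [normalized[other] for other in stations if other != name]
        # Step 3: mean of the remaining n-1 normalized flows, scaled
        # back to the station that is treated as ungauged.
        predictions[name] = [
            factor(st) * sum(month_vals) / len(others)
            for month_vals in zip(*others)
        ]
    return predictions


# Hypothetical stations whose flows are exactly proportional to area,
# so the area-only prediction should reproduce the observations.
stations = {
    "A": {"area": 10.0, "elevation": 100.0, "slope": 2.0, "flows": [1.0] * 12},
    "B": {"area": 20.0, "elevation": 300.0, "slope": 5.0, "flows": [2.0] * 12},
    "C": {"area": 40.0, "elevation": 900.0, "slope": 9.0, "flows": [4.0] * 12},
}
area_predictions = leave_one_out(stations)
slope_predictions = leave_one_out(stations, param="slope")
```

With these made-up values the area-only predictions match the observed flows, whereas the slope-normalized predictions do not, which is the kind of contrast the error comparison in Step 5 is designed to detect.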
  • Step 4: Averaging Monthly Predictions
To obtain the final predicted flow for each NN treated as $ug^{[i]}$, we take the average of the predicted flows ($\bar{Q}$) over all the months:
  • Average Monthly Flow using Area:
$$\bar{Q}_{u,A}^{[i]} = \frac{1}{12} \sum_{m=1}^{12} Q_{u,A}^{[i]}(m) \tag{8}$$
  • Average Monthly Flow using Elevation:
$$\bar{Q}_{u,E}^{[i]} = \frac{1}{12} \sum_{m=1}^{12} Q_{u,E}^{[i]}(m) \tag{9}$$
  • Average Monthly Flow using Slope:
$$\bar{Q}_{u,S}^{[i]} = \frac{1}{12} \sum_{m=1}^{12} Q_{u,S}^{[i]}(m) \tag{10}$$
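  • Step 5: Calculating $\delta$
To determine which normalization (area, elevation, or slope) is more effective, compare the predicted flows with the actual flows using $\delta$:
  • $\delta$ for Area:
$$\delta_A = \frac{1}{n} \sum_{i=1}^{n} \left| \bar{Q}_i - \bar{Q}_{u,A}^{[i]} \right| \tag{11}$$
  • $\delta$ for Elevation:
$$\delta_E = \frac{1}{n} \sum_{i=1}^{n} \left| \bar{Q}_i - \bar{Q}_{u,E}^{[i]} \right| \tag{12}$$
  • $\delta$ for Slope:
$$\delta_S = \frac{1}{n} \sum_{i=1}^{n} \left| \bar{Q}_i - \bar{Q}_{u,S}^{[i]} \right| \tag{13}$$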
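Steps 4 and 5 amount to averaging each station's monthly values and then comparing mean absolute errors across the candidate normalizations. The Python sketch below shows one way to do this; the function names, the dictionary layout, and the example numbers are assumptions made for illustration only.

```python
def mean_absolute_error(observed, predicted):
    """MAE between observed and predicted annual-mean flows.

    observed, predicted: {station: [monthly flows]} with the same keys.
    Each station's monthly values are averaged first (Step 4); the
    absolute differences of those averages are then averaged over the
    stations (Step 5).
    """
    errors = []
    for name, obs in observed.items():
        obs_mean = sum(obs) / len(obs)
        pred_mean = sum(predicted[name]) / len(predicted[name])
        errors.append(abs(obs_mean - pred_mean))
    return sum(errors) / len(errors)


def best_normalization(observed, predictions_by_param):
    """Return the label with the lowest MAE, plus all MAE values."""
    scores = {
        label: mean_absolute_error(observed, preds)
        for label, preds in predictions_by_param.items()
    }
    return min(scores, key=scores.get), scores


# Hypothetical observed flows and leave-one-out predictions.
observed = {"A": [1.0] * 12, "B": [2.0] * 12}
predictions_by_param = {
    "area": {"A": [1.5] * 12, "B": [2.5] * 12},
    "slope": {"A": [1.0] * 12, "B": [2.25] * 12},
}
best, scores = best_normalization(observed, predictions_by_param)
print(best, scores)
```

Note that because the error is computed on annual means, a prediction can score well even if its month-to-month pattern is off; a monthly MAE would be a stricter alternative, but it is not what the equations above describe.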
  • Step 6: Predicting Flow for the Original ug
Whichever normalization (area, elevation, or slope) produces the lowest δ is selected for predicting the flows for the original ug:
$$Q_u(m) = \begin{cases}
A_u \cdot \frac{1}{n} \sum_{i=1}^{n} Q_{i,A}(m), & \text{if } \delta_A < \delta_E^{\uparrow}, \delta_E^{\downarrow}, \delta_S \\
A_u \cdot E_u \cdot \frac{1}{n} \sum_{i=1}^{n} Q_{i,E}(m), & \text{if } \delta_E^{\uparrow} < \delta_A, \delta_E^{\downarrow}, \delta_S \\
\frac{A_u}{E_u} \cdot \frac{1}{n} \sum_{i=1}^{n} Q_{i,E}(m), & \text{if } \delta_E^{\downarrow} < \delta_A, \delta_E^{\uparrow}, \delta_S \\
A_u \cdot S_u \cdot \frac{1}{n} \sum_{i=1}^{n} Q_{i,S}(m), & \text{if } \delta_S < \delta_A, \delta_E^{\uparrow}, \delta_E^{\downarrow}
\end{cases}$$
Note that the downward arrows (↓) indicate stations where the inverse relationship between elevation and discharge was considered, as it resulted in a lower δ compared to the direct relationship (↑).
Final Predicted Flow for Each Month:
The final equation used to predict the monthly flow $Q_u(m)$ for the original ug is as follows:
$$Q_u(m) = P_u \cdot A_u \cdot \frac{1}{n} \sum_{i=1}^{n} \frac{Q_i(m)}{A_i \cdot P_i}$$
where P is the chosen parameter (either area, elevation, or slope) based on which one resulted in the lowest δ. Please note that if the selected $P_u$ is elevation and is inversely related to discharge, then Equation (15) will be displayed as follows:
$$Q_u(m) = \frac{A_u}{P_u} \cdot \frac{1}{n} \sum_{i=1}^{n} \frac{Q_i(m)}{A_i} \cdot P_i$$
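The final prediction can be sketched as below. This is an illustrative reading of the two equations above, not the authors' code. The name `predict_ungauged` and its arguments are hypothetical, and the inverse branch assumes that each neighbor's flow is normalized as Q · P / A before being rescaled by A_u / P_u.

```python
def predict_ungauged(nn_flows, nn_area, nn_param, area_u, param_u, inverse=False):
    """Predict monthly flows at ug from the normalized flows of its NNs.

    nn_flows: dict, station id -> list of monthly mean flows
    nn_area:  dict, station id -> drainage area
    nn_param: dict, station id -> selected parameter (elevation or slope)
    area_u, param_u: area and selected parameter of the ungauged watershed
    inverse:  True when the parameter is inversely related to discharge
    """
    stations = list(nn_flows)
    n = len(stations)
    n_months = len(nn_flows[stations[0]])
    predicted = []
    for m in range(n_months):
        if inverse:
            # Assumed inverse normalization: Q * P / A, rescaled by A_u / P_u.
            total = sum(nn_flows[s][m] * nn_param[s] / nn_area[s] for s in stations)
            scale = area_u / param_u
        else:
            # Direct normalization: Q / (A * P), rescaled by A_u * P_u.
            total = sum(nn_flows[s][m] / (nn_area[s] * nn_param[s]) for s in stations)
            scale = area_u * param_u
        predicted.append(scale * total / n)
    return predicted
```

Setting every parameter value to 1 reduces the direct branch to area-only scaling of the averaged normalized flows.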
For improved clarity and understanding, the methodology is presented in Figure 4.
It is important to note that the hydrological zones delineated in Figure 2, based on the study conducted by Ahmed [19], represent regions where watersheds exhibit strong correlations in flow behavior. Consequently, NNs are selected exclusively from within these zones, ensuring consistency in flow-generating mechanisms between the ug watershed and its neighbors.

4. Results and Discussion

To assess the effectiveness of the proposed methodology, we compiled observed streamflow data from various hydrometric stations managed by provincial hydrometric specialists, municipal governments, and observation stations. In cases where direct discharge measurements were not feasible, local residents recorded the months of maximum and minimum flow. The observed data were then compared with the predicted values. As detailed in previous sections, some hydrometric stations recorded data at high temporal resolutions, including 15-minute intervals or daily measurements. To standardize the dataset for analysis, these high-resolution observations were aggregated to a monthly time scale by computing the mean discharge for each month using R (version 4.3.0).
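The aggregation to a monthly time scale can be sketched as follows. The study performed this step in R; the Python below is only an illustrative stand-in. The function name `monthly_mean_discharge` and the `(timestamp, discharge)` record layout are hypothetical, and pooling all years by calendar month is an assumption about how the monthly means were formed.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def monthly_mean_discharge(records):
    """Average high-resolution discharge records by calendar month.

    records: iterable of (datetime, discharge) pairs, e.g. 15-minute or daily
    returns: dict, month number (1-12) -> mean discharge for that month
    """
    by_month = defaultdict(list)
    for timestamp, discharge in records:
        by_month[timestamp.month].append(discharge)
    return {month: mean(values) for month, values in sorted(by_month.items())}


# Usage with made-up readings (illustrative only, not study data).
sample = [
    (datetime(2020, 1, 1, 0, 0), 1.0),
    (datetime(2020, 1, 1, 0, 15), 3.0),
    (datetime(2021, 2, 5), 5.0),
]
monthly = monthly_mean_discharge(sample)
```

Months with no observations are simply absent from the result, which makes gaps in the record easy to detect.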
The watersheds analyzed in this study are primarily influenced by rainfall in the southern regions of British Columbia and by snowmelt in the northern regions. To evaluate the performance of the proposed methodology, we compared its results with those of the conventional DAR method, whose drainage area term serves as a key normalization parameter alongside elevation and slope. In regions lacking installed hydrometric gauges, supplementary information was obtained from residents whose families have lived in the area for multiple generations. These individuals, who either utilize water from nearby channels or possess extensive knowledge of regional flow patterns, provided qualitative insights into seasonal discharge variability.
As outlined in the Methodology section, while the relationship between watershed slope and discharge is direct, the connection between elevation and discharge is less certain. To assess it, we calculated δ for the NNs of ug under two assumptions: one where elevation is directly related to discharge, and another where it is inversely related. This process was repeated for all NNs, with the predicted monthly streamflows compared to the observed streamflows for each NN. The relationship that resulted in the lowest δ was selected for predicting the hydrological data at ug. Table 2 displays the δ values obtained by utilizing area, elevation, and slope as normalization parameters for streamflow prediction in the NNs of ug.
Notably, stations such as 08MH098, 08MH152, 08MH156, 08MH153, and 08MH055 demonstrated improved prediction accuracy when the inverse relationship between elevation and discharge was applied. In contrast, for other stations, the direct relationship between elevation and discharge, or the use of other normalization parameters, yielded the most accurate results. These variations emphasize the significance of considering alternative relationships for elevation as a normalization parameter, underscoring the need for a flexible approach when optimizing streamflow predictions in the current modeling framework.
The relationship between elevation and streamflow varies across the study area. While most of the stations in the study area exhibit a direct relationship with mean elevation, some stations show an inverse relationship between elevation and flow. This pattern is particularly evident in larger watersheds, such as 08MH055. This inverse trend in higher-elevation watersheds is likely influenced by a combination of delayed snowmelt contributions, high infiltration rates—suggested by Google Earth imagery indicating dense vegetation—longer water travel times (time of concentration), and precipitation distribution patterns. Identifying the dominant factor requires further analysis of precipitation data, land cover, soil permeability, and groundwater interactions.
Similarly, the inverse relationship between elevation and flow in lower-elevation watersheds along the international border between the USA and Canada (08MH152, 08MH153, and 08MH156) can be attributed to increasing groundwater contributions as the point of interest within the watershed shifts downstream. Groundwater contributions were particularly evident during our field visits, and an image from one of these visits is provided below in Figure 5:
Notably, 08MH098 is located on West Creek, near its drainage point into the Fraser River, making it more susceptible to groundwater influence.
We observed that for ug watersheds, better predictive results were achieved when the NNs had similar watershed characteristics to those of the ug. The normalization procedure tended to produce less accurate results when it was applied to watersheds with significant differences in characteristics. In contrast, when the ug and its NNs had similar characteristics, less normalization was required, leading to more accurate predictions. For instance, watershed PD43584 had a δ of 38.381, significantly higher than that of other watersheds in the study, such as 08MH082. This can be explained by the substantial variation in size among PD43584’s NNs, which leads to significant differences in discharge magnitudes. In particular, one of its NNs, 08MG005, exhibits a much higher discharge than the others, which skews the normalization and results in a disproportionately high δ value. As a result, predictive performance at 08MH082 is expected to be more reliable than at PD43584, given its more consistent watershed characteristics.
Table 3 presents δ and Nash–Sutcliffe efficiency (η) values for various watersheds, calculated using the hydroGOF package in R (version 4.3.0) [25], highlighting how these metrics vary based on differences in watershed characteristics between ug and its NNs. The table also identifies the normalization parameter selected for each station and demonstrates how δ and η change depending on the degree of variation, both minimum and maximum, in watershed characteristics between ug and its NNs.
The minimum, average, and maximum variations in watershed characteristics between the target watershed (ug) and its NNs, along with the corresponding performance parameters, η and δ in Table 3, highlight how different hydrometric stations exhibit varying degrees of similarity to ug in terms of elevation, watershed area, and slope. The variation percentages indicate the extent of deviation of each NN from ug, with lower values representing greater similarity.
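For reference, the two performance measures can be written out directly. The sketch below uses the standard definitions of the Nash–Sutcliffe efficiency and the mean absolute error and assumes that δ denotes the latter. It is not the hydroGOF code used in the study, and the function names are chosen for illustration.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    # Undefined (division by zero) if all observations are identical.
    return 1.0 - sse / spread


def mean_absolute_error(observed, simulated):
    """Mean of the absolute differences between observed and simulated values."""
    return sum(abs(o - s) for o, s in zip(observed, simulated)) / len(observed)
```

An efficiency of 1 indicates a perfect match, 0 means the prediction is no better than the observed mean, and negative values (such as those reported for some stations) mean it is worse than the observed mean.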
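The text does not state how the variation percentages are computed. One plausible reading, shown here purely as an assumption, is the absolute difference between each NN's characteristic and that of ug, expressed as a percentage of the ug value (which allows values above 100%). The function names are hypothetical.

```python
def percent_variation(ug_value, nn_values):
    """Assumed definition: 100 * |x_NN - x_ug| / x_ug for each NN."""
    return [100.0 * abs(v - ug_value) / ug_value for v in nn_values]


def variation_summary(ug_value, nn_values):
    """Minimum, average, and maximum variation across the NNs."""
    variations = percent_variation(ug_value, nn_values)
    return min(variations), sum(variations) / len(variations), max(variations)


# Usage with made-up slopes (illustrative only, not study data).
summary = variation_summary(100.0, [110.0, 50.0, 400.0])
```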
For stations where elevation was used as the primary normalization parameter, the variation ranged from as low as 0.404% (HYD-GIBS-R1) to as high as 88.44% (PD189470). Notably, larger deviations in elevation were associated with lower η values, suggesting reduced predictive performance, as seen in PD189470 ( η = −0.707). Conversely, stations with relatively low variation, such as 08MH084 (average variation = 9.034%), exhibited higher performance with η values approaching 1.
Similarly, when the watershed area was used as the normalization parameter, variations remained moderate, with maximum deviations around 38.534% (08MH082, 08MH098, and 08MH156). In these cases, the η values generally remained positive, indicating reasonable predictive accuracy.
For stations where slope was the chosen parameter, the degree of variation differed widely. While some stations (e.g., 08MH104 and 08MH155) showed minimal variation (1.901%), others, such as PD200737, exhibited extreme deviations, exceeding 300%. Extreme variations in slope were associated with significantly lower η values (e.g., η = 0.0746 for PD200737), highlighting the critical importance of selecting NNs with watershed characteristics that closely match those of ug to ensure more reliable predictions.
Overall, the table underscores that η and δ values are sensitive to the choice of normalization parameter and the degree of variation between ug and its NNs. The findings suggest that selecting the most appropriate NNs for ug should be guided by minimizing the variation between NNs and ug. Moreover, reviewing Table 3 and Table 4 concurrently shows that as the performance of the normalization parameter in predicting the hydrological data of NNs deteriorates (resulting in larger MAE values), the prediction performance for ug also declines. This indicates that the prediction performance can be assessed prior to the actual application of the model for ug. For example, for hydrometric stations 08MH157, 08MH0004, and PD43584, the performance deteriorates, as shown in Table 3, with increasing δ values for predicting the hydrological data of NNs for ug. By the same token, stations 08MH098, 08MH084, and 08MH0058 demonstrated strong statistical performance, as reflected in their comparatively better performance in predicting the hydrological data of their NNs, as shown in Table 2.
The results presented below illustrate the outcomes when area is utilized as the normalization parameter in the prediction process for ug, that is, for the stations where area yielded the least error among the NNs.
Figure 6 compares observed (“Actual”) and modeled (“DAR”) streamflow discharge across four hydrometric stations, revealing seasonal variations in discharge trends. At station 08MH0051, the model underestimates discharge during the high-flow season (January to March) but aligns more closely during the low-flow period (May to September). Similarly, at 08MH082, the DAR model captures seasonal trends but overestimates discharge in the high-flow period, while providing a good fit during low-flow periods. Station 08MH098 exhibits strong agreement between modeled and observed values, with minimal deviations. At PD200737, the model underestimates streamflow during the summer months but aligns better in other periods. Overall, while the DAR method effectively captures seasonal discharge patterns, some discrepancies persist, particularly in high-flow conditions, indicating the need for further model refinement.
Building upon the analysis of the area normalization parameter, the subsequent results investigate the performance of the elevation parameter in normalizing the prediction of ug.
Figure 7 presents a comparison between predicted and observed discharge values across eight hydrometric stations: 08MH152, 08MH156, 08MH153, 08MH055, 08MH0055, 08MH0004, 08MH0058, and 08MH084.
For station 08MH152, observed discharge values (represented by the blue line) generally exceed predicted values during the high-flow period (January to March). However, during the low-flow period (June to August), predictions from both methods align more closely with the observed data. At station 08MH156, predictions from the NNs + DAR and DAR methods are nearly identical, indicating that the inclusion of NNs does not significantly enhance accuracy over the DAR method alone. Both methods, however, tend to underestimate discharge during the high-flow period.
For station 08MH153, the NNs + DAR method provides improved prediction accuracy compared to DAR alone, particularly during the high-flow period. Despite this improvement, both methods underestimate discharge during the low-flow period.
Station 08MH055 exhibits high variability in observed discharge values, with a notable peak in June. Neither prediction method successfully captures this sharp peak, although both methods adequately represent the overall seasonal trend. The NNs + DAR method provides a better fit to observed discharge values than the DAR method alone.
At station 08MH0055, both methods tend to underestimate discharge during high-flow periods (January to April) and overestimate it during low-flow periods (July to August). However, the NNs + DAR method provides a closer match to observed values compared to DAR alone.
For station 08MH0004, observed discharge displays a significant peak in June, which is overestimated by the DAR method but more accurately captured by the NNs + DAR approach. Both methods, however, tend to underestimate discharge during the winter months.
For station 08MH0058, both the NNs + DAR and DAR methods closely track the observed discharge throughout the year, with slight underestimation during peak flows in January and overestimation during the late fall. These minor discrepancies indicate good predictive performance at this station. At station 08MH084, the observed, NNs + DAR, and DAR lines are well-aligned, effectively capturing both high- and low-flow periods, suggesting that both methods perform similarly and effectively.
For HYD-GIBS-R1, the DAR model tends to overestimate discharge, particularly during peak flow periods, indicating potential limitations in accurately capturing seasonal variability. In contrast, the NNs + DAR hybrid method provides a closer approximation to actual discharge values, suggesting improved predictive capability. Notably, during the low-flow period around June, both models struggle to capture the sharp decline, though NNs + DAR follows the trend more closely.
In general, the NNs + DAR method improves prediction accuracy compared to the DAR method alone, particularly in capturing seasonal trends and reducing errors during low-flow periods. However, challenges persist in accurately capturing sharp peaks and high-flow events, particularly at stations with more variable hydrological regimes, which may necessitate the introduction of basin-scale information into the current modeling framework.
Subsequent to the analysis of the elevation normalization parameter, the following section assesses the influence of the slope parameter on the normalization of ug predictions.
Figure 8 presents a comparison between predicted and observed discharge values for four hydrometric stations (08MH104, 08MH155, 08MH0041, and 08MH157), with the slope parameter used as the normalization parameter.
For station 08MH104, both prediction methods follow the seasonal trend of the actual data, although DAR tends to slightly overestimate discharge in January and underestimate it during the late fall. The NNs + DAR method provides a closer fit to observed data, especially during low-flow periods.
At station 08MH155, both the NNs + DAR and DAR methods show good agreement with actual discharge, effectively capturing both high- and low-flow periods, though DAR slightly overestimates discharge in the winter months.
For station 08MH0041, both prediction methods underestimate peak flows observed in January and February but align well with actual values during the low-flow period (June to August). The NNs + DAR approach provides a closer match to observed discharge throughout the year compared to DAR alone.
In contrast, station 08MH157 exhibits higher variability in actual discharge, with a sharp peak observed around June. Both the DAR and NNs + DAR methods overestimate flows during this period, although NNs + DAR performs slightly better in aligning with the actual pattern. Additionally, both methods tend to overestimate discharge in the early months and underperform during the low-flow season.
The above results also confirm that the proposed methodology consistently predicts the months of maximum and minimum monthly flow. This accuracy was further validated for PD43584, designated as an “Observation Post,” where local Indigenous knowledge was employed to determine the timing of peak and lowest flow values. The predicted flow rates and their corresponding months are presented in Figure 9.
For PD43584, the observed maximum monthly flow occurs in June, while the minimum flow is recorded during the winter months, aligning precisely with the insights provided by local Indigenous knowledge.
An overall performance comparison of the proposed methodology with DAR showed that the proposed methodology can predict hydrological data in ug watersheds with reasonable accuracy, making it applicable for predicting hydrological data elsewhere. Applying this methodology to the NNs of ug watersheds demonstrates its effectiveness at the local scale before its application is extended to other ug watersheds. This property helps determine whether to use the methodology for predicting hydrological data for a specific point of diversion or to rely on information from another source if satisfactory results are not achieved in the NNs.
A comparative analysis of model performance for streamflow prediction using the two methods, DAR and NNs + DAR, is presented in Figure 10.
From the figure, it is clear that the NNs + DAR method consistently outperforms the DAR method alone in terms of both efficiency and accuracy across most hydrometric stations. Specifically, the NNs + DAR approach yields higher efficiency values, indicating a better ability to reproduce the observed discharge variability, and lower error, reflecting smaller deviations from actual streamflow data. The improvement is especially notable in stations with more complex and fluctuating flow regimes, where the combined method demonstrates significantly better performance with higher efficiency and reduced error. However, for stations characterized by stable and less variable streamflow patterns, both methods perform comparably, suggesting that the simpler DAR method may still be sufficient in such cases. Overall, the figure underscores the effectiveness of integrating NNs with DAR for enhancing streamflow prediction accuracy, particularly in dynamically varying hydrological conditions where traditional methods may fall short.
Figure 6, Figure 7 and Figure 8 indicate that the model performs poorly for watersheds 08MH055 and 08MH157. This poor performance can be attributed to the transboundary nature of these watersheds—portions of their drainage areas extend into the United States. The DEMs used in our analysis are limited to the Canadian side of the international border and therefore only generate streamflow networks within Canadian territory. As a result, the delineation of these watersheds is incomplete (see Figure 11 below), leading to partial streamflow networks. This incomplete representation introduces inaccuracies in key watershed characteristics, including area, slope, and elevation, which, in turn, negatively impact the model’s predictive performance.
As illustrated in Figure 11 above, the delineation only covers the northern portion of the watersheds within Canada, with no contribution from the U.S. side.
Additionally, the performance deterioration is more pronounced for stations with larger watershed areas. For example, stations like 08MH157 and 08MH055, with watershed areas of 39,879,899 m² and 426,331,622 m², respectively, show much worse performance than stations with smaller areas, such as 08MH0058 (1,691,188 m²) or 08MH104 (26,357,626 m²). Larger watersheds tend to exhibit more complex and varied hydrological behavior, making it more difficult for the model to accurately predict flow dynamics when there is significant variation in watershed characteristics. This highlights the importance of both the similarity in characteristics between the station and its NNs and the size of the watershed in determining the model’s predictive accuracy.
This issue becomes particularly pertinent when one NN of a ug is a substantially larger water body, such as a major river, compared to the other NNs. It also arises when predicting flow for a tributary watershed using data from a much larger hydrological system. To address such cases, we propose either excluding watersheds with significant area discrepancies or, if exclusion is not feasible, incorporating additional neighboring watershed(s) to normalize the impact of including a significantly larger water body. While establishing what constitutes a “significantly larger” watershed can be complex, this methodology proves valuable when it is necessary to incorporate a hydrologically distinct or larger watershed into the analysis in order to ensure comprehensive data coverage across all twelve months of the year. This approach is particularly useful when a suitable NN cannot be identified, necessitating the selection of a less well-matched NN.
A practical example of this approach is demonstrated in the flow prediction for station 08MH157, where the NNs 08MH004 and 08MH055 were selected due to their minimal δ in relation to the normalization parameter of elevation (see Figure 12 below). To mitigate the risk of oversimplifying the results, the larger watershed 08MH055 was chosen despite its considerable size and elevation. To balance the impact of including a large watershed (with a portion of its area located in the United States and, therefore, not fully delineated in the stream network), we also incorporated watershed 08MH163, which is entirely located within Canada and covers an area of 26,075,739 m², as an additional neighboring watershed in the analysis. This adjustment helped correct the oversimplification introduced by the inclusion of a hydrologically dominant watershed.
Figure 12 above demonstrates the marked improvement in prediction accuracy achieved by applying the scaling factors derived from this procedure. Specifically, the performance indicators were enhanced, with δ reducing from 2.070 to 0.794 and η increasing from −1.908 to 0.520. The dotted lines represent the mean monthly discharge for each dataset: actual, initial prediction, and revised prediction. The placement of these mean values highlights the tendency of the initial prediction to overestimate discharge, while the revised prediction more closely approximates observed trends. These results demonstrate the effectiveness of incorporating additional NNs to improve streamflow predictions, particularly when dealing with larger watersheds that extend beyond Canada’s borders and, therefore, lack complete delineation. This approach contributes to a more robust and reliable modeling framework.
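One way to see why an additional neighbor tempers a dominant watershed is to look at the equal weighting in the averaging step. The short derivation below is only a sketch: it writes $q_i$ for the normalized flow of neighbor $i$ as it appears in the final prediction equation and uses indices 1 to 3 as placeholders for two original NNs and one added NN.

```latex
% Mean normalized flow before and after adding a third neighbor
\bar{q}^{(2)} = \frac{1}{2}\left(q_1 + q_2\right)
\quad\longrightarrow\quad
\bar{q}^{(3)} = \frac{1}{3}\left(q_1 + q_2 + q_3\right),
\qquad q_i = \frac{Q_i(m)}{A_i \cdot P_i}
```

Each neighbor's weight in the mean falls from 1/2 to 1/3, so a normalized flow distorted by an incompletely delineated watershed contributes proportionally less to the prediction.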
The accuracy of predicted discharge data can be significantly affected by anthropogenic activities as well, particularly during critically low-flow periods and peak irrigation demand months, as unauthorized water diversions across British Columbia’s water channels contribute to flow variability and discrepancies between observed and predicted values [26]. These diversions disrupt the natural flow regime by introducing artificial and undocumented alterations, especially in upstream areas. When diversions occur upstream of gauging stations, they reduce recorded flow volumes, which are crucial for model calibration and validation. As a result, prediction models misinterpret these diminished flows as indicative of natural conditions, leading to a systematic underestimation of streamflow in downstream and ug regions. This issue is particularly relevant in cases where a disproportionate amount of unauthorized water is diverted from a stream surrounded by agricultural areas, compared to its NNs with minimal agricultural influence. Here, we must note that an argument can be made that the presence of unauthorized water diversions (i.e., the extraction or redirection of water without a valid license or in excess of permitted amounts) in historical records means they are inherently included in the observed discharge data used for model training. However, the critical issue is the interannual variability of these unauthorized diversions, which introduces uncertainty in the prediction process. The proposed models are trained on past data that reflect a mixture of natural flows and any historical anthropogenic alterations, including unauthorized diversions. While this enables the model to implicitly learn patterns under past diversion conditions, it does not account for year-to-year fluctuations in the magnitude, timing, or spatial distribution of unauthorized diversions, particularly during periods of low flow and high irrigation demand.
These variations are unrecorded and differ significantly across watersheds depending on land use (e.g., intensity of agriculture), enforcement practices, and climatic conditions. Because there is no consistent or quantifiable record of unauthorized water use across the province, these diversions introduce a non-stationary component in the discharge data that the model cannot reliably capture or forecast. This contributes to the observed discrepancies between predicted and actual flows during critical periods.
Unauthorized diversions also undermine fundamental hydrological assumptions in predictive models, such as the assumed stability between watershed characteristics (e.g., area, slope, and elevation) and streamflow. These models are built on the premise of natural, unaltered systems; however, diversions artificially reduce flow magnitudes, weakening the model’s ability to make accurate predictions. The impacts of these diversions extend across watersheds, introducing errors that propagate through the network, distorting predictions for ug locations. This issue becomes particularly pronounced during extreme hydrological events such as droughts, when unauthorized diversions can exacerbate low-flow conditions, further undermining prediction reliability. The lack of regulation and documentation of unauthorized diversions complicates efforts to quantify their impact, exacerbating underestimation and reducing the overall accuracy of streamflow models. To enhance future predictive models, we recommend incorporating the proportion of agriculturally licensed water use areas relative to the total agricultural or water consumption area, ensuring a more comprehensive representation of human-induced hydrological impacts.
Another factor that could introduce uncertainty is the presence of springs within the watershed, which can impact localized hydrology. Springs, as localized sources of groundwater, contribute to streamflow, particularly during dry periods. Their impact can vary based on size, location, and seasonal groundwater variations. Since spring flow may not always be captured by traditional gauging stations or models focused on surface runoff, their contribution can be overlooked, leading to inaccuracies in streamflow predictions. Additionally, springs can alter local flow dynamics, especially in areas where groundwater discharge significantly influences streamflow. It may be contended that the influence of springs—whether perennial or ephemeral—is inherently included in the discharge measurements recorded at downstream hydrometric stations. Indeed, in such cases, the spring contributions are implicitly represented in the training data used by the model. However, our concern lies in the localized and heterogeneous nature of spring inflows, which may not be uniformly distributed across different watersheds. When using data-driven models such as the proposed technique, particularly those that leverage data from neighboring watersheds or use regional generalizations, the presence or absence of spring contributions introduces site-specific hydrological complexity. These complexities are not always captured effectively when predictor variables do not explicitly account for groundwater–surface water interactions. Moreover, while perennial springs tend to have more stable contributions throughout the year, seasonal variability and interannual changes in spring discharge (due to variations in groundwater recharge, land use, or climatic conditions) can subtly alter flow dynamics, especially during transition seasons like spring and fall. In models that do not explicitly parameterize groundwater discharge or spring dynamics, this can manifest as prediction discrepancies [27]. 
In summary, while spring contributions are captured at the gauge level, their variability and spatial non-uniformity—especially when transferring model assumptions or training across watersheds—can still pose challenges for predictive accuracy, and this is the context in which they were noted as a potential disturbance factor.
A relevant example of this issue can be found in Wilfred Creek (PD200737), located in the Chilliwack region of BC. In this area, a spring plays a significant role in contributing to the flow of the creek, which has implications for streamflow predictions. The presence of the spring, which feeds groundwater into the creek, introduces complexities in accurately forecasting the flow, as traditional models may not capture the contribution of this subsurface water source. Since the flow from the spring is not always accounted for by standard runoff-based models, the predictions for PD200737 can be inaccurate when relying solely on surface runoff data.
Table 4 shows that when the runoff generated exclusively by the spring is incorporated into the predicted flow values, the prediction accuracy improves significantly. This highlights the importance of considering all water sources, including springs, in hydrological modeling. Without integrating these local hydrological factors, such as spring contributions, the predictions could be misleading, resulting in the over- or underestimation of streamflow. Therefore, understanding the full spectrum of hydrological processes, including spring-fed contributions, is essential for improving model accuracy and making more reliable streamflow forecasts in such areas.
To address these challenges regarding unauthorized diversions and the contribution of springs, it is essential to implement monitoring and quantification systems to track diversions, correct historical flow records to account for these alterations, make use of local Indigenous knowledge, and incorporate anthropogenic influences into prediction models. This requires a better understanding of local hydrology and translating those factors into the modeling procedure. This translation was beyond the scope of the current study. Implementing these strategies is vital for improving the accuracy of streamflow predictions and ensuring sustainable water resource management, particularly during “high demand” irrigation periods. It is important to clarify that the term “high demand” is used here specifically in the context of water supply. While individual water diversions authorized under existing licenses on the stream may appear minor, their cumulative impact on low-flow systems, typically with discharges below 1 m³/s, can be substantial. This is particularly true during dry summer months when the natural baseflow is already limited. These small but numerous withdrawals can introduce noticeable variability in observed streamflow, which, in turn, affects the accuracy of model predictions.
While we acknowledge that the lumped nature of the modeling technique introduces certain limitations in accounting for the detailed hydrological behavior of individual watersheds, which may reduce predictive efficiency and necessitate the inclusion of more localized information to improve accuracy, it is important to emphasize the primary purpose of the proposed modeling approach. This technique is designed to support informed decision-making regarding water licensing applications and to provide a standardized method that can be readily reused by water management teams across the province.
For the sake of simplicity and operational efficiency, the proposed modeling technique is a practical and viable option. One of its key advantages is that it does not impose any financial burden on the province, unlike distributed or semi-distributed modeling techniques, which typically require significant investments in data collection, software, and expert personnel.
While we recommend this technique for general flow prediction tasks, we acknowledge that it may not be suitable for complex applications in water resources engineering and hydraulics. For critical applications that require higher temporal resolution (e.g., hourly rather than the monthly data on which our current methodology is based), such as flood simulations or the design of hydraulic infrastructure like flood diversion systems or spillways, we strongly recommend using more detailed, localized watershed data and modeling approaches. This involves supplementing basic parameters, such as watershed slope and elevation, with additional meaningful descriptors to improve the precision and reliability of the analysis.
Finally, it is important to acknowledge a foundational challenge in hydrological modeling: the reliability of streamflow data derived from stage–discharge rating curves. These curves, which convert water level observations into discharge estimates, are essential for flow monitoring but are subject to long-term inaccuracies due to changes in riverbed morphology, sedimentation, and anthropogenic alterations. While the streamflow data used in this study were sourced from national hydrometric networks and subjected to standard quality control procedures, the potential influence of rating curve variability cannot be entirely eliminated. To reduce sensitivity to such uncertainties, our approach operates at a monthly time scale, which tends to smooth short-term anomalies and reduce noise associated with episodic rating curve shifts. Additionally, by selecting neighboring watersheds from hydrologically coherent zones, we ensure a degree of consistency in both flow-generating mechanisms and data quality across sites. Nonetheless, we recognize that the evolving nature of rating curves remains a fundamental limitation in hydrology that must be considered when applying and interpreting model results, particularly in regions experiencing rapid morphological or land use changes.
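The smoothing effect of the monthly time scale can be illustrated with a minimal sketch. The daily discharges below are hypothetical values chosen for illustration and are not taken from any station in this study.

```python
from statistics import mean

# Hypothetical daily discharges (m^3/s) for a 30-day month: a steady record
# with a single day biased upward, e.g., by an episodic rating curve shift.
true_flow = 0.50
daily = [true_flow] * 30
daily[12] = 0.80  # one anomalous day

monthly_mean = mean(daily)

# Relative error of the anomalous day versus the monthly mean (percent).
daily_error_pct = (daily[12] - true_flow) / true_flow * 100
monthly_error_pct = (monthly_mean - true_flow) / true_flow * 100

print(f"Error on the anomalous day: {daily_error_pct:.1f}%")
print(f"Error in the monthly mean:  {monthly_error_pct:.1f}%")
```

Averaging damps only short-lived anomalies. A rating curve shift that persists for the whole month would pass into the monthly mean essentially unchanged, which is why the limitation noted above remains.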

5. Conclusions

The fundamental DAR method estimates streamflow at an ungauged (ug) watershed by applying a proportional scaling factor derived from the ratio of the drainage area at the ug site to that of a nearby gauged station. Specifically, the streamflow at the ug watershed is computed as the product of the streamflow observed at the gauged station and the DAR, under the assumption of hydrological similarity between the two sites. However, this reliance on hydrological similarity, together with the neglect of key flow-generating watershed characteristics, reduces the method's accuracy in complex or heterogeneous systems.
The proposed method offers a streamlined way to incorporate the dominant flow-generating characteristics of the watershed while preserving simplicity and practicality. It is easy to implement and involves no cost to the province, making it a practical tool for improving decision-making in water licensing. In terms of statistical performance, the method outperformed DAR at the majority of the 18 stations where it was implemented, supporting its reliability and robustness in predicting streamflow across diverse hydrological settings. However, because the approach is lumped, accounting only for major flow-generating characteristics of watersheds across B.C. (such as elevation and slope), more complex water resources engineering projects may require integrating more localized data within the same framework.
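The DAR scaling described above can be written as a one-line function. The sketch below is a minimal illustration; the function name is hypothetical, the station values are as read from Table 1, and the pairing of 08MH098 as donor for 08MH082 is chosen only for illustration and is not necessarily the pairing used in the study.

```python
def dar_streamflow(q_gauged, area_gauged, area_ungauged):
    """Scale the gauged flow by the drainage area ratio A_ug / A_g."""
    if area_gauged <= 0 or area_ungauged <= 0:
        raise ValueError("drainage areas must be positive")
    return q_gauged * (area_ungauged / area_gauged)


# Mean monthly flow (m^3/s) and watershed area (m^2), as read from Table 1.
q_08mh098, area_08mh098 = 0.436, 12_900_703  # treated here as the gauged donor
area_08mh082 = 11_975_962                    # treated here as ungauged

q_est = dar_streamflow(q_08mh098, area_08mh098, area_08mh082)
print(f"DAR estimate at 08MH082: {q_est:.3f} m^3/s")
```

Because the only input besides the donor flow is the area ratio, any difference in elevation, slope, or local sources between the two watersheds is ignored, which is the limitation the proposed method targets.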

Author Contributions

Conceptualization, M.U.Q.; methodology, M.U.Q.; software, M.U.Q.; validation, M.U.Q., C.T. and C.S.; formal analysis, M.U.Q.; investigation, M.U.Q. and C.T.; resources, M.U.Q., C.T. and C.S.; data curation, C.S.; writing—original draft preparation, M.U.Q.; writing—review and editing, C.T.; visualization, C.T.; supervision, M.U.Q.; project administration, C.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

We sincerely acknowledge the contributions of our Fish Protection Hydrologist, Jacquelyn Shrimer, for her valuable insights into the study area and her assistance with hydrometric data collection.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. British Columbia. Water Sustainability Act. 2016. Available online: http://www.bclaws.ca/civix/document/id/complete/statreg/14015 (accessed on 14 January 2025).
  2. Jamal, I.B. Optimal Allocation of ‘BOD’ Loadings in a Tidal River. Ph.D. Dissertation, University of British Columbia, Vancouver, BC, Canada, 1986.
  3. Qamar, M.U.; Vidrio-Sahagún, C.T.; He, J.; Tariq, U.; Ali, A. Prediction of Monthly Flow Regimes Using the Distance-Based Method Nested with Model Swapping. Water Resour. Manag. 2024, 38, 5597–5613.
  4. Ganora, D.; Claps, P.; Laio, F.; Viglione, A. An approach to estimate nonparametric flow duration curves in ungauged basins. Water Resour. Res. 2009, 45, W10418.
  5. Daniel, E.B.; Camp, J.V.; LeBoeuf, E.J.; Penrod, J.R.; Dobbins, J.P.; Abkowitz, M.D. Watershed modeling and its applications: A state-of-the-art review. Open Hydrol. J. 2011, 5, 26–50.
  6. Qin, G.; Li, H.; Wang, X.; He, Q.; Li, S. Annual runoff prediction using a nearest-neighbour method based on cosine angle distance for similarity estimation. Proc. Int. Assoc. Hydrol. Sci. 2015, 368, 204–208.
  7. Ergen, K.; Kentel, E. An integrated map correlation method and multiple-source sites drainage-area ratio method for estimating streamflows at ungauged catchments: A case study of the Western Black Sea Region, Turkey. J. Environ. Manag. 2016, 166, 309–320.
  8. Qamar, M.U.; Azmat, M.; Cheema, M.J.M.; Shahid, M.A.; Khushnood, R.A.; Ahmad, S. Model swapping: A comparative performance signature for the prediction of flow duration curves in ungauged basins. J. Hydrol. 2016, 541, 1030–1041.
  9. Gianfagna, C.C.; Johnson, C.E.; Chandler, D.G.; Hofmann, C. Watershed area ratio accurately predicts daily streamflow in nested catchments in the Catskills, New York. J. Hydrol. Reg. Stud. 2015, 4, 583–594.
  10. Yilmaz, M.U.; Aksu, H.; Onoz, B.; Selek, B. An effective framework for improving performance of daily streamflow estimation using statistical methods coupled with artificial neural network. Pure Appl. Geophys. 2023, 180, 3639–3654.
  11. Shu, C.; Ouarda, T.B. Improved methods for daily streamflow estimates at ungauged sites. Water Resour. Res. 2012, 48, W02523.
  12. Singh, P.; Bengtsson, L. Impact of warmer climate on melt and evaporation for the rainfed, snowfed and glacierfed basins in the Himalayan region. J. Hydrol. 2005, 300, 140–154.
  13. Kelman, J.; de M. Vieira, A.; Rodriguez-Amaya, J.E. El Niño influence on streamflow forecasting. Stoch. Environ. Res. Risk Assess. 2000, 14, 123–138.
  14. Pike, R.G.; Redding, T.E.; Moore, R.D.; Winkler, R.D.; Bladon, K.D. Compendium of Forest Hydrology and Geomorphology in British Columbia. 2010. Available online: www.for.gov.bc.ca/hfd/pubs/Docs/Lmh/Lmh66.htm (accessed on 14 January 2025).
  15. Frisbee, M.D.; Phillips, F.M.; Campbell, A.R.; Liu, F.; Sanchez, S.A. Streamflow generation in a large, alpine watershed in the southern Rocky Mountains of Colorado: Is streamflow generation simply the aggregation of hillslope runoff responses? Water Resour. Res. 2011, 47, W06512.
  16. Meshkat, M.; Amanian, N.; Talebi, A.; Kiani-Harchegani, M.; Rodrigo-Comino, J. Effects of roughness coefficients and complex hillslope morphology on runoff variables under laboratory conditions. Water 2019, 11, 2550.
  17. Tennant, C.J.; Crosby, B.T.; Godsey, S.E. Elevation-dependent responses of streamflow to climate warming. Hydrol. Process. 2015, 29, 991–1001.
  18. Sivakumar, B. Forecasting monthly streamflow dynamics in the western United States: A nonlinear dynamical approach. Environ. Model. Softw. 2003, 18, 721–728.
  19. Ahmed, A. Inventory of Streamflow in the South Coast and West Coast Regions; Knowledge Management Branch, British Columbia Ministry of Environment and Climate Change Strategy: Victoria, BC, Canada, 2017. Available online: https://a100.gov.bc.ca/pub/acat/documents/r53344/SouthCoast_WestCoastReport_digitalversion-updated_1595024912424_5024418807.pdf (accessed on 14 January 2025).
  20. Coulson, C.H.; Obedkoff, W. British Columbia Streamflow Inventory. Water Inventory Section, Resources Inventory Branch. 1998. Available online: https://a100.gov.bc.ca/pub/acat/documents/r2227/BCStreamflowInventReport_1129157350136_811115b979b642468629a1549c0a39ac.pdf (accessed on 14 January 2025).
  21. Viessman, W.; Lewis, G.L. Introduction to Hydrology, 4th ed.; Addison Wesley Longman: Boston, MA, USA, 2003; p. 751.
  22. Sharma, A.R.; Déry, S.J. Linking atmospheric rivers to annual and extreme river runoff in British Columbia and Southeastern Alaska. J. Hydrometeorol. 2020, 21, 2457–2472.
  23. Whitfield, P.H.; Pomeroy, J.W. Disparity in low-flow trends found in snowmelt-dominated mountain rivers of western Canada. J. Hydrol. Reg. Stud. 2025, 57, 102144.
  24. Samaniego, L.; Bárdossy, A.; Kumar, R. Streamflow prediction in ungauged catchments using copula-based dissimilarity measures. Water Resour. Res. 2010, 46, W02506.
  25. Zambrano-Bigiarini, M. R Package, Version 0.3-8; Goodness-of-Fit Functions for Comparison of Simulated and Observed Hydrological Time Series; Zenodo: Geneva, Switzerland, 2017.
  26. British Columbia Ministry of Water, Land and Resource Stewardship. Update on Groundwater. 2024. Available online: https://news.gov.bc.ca/newsletters/bc-groundwater-updates/update-on-groundwater-march-2024 (accessed on 14 January 2025).
  27. Gerlach, M.E.; Rains, K.C.; Guerrón-Orejuela, E.J.; Kleindl, W.J.; Downs, J.; Landry, S.M.; Rains, M.C. Using remote sensing and machine learning to locate groundwater discharge to salmon-bearing streams. Remote Sens. 2021, 14, 63.
Figure 2. Illustration of the hydrological zones defined in this study. Please note that these hydrological zones are based on the study conducted by Ahmed [19].
Figure 3. Delineated watershed areas for Gamelin Creek and hydrometric stations.
Figure 4. Flowchart of the proposed methodology.
Figure 5. Groundwater contributions in low-elevation areas: field observations from Fishtrap Creek at the Canada–USA International Boundary (08MH153). The panoramic image clearly shows streamflow being measured at a cross-section under a small bridge on Zero Avenue in Abbotsford. The cross-section is largely inundated by groundwater, giving it a stagnant appearance.
Figure 6. Comparison of observed (“Actual”) and modeled (“DAR”) streamflow discharge at four hydrometric stations (08MH0051, 08MH082, 08MH098, and PD200737) over a yearly cycle. The horizontal dashed lines represent the mean discharge level for reference.
Figure 7. Observed versus predicted streamflow discharge at nine hydrometric stations, highlighting the accuracy of the NNs + DAR and DAR methods in capturing seasonal variations and discharge trends.
Figure 8. Performance comparison of the NNs + DAR and DAR methods for predicting streamflow discharge at eight hydrometric stations, with a focus on high-flow and low-flow periods.
Figure 9. Predicted monthly discharge for PD43584, with maximum and minimum discharge values highlighted.
Figure 10. Comparative performance of the DAR and NNs + DAR methods for streamflow prediction across all hydrometric stations used in this study. The figure summarizes performance across multiple hydrometric stations using two key statistical metrics: η and δ. The efficiency metric (η) represents the ability of each method to capture the variance of observed discharge, with higher values indicating better model performance. Values placed below the 1:1 line indicate better prediction performance of the proposed methodology. δ reflects the error, where lower values indicate smaller deviations between predicted and observed discharge. In this case, values above the 1:1 line indicate improved prediction performance of the proposed methodology.
Figure 11. Delineated watershed boundaries for stations 08MH055 and 08MH157.
Figure 12. Comparison of observed and predicted discharge trends for station 08MH157, highlighting the overestimation in the initial prediction and the improved alignment of the revised (NNs + DAR) prediction with actual values. Dotted lines represent the mean monthly discharge for each dataset.
Table 1. Summary of hydrometric stations: record length, watershed area, flow, elevation, and slope characteristics.

| S. No. | Hydrometric Station | Record Length | Watershed Area (m²) | Mean Monthly Flow (m³/s) | Mean Elevation (m) | Mean Slope (m/m) |
|---|---|---|---|---|---|---|
| 1 | 08MH0058 | 2020–2024 | 1,691,188 | 0.0717 | 107.582 | 3.644 |
| 2 | 08MF0004 | 2020–2024 | 12,056,213 | 0.989 | 941.771 | 22.767 |
| 3 | 08MH0051 | 2019–2024 | 14,836,529 | 0.585 | 225.156 | 11.749 |
| 4 | 08MH084 | 1959–1990 | 28,149,550 | 1.0428 | 102.766 | 3.635 |
| 5 | 08MH0055 | 2020–2024 | 5,056,045 | 1.046 | 115.578 | 2.145 |
| 6 | 08MH157 | 1985–1989 | 39,879,899 | 0.564 | 978.219 | 31.217 |
| 7 | 08MH055 | 1956–1962 | 426,331,622 | 47.7 | 1162.033 | 29.338 |
| 8 | 08MH082 | 1960–1964 | 11,975,962 | 0.457 | 97.311 | 3.928 |
| 9 | 08MH098 | 1960–2023 | 12,900,703 | 0.436 | 85.185 | 3.224 |
| 10 | 08MH156 | 1985–2023 | 15,069,903 | 0.258 | 97.072 | 3.654 |
| 11 | 08MH0041 | 2019–2024 | 26,822,502 | 0.910 | 76.631 | 2.643 |
| 12 | 08MH153 | 1984–2023 | 20,636,495 | 0.852 | 83.273 | 3.471 |
| 13 | 08MH152 | 1984–2012 | 35,091,991 | 1.054 | 102.372 | 2.132 |
| 14 | 08MH104 | 1965–1987 | 26,357,626 | 0.689 | 77.477 | 2.143 |
| 15 | 08MH155 | 1985–2023 | 70,330,390 | 2.06 | 55.504 | 2.183 |
| 16 | PD200737 | 2001–2020 | 7,155,874 | 0.292 | 20.727 | 0.842 |
| 17 | HYD-GIBS-R1 | 2023–2025 | 3,337,955 | 0.081 | 510.693 | 12.223 |
| 18 | PD189470 | Spot Measurements | 1,230,404 | NA | 519.107 | 22.768 |
| 19 | PD43584 | Observational Post | 2,495,116 | Maximum = June; Minimum = Winter | 1167.926 | 27.944 |
Table 2. δ values for streamflow prediction in the NNs of ug using area, elevation, and slope as normalization parameters. Bold values indicate the selected parameter for each station, as it yielded the lowest δ compared to the others.

| Station | Area | Elevation | Slope |
|---|---|---|---|
| 08MH082 | 0.053 | 0.111 | 0.053 |
| 08MH098 | 0.021 | 0.029 | 0.071 |
| 08MH0051 | 0.136 | 0.169 | 0.400 |
| PD200737 | 1.466 | 4.418 | 2.825 |
| 08MH084 | 0.066 | 0.005 | 0.150 |
| 08MH152 | 0.113 | 0.079 | 0.101 |
| 08MH156 | 0.149 | 0.120 | 0.126 |
| 08MH153 | 0.058 | 0.049 | 0.115 |
| 08MH055 | 0.447 | 0.372 | 1.154 |
| 08MH0058 | 0.126 | 0.097 | 0.174 |
| 08MH0055 | 0.162 | 0.075 | 0.114 |
| HYD-GIBS-R1 | 0.593 | 0.564 | 0.790 |
| PD189470 | 0.541 | 0.151 | 0.215 |
| PD43584 | 40.551 | 38.381 | 40.991 |
| 08MH157 | 6.523 | 2.303 | 1.322 |
| 08MH104 | 0.227 | 0.251 | 0.063 |
| 08MH155 | 0.207 | 0.216 | 0.035 |
| 08MH0041 | 0.280 | 0.447 | 0.270 |
| 08MH0004 | 12.086 | 11.120 | 8.198 |
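The selection rule stated in the Table 2 caption (choose the normalization parameter with the lowest δ) can be sketched as follows. The δ values are copied from Table 2 for four stations; the function name and the tie-breaking behavior (the first-listed parameter wins) are assumptions made for this illustration.

```python
# delta values copied from Table 2 for a subset of stations.
delta = {
    "08MH084": {"area": 0.066, "elevation": 0.005, "slope": 0.150},
    "08MH0055": {"area": 0.162, "elevation": 0.075, "slope": 0.114},
    "08MH104": {"area": 0.227, "elevation": 0.251, "slope": 0.063},
    "08MH155": {"area": 0.207, "elevation": 0.216, "slope": 0.035},
}


def select_parameter(scores):
    """Return the normalization parameter with the lowest delta.

    Ties are resolved in favor of the first-listed parameter (an assumption).
    """
    return min(scores, key=scores.get)


selected = {station: select_parameter(scores) for station, scores in delta.items()}

for station, parameter in selected.items():
    print(f"{station}: {parameter}")
```

For these four stations, the parameter chosen by this rule agrees with the "Parameter Used" column of Table 3.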
Table 3. Minimum, average, and maximum variations (%) in watershed characteristics between the ug and its NNs, and the corresponding performance parameters η and δ. “NA” in the table refers to an observation post for which monthly quantitative data were not available.

| Hydrometric Station | Area (m²) | Parameter Used | Minimum Variation (%) | Average Variation (%) | Maximum Variation (%) | η | δ |
|---|---|---|---|---|---|---|---|
| 08MH0058 | 1,691,188 | Elevation | 4.476 | 11.614 | 20.818 | 0.854 | 0.016 |
| 08MF0004 | 12,056,212 | Elevation | 3.870 | 79.514 | 155.158 | 0.247 | 0.276 |
| 08MH0051 | 14,836,529 | Elevation | 0.507 | 15.236 | 33.827 | 0.377 | 0.284 |
| 08MH084 | 28,149,549 | Elevation | 4.686 | 9.034 | 20.638 | 0.950 | 0.143 |
| 08MH0055 | 5,056,045 | Elevation | 11.464 | 18.463 | 27.951 | 0.816 | 0.0283 |
| 08MH157 | 39,879,899 | Elevation | 18.791 | 18.791 | 18.791 | −1.908 | 2.070 |
| 08MH055 | 426,331,622 | Elevation | 18.791 | 18.791 | 18.791 | −1.405 | 20.577 |
| HYD-GIBS-R1 | 3,337,955 | Elevation | 0.404 | 24.118 | 78.122 | 0.160 | 0.041 |
| PD189470 | 1,230,405 | Elevation | 81.420 | 84.930 | 88.440 | −0.707 | 0.044 |
| PD43584 | 2,495,115 | Elevation | 24.165 | 28.015 | 32.628 | NA | NA |
| 08MH082 | 11,975,962 | Area | 5.606 | 9.033 | 12.126 | 0.597 | 0.139 |
| 08MH098 | 12,900,703 | Area | 5.606 | 9.033 | 12.126 | 0.975 | 0.041 |
| 08MH156 | 15,069,903 | Area | 5.269 | 21.902 | 38.534 | 0.657 | 0.117 |
| 08MH0041 | 26,822,502 | Slope | 17.396 | 18.166 | 18.937 | 0.910 | 0.142 |
| 08MH153 | 20,636,495 | Slope | 5.269 | 21.902 | 38.534 | 0.866 | 0.191 |
| 08MH152 | 35,091,991 | Slope | 5.269 | 21.902 | 38.534 | 0.415 | 0.294 |
| 08MH104 | 26,357,626 | Slope | 1.901 | 1.901 | 1.901 | 0.843 | 0.145 |
| 08MH155 | 70,330,390 | Slope | 1.901 | 1.901 | 1.901 | 0.789 | 0.395 |
| PD200737 | 7,569,104 | Slope | 1.961 | 158.157 | 325.013 | 0.0746 | 0.095 |
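How the variation columns of Table 3 might be computed can be sketched as below. Neither the formula nor the neighbor set is stated in this part of the article, so both are assumptions: the percent variation is taken as the absolute difference relative to the ug value, and the three neighbors of 08MH0058 are inferred because, with the mean elevations read from Table 1, they reproduce the Table 3 row for that station (4.476, 11.614, 20.818) to within rounding.

```python
def percent_variation(ug_value, nn_value):
    """Absolute difference relative to the ungauged-site value, in percent."""
    return abs(nn_value - ug_value) / ug_value * 100.0


# Mean elevations (m) as read from Table 1.
ug_elevation = 107.582  # 08MH0058, treated as ungauged
nn_elevations = {       # assumed neighbor set, inferred from Table 3
    "08MH084": 102.766,
    "08MH082": 97.311,
    "08MH098": 85.185,
}

variations = [percent_variation(ug_elevation, e) for e in nn_elevations.values()]
v_min = min(variations)
v_avg = sum(variations) / len(variations)
v_max = max(variations)

print(f"min = {v_min:.3f}%, avg = {v_avg:.3f}%, max = {v_max:.3f}%")
```

A small average variation indicates that the neighbors resemble the ug watershed in the selected characteristic. In Table 3, the stations with very large variations (for example, PD189470 and PD200737) are also among those with low η.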
Table 4. Predicted vs. actual flows of Wilfred Creek with and without spring contribution.

| Month | Flow of Creek, Including Spring Contribution (m³/s) | Predicted Flows (m³/s) | Spring Overflows (m³/s) | Revised Predicted Flows, Predicted Flows + Spring Overflows (m³/s) |
|---|---|---|---|---|
| January | 0.101 | 0.0358 | 0.056 | 0.0918 |
| February | 0.201 | 0.0279 | 0.081 | 0.1089 |
| March | 0.180 | 0.0239 | 0.117 | 0.1409 |
| April | 0.164 | 0.0330 | 0.096 | 0.129 |
| May | 0.121 | 0.0597 | 0.066 | 0.1257 |
| June | 0.100 | 0.0575 | 0.055 | 0.1125 |
| July | 0.132 | 0.0292 | 0.062 | 0.0912 |
| August | 0.098 | 0.0124 | 0.043 | 0.0554 |
| September | 0.150 | 0.0090 | 0.039 | 0.048 |
| October | 0.091 | 0.0187 | 0.028 | 0.0467 |
| November | 0.065 | 0.0304 | 0.035 | 0.0654 |
| December | 0.074 | 0.0326 | 0.032 | 0.0646 |
| Average | 0.123 | 0.031 | 0.059 | 0.090 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
