1. Introduction
Wildfires in the western United States are becoming more frequent, extensive, and severe as climate change exacerbates a naturally dry, fire-prone climate [1,2]. A long history of fire suppression has further altered fuel loads and forest structure, compounding the wildfire threat [3,4]. These wildfires have substantial impacts on the vegetation and soil of mountainous forested watersheds, and these impacts are amplified by climate change-induced extreme rainfall, which causes excessive peak surface runoff and soil erosion [5]. Because wildfires have substantial effects on vegetation, soil, and hydrology, accurate burn severity assessment is critical for understanding and managing these impacts. Burn severity, which is linked to fire intensity and organic matter, must be accurately mapped to assess hydrological impacts [6,7]. Misclassification, in which an area is assigned to the wrong burn severity class, can lead to incorrect predictions of effects on vegetation and soil. Accurate burn severity assessment in a burned watershed is therefore crucial for predicting consequences such as streamflow changes and for managing water budgets effectively [6,8]. Remote sensing offers a practical approach to mapping fire, enabling quantitative and qualitative evaluation of fire effects in burned areas.
Remote sensing is widely used for burn severity mapping [
9,
10]. Traditional methods rely on satellite images because they are cost-effective to obtain and cover large areas. The differences between pre-fire and post-fire images, taken before and immediately after the fire, help detect the fire perimeter, quantify the changes caused by fire, and determine burn severity [
11,
12]. According to several studies [
13,
14,
15,
16], commonly used indices derived from satellite imagery include Normalized Difference Vegetation Index (NDVI) [
17], Relativized Burn Ratio (RBR) [
18], Relative Differenced Normalized Burn Ratio (RdNBR) [
19], and Normalized Burn Ratio (NBR) [
20,
21]. The Differenced Normalized Burn Ratio (dNBR), derived from pre- and post-fire NBR images, is the primary spectral index used by the United States Department of Agriculture (USDA) Forest Service’s Remote Sensing Applications Center (RSAC) and the United States Geological Survey (USGS) Earth Resources Observation and Science (EROS) Center to produce Burned Area Reflectance Classification (BARC) maps. These maps serve as an input for the Burned Area Emergency Response (BAER) teams, who generate a soil burn severity map using thresholds derived from field-based measurements [22]. Although BARC products can be produced for any fire upon request, many smaller fires are not mapped, even though they can still produce ecological impacts in small watersheds. This limitation motivates the development of alternative approaches that can reliably map burn severity in areas not covered by existing burn severity mapping programs.
While satellite observations have benefits for monitoring wildfire and post-fire impacts, they also have limitations. Their coarser spatial and temporal resolution may not capture the detail needed to assess wildfire impacts on soil and vegetation, and cloud cover can obstruct the imagery. Unmanned aerial vehicles (UAVs) offer an alternative, providing high-resolution images at flexible times and altitudes. UAVs can be flown during or after a fire event to capture images when an analysis requires specific landscape information, whereas long satellite revisit intervals risk missing critical post-fire effects. However, UAVs also face challenges such as short battery life, limited spatial coverage, high costs, access issues, and safety regulations. Integrating UAV and satellite data therefore leverages the strengths of both technologies to overcome their individual limitations.
UAVs provide detailed site-specific post-fire landscape information, while satellites offer coverage of all burned areas, cost-effectiveness, and consistent long-term monitoring. Therefore, integrating limited UAV observations with satellite observations presents a promising approach to expanding the study of post-fire effects from a small scale to a watershed scale, capitalizing on the strengths of both technologies. Researchers have used satellite and UAV images for different applications. Khanal & Barber [
23] integrated UAV observations with Landsat [
24] observations to improve evapotranspiration estimation. Otsu et al. [
25] used different vegetation indices from Landsat and incorporated UAV observations to calibrate and successfully estimate the severity of forest defoliation in Spain. Although recent studies have integrated satellite and UAV images to create severity maps, such as defoliation severity maps, more research is needed to measure the efficacy of these methods against a field-based reference severity map. Field-based burn severity mapping has been used in combination with spectral indices to create burn severity thresholds [
26], but it can be costly, time-consuming, and labor-intensive. Covering large areas using a field-based measurement method can be challenging, making it difficult to establish accurate burn severity thresholds. With recent advancements in UAV-mounted thermal sensors, it is possible to capture relative soil surface temperatures in burned and unburned areas at high spatial resolution. Building on this capability, we introduce an approach that uses the relative differences in soil surface temperature, derived from the MicaSense Altum (MicaSense, Seattle, WA, USA) sensor capable of capturing multispectral and thermal images, combined with Landsat observations, to establish burn severity thresholds. While combining UAV and satellite imagery for wildfire applications is common, the innovation of this study lies in leveraging thermal data to map burn severity. Burn severity is strongly related to vegetation cover and soil exposure, which result in relative differences in land surface temperature between burned and unburned areas. UAV-derived thermal data can provide a reliable indicator for these effects, complementing satellite-based vegetation indices by capturing surface heating and exposed soil that may not be fully represented by vegetation reflectance alone. By integrating thermal data with satellite-derived indices, our approach captures both vegetation damage and soil exposure, aligning with the BAER fire product, which combines satellite-derived BARC products with field-based observations. Moreover, in cases where burn severity maps are unavailable from agencies, our method provides a practical alternative for generating these critical inputs, enabling post-fire streamflow assessments in data-limited regions.
Furthermore, to demonstrate the potential application of high-resolution UAV-Landsat burn severity maps in post-fire hydrologic assessments, we transferred the burn severity information to a neighboring watershed. The originally burned watershed was not suitable for post-fire hydrologic modeling because its large size and small burned area limited the evaluation of watershed-scale streamflow impacts. By leveraging machine learning, we integrated detailed topographical and ecological information, accounting for the influence of different factors on burn severity. This approach allows for scalable and transferable burn severity assessment, even in watersheds that may burn in the future, and can help anticipate the potential impacts of wildfire on watersheds and water resources, ultimately supporting the development of effective strategies by watershed managers.
Despite the widespread use of official burn severity products, many small or ecologically sensitive fires remain unmapped, and current approaches do not incorporate high-resolution thermal data to capture post-fire changes in vegetation and soil. By integrating UAV-derived thermal imagery with Landsat-derived dNBR, this study provides a novel, scalable, and transferable approach for generating burn severity maps without field calibration. This method not only enables burn severity mapping in the original burned areas but also facilitates diverse scenario testing for post-fire hydrologic impacts in the area of interest, addressing a critical gap in existing wildfire-related studies and providing practical guidance for water resource management.
The objectives of this study are to generate a high-resolution burn severity map using UAV-acquired thermal imagery integrated with Landsat observations and to explore the potential application in post-fire hydrologic modeling. This study addresses the following questions: (1) How effectively can the integration of UAV and Landsat data be used to generate burn severity maps without the need for field-measured data for severity thresholds? (2) Which features drive burn severity transfer across watersheds? (3) How can UAV-Landsat-derived burn severity maps be applied in a scalable and transferable way to effectively assess post-fire streamflow impacts? Our study highlights the application of combined UAV and satellite imagery for simulating post-fire hydrologic responses to wildfire, offering a scalable and cost-effective approach for burn severity mapping and watershed-scale post-fire streamflow prediction, even where official burn severity maps are unavailable.
2. Materials and Methods
2.1. Study Area
This study was conducted within the Beaver River watershed in the Tushar Mountains of south-central Utah. The study watershed (Figure 1) within the Beaver River watershed has an area of ~238 km² (92 square miles). The watershed includes rugged highlands and plains with dry summers, snowy winters, and a 2010–2020 mean annual precipitation of 686 mm [27]. The east-to-west-flowing Beaver River supports local agriculture. Soils are primarily gravelly loam (88%) and loam (10%) [28], while land cover is dominated by evergreen (60%), mixed forest (18%), and shrub (10%) [29]. For this study, we focused on the Thompson Ridge Fire in the Fishlake National Forest. The fire was ignited by a lightning strike on 4 August 2023 and burned ~29 km² of land in Beaver, Garfield, and Piute counties. The burned area primarily comprises gravelly loam, cobbly loam, and very cobbly loam soil, with vegetation of spruce fir, aspen-conifer mix, and mountain big sagebrush.
2.2. Justifications for Modeling Fire Effects in a Different Location
The Thompson Ridge Fire spanned the Sevier and Beaver River watersheds, with most of its perimeter in the Sevier watershed. The contributing area to the nearest downstream gage was ~2900 km², so the burned area made up < 1% of it and fire-induced hydrologic effects would be minimal at the outlet. Modeling the large, diversion-altered Sevier River with a high-resolution distributed model would also be computationally intensive. To address these constraints, the Thompson Ridge Fire impacts were imposed on a nearby Beaver River sub-basin with comparable soil and vegetation characteristics. This smaller watershed includes a continuously monitored outlet gage and is well suited for evaluating wildfire-driven streamflow changes via burn-severity transfer modeling. A detailed flowchart of the study design is shown in Figure 2.
2.3. UAV and Landsat Images Acquisition and Preprocessing
A DJI M210 V2 (Da-Jiang Innovations, Shenzhen, China) UAV equipped with a MicaSense Altum sensor collected multispectral (RGB, red edge, NIR) and thermal imagery for integration with Landsat data. Flight routes were designed by selecting burned areas within the fire perimeter. Flights were conducted on 18–20 October 2023 between 11:00 AM and 3:00 PM to keep sunlight exposure consistent. The average air temperatures during these days and times were relatively stable, recorded at 16.93 °C, 18.73 °C, and 16.5 °C at the Circleville station [30]. Ten flights were made over these days, covering ~1 km² within the fire perimeter. Before and after each flight, a few images of the calibration panel were taken for radiometric calibration. The selected 1 km² site encompassed the entire area within the fire perimeter authorized for UAV operation under the flight permit issued by the US Forest Service, Beaver Ranger District. Although limited in spatial extent, the flight areas were representative of the broader burned area because they encompassed gentle to steep terrain, forest- and shrub-dominated vegetation types, and all burn severity classes, which were necessary for assessing burn severity relationships across diverse topographic and vegetation gradients. Operational constraints, such as battery endurance and access restrictions, were managed through pre-flight planning and field logistics. Each flight lasted ~20–25 min, and multiple batteries were used and recharged on-site with a portable generator. Because of the uneven terrain, UAV flight heights ranged from 70 m to 115 m above the ground, with lower altitudes selected for flat areas and higher ones for steep terrain. The pixel size of the images therefore varied with altitude.
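The dependence of pixel size on flight height can be illustrated with a short sketch. It assumes that ground sample distance (GSD) scales linearly with height above ground for a fixed lens and sensor. The reference values below (5.2 cm multispectral and 81 cm thermal at 120 m) are nominal figures assumed for illustration only and should be checked against the sensor datasheet; the function name is hypothetical.

```python
# Nominal ground sample distances at a reference altitude. These numbers are
# ASSUMED for illustration and should be verified against the sensor datasheet.
REF_ALTITUDE_M = 120.0
REF_GSD_CM = {"multispectral": 5.2, "thermal": 81.0}


def ground_sample_distance_cm(altitude_m: float, band: str) -> float:
    """Scale the reference GSD linearly with flight height above ground."""
    if altitude_m <= 0:
        raise ValueError("altitude must be positive")
    return REF_GSD_CM[band] * altitude_m / REF_ALTITUDE_M


if __name__ == "__main__":
    for altitude in (70.0, 115.0):  # flight-height range reported in the text
        ms = ground_sample_distance_cm(altitude, "multispectral")
        th = ground_sample_distance_cm(altitude, "thermal")
        print(f"{altitude:.0f} m: multispectral {ms:.1f} cm, thermal {th:.1f} cm")
```

Under these assumed values, the thermal pixels are more than an order of magnitude coarser than the multispectral pixels at any flight height, which is why the two products need to be aligned before plot-level temperatures are extracted.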
We preprocessed raw images from the UAV using Agisoft Metashape Professional software (version 1.8.3) to produce a composite image with a 3 cm spatial resolution, corresponding to the spatial resolution of the multispectral imagery. When the image pixels are larger than the area of interest, temperature interference can cause errors in temperature estimates because the pixels contain mixed temperature information [
31]. Since MicaSense Altum captures multispectral imagery at a finer resolution than thermal imagery, we resampled the thermal band to the spatial resolution of the multispectral band to align it spatially. Previous studies have resampled thermal imagery captured with the MicaSense Altum sensor using Agisoft Metashape Professional to match the spatial resolution of multispectral imagery. These studies demonstrated that the resampled thermal data produced reliable temperature estimates, validated by field-based temperature measurements conducted over a crop field and a water body [
32,
33]. Furthermore, we used ground reference points from Google Earth Pro to georeference the orthomosaics using a first-order polynomial transformation in ArcGIS Pro (version 3.3.0). We used pre-fire and post-fire Landsat images from 15 July 2023 and 25 September 2023, respectively, to calculate the change in NBR. Detailed information on the timing of UAV image acquisition is shown in Table 1.
2.4. Burn Severity Mapping
2.4.1. Spectral Index for Burn Severity Mapping
Multispectral satellite images are commonly used to create burn severity maps [
34]. In our study, we used the dNBR to generate a burn severity map using bi-temporal Landsat images (pre- and post-fire) within the fire perimeter. We chose the dNBR as the spectral index for burn severity mapping because it is the index used to generate the soil burn severity map [
22]. Our goal was to assess the accuracy of the UAV-Landsat approach for burn severity mapping, so we used the same index and spatial resolution (30 m) to produce a burn severity map consistent with the BAER fire product, available in the burn severity portal [
22]. The response of unburned and burned areas to light absorption and reflection in specific spectral regions differs. For instance, healthy plants reflect a higher percentage of near-infrared light, while unhealthy plants show higher reflectance in the visible region [
35]. We determined the dNBR value for each pixel within the fire perimeter by subtracting the post-fire NBR from the pre-fire NBR, as in Equation (1):
$$\mathrm{dNBR} = \mathrm{NBR}_{\text{pre-fire}} - \mathrm{NBR}_{\text{post-fire}} \qquad (1)$$
where NBR is the Normalized Burn Ratio and dNBR is the Differenced Normalized Burn Ratio.
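As a minimal sketch of the dNBR calculation described for Equation (1), the code below computes NBR from near-infrared and shortwave-infrared reflectance and differences the pre- and post-fire values. It assumes the standard NBR definition, (NIR − SWIR2) / (NIR + SWIR2); for Landsat 8/9 OLI these would be bands 5 and 7, which is an assumption because the text does not name the bands. The function names and sample reflectances are illustrative.

```python
import numpy as np


def nbr(nir, swir2):
    """Normalized Burn Ratio: (NIR - SWIR2) / (NIR + SWIR2)."""
    nir = np.asarray(nir, dtype=float)
    swir2 = np.asarray(swir2, dtype=float)
    denom = nir + swir2
    out = np.full(denom.shape, np.nan)  # NaN where the ratio is undefined
    np.divide(nir - swir2, denom, out=out, where=denom != 0)
    return out


def dnbr(nir_pre, swir2_pre, nir_post, swir2_post):
    """Differenced NBR: pre-fire NBR minus post-fire NBR."""
    return nbr(nir_pre, swir2_pre) - nbr(nir_post, swir2_post)


if __name__ == "__main__":
    # Two illustrative pixels: the first loses most of its NIR reflectance
    # after the fire, the second is almost unchanged.
    values = dnbr(
        nir_pre=[0.40, 0.35], swir2_pre=[0.10, 0.12],
        nir_post=[0.15, 0.33], swir2_post=[0.30, 0.13],
    )
    print(np.round(values, 3))
```

Larger positive dNBR values indicate a larger drop in NBR and therefore a more severe burn under this convention.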
2.4.2. Land Surface Temperature Analyses
We manually delineated 150 square plots, each approximately 1 m × 1 m, within 50 randomly selected Landsat pixels to analyze land surface temperature using the UAV thermal band observations. We created these plots to ensure that land surface temperature was extracted from representative burned and unburned areas. The UAV flights covered both flat and steep terrain, although most of the area was characterized by relatively gentle slopes, and included both forested and shrub-dominated regions. Because the area available for UAV flights was limited, we did not differentiate plots by topographic characteristics such as elevation, slope, or aspect. However, the plots were distributed across diverse terrain and vegetation types, capturing the range of conditions in both burned and unburned areas. Manually created polygons on UAV images have previously been used to extract information for determining defoliation severity thresholds [
25]. Char was used to distinguish between burned and unburned areas. Furthermore, the multispectral images allowed us to identify areas unaffected by shadows, and the plots were created in these shadow-free areas. The plot size was fixed at 1 m × 1 m because the shrub cover was dense and we needed to ensure that temperature was extracted from the soil surface only. For consistency, we used the same plot size in forest-dominated areas. We randomly selected Landsat pixels so that the sample included unburned, low, moderate, and high burn severity classes, based on the soil burn severity map (referred to as the reference map throughout) obtained from the burn severity portal [
22]. We used the BAER fire product for burn severity, which uses the BARC map to produce the field-validated soil burn severity map within the fire perimeter. We specifically chose this fire product since we acquired the UAV images within a few weeks of containment, when vegetation regrowth had not yet begun, as observed during the field visit. Three plots were created within each Landsat pixel. The average temperature within the plot was calculated by averaging the pixel values from the thermal image. The land surface temperature in each pixel was obtained using Equation (2) [
36],
The average land surface temperature within a Landsat pixel was obtained by averaging the temperature estimates of the three plots located within that pixel.
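The plot-to-pixel averaging described above can be sketched as follows. Equation (2) is not reproduced in the text, so the conversion used here is an assumption: it treats the thermal band values as centi-Kelvin and converts them to degrees Celsius as value / 100 − 273.15. All function names and sample values are hypothetical.

```python
import numpy as np


def centikelvin_to_celsius(dn):
    """ASSUMED conversion: thermal values stored in centi-Kelvin -> deg C."""
    return np.asarray(dn, dtype=float) / 100.0 - 273.15


def plot_mean_temperature(thermal_dn_plot):
    """Mean surface temperature (deg C) of the thermal pixels in one plot."""
    return float(np.nanmean(centikelvin_to_celsius(thermal_dn_plot)))


def landsat_pixel_temperature(plots):
    """Average the plot means of the plots that fall in one Landsat pixel."""
    return float(np.mean([plot_mean_temperature(p) for p in plots]))


if __name__ == "__main__":
    # Three illustrative 1 m x 1 m plots inside a single 30 m Landsat pixel.
    plots = [
        np.full((3, 3), 30315),  # about 30 deg C
        np.full((3, 3), 30515),  # about 32 deg C
        np.full((3, 3), 30415),  # about 31 deg C
    ]
    print(round(landsat_pixel_temperature(plots), 2))
```

Averaging plot means, instead of pooling all thermal pixels, gives each of the three plots equal weight regardless of how many thermal pixels it contains.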
Land surface temperatures from the 150 plots were pooled, and temperature thresholds for burn severity classification were set at the 25th, 50th, and 75th percentiles. Pixels below the 25th percentile temperature were classified as unburned, those between the 25th and 50th percentiles as low severity, those between the 50th and 75th percentiles as moderate severity, and those above the 75th percentile as high severity. We adopted percentile-based stratification because higher post-fire land surface temperatures are associated with higher fuel consumption, reduced canopy and litter cover, increased soil exposure to sunlight, and reduced evapotranspiration. This relationship between elevated land surface temperature and increased burn severity was demonstrated by Vlassova et al. [
37]. Percentile-based classification avoids fixed cut-off temperature values that may not reflect site variability. Linear, quadratic, and cubic polynomial regression models were used to analyze the relationship between land surface temperature and the dNBR. The coefficient of determination (R²) was used to evaluate the fit of the following models (Equations (3)–(5)).
where y is land surface temperature, and S is the spectral index (dNBR).
With the temperature thresholds known, thresholds for the dNBR-based burn severity were calculated from the best-fit equation.
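The threshold-transfer procedure can be sketched in three steps: derive percentile temperature thresholds, fit polynomial models of land surface temperature against dNBR, and solve the best-fit model for the dNBR value at each temperature threshold. This is a sketch under stated assumptions: the models are taken to be ordinary least-squares polynomials of degree 1 to 3 in S, the best fit is chosen by R², and the root-finding step and all function names are illustrative choices that are not described in the text.

```python
import numpy as np

SEVERITY_LABELS = np.array(["unburned", "low", "moderate", "high"])


def temperature_thresholds(plot_temps):
    """25th, 50th, and 75th percentile temperatures of the pooled plots."""
    return np.percentile(plot_temps, [25, 50, 75])


def classify_by_temperature(temp, thresholds):
    """Map temperatures to severity classes using the percentile cut-offs."""
    return SEVERITY_LABELS[np.digitize(temp, thresholds)]


def best_polynomial(dnbr, lst, degrees=(1, 2, 3)):
    """Fit y = p(S) for each degree; return (degree, R2, coeffs) with max R2."""
    dnbr = np.asarray(dnbr, dtype=float)
    lst = np.asarray(lst, dtype=float)
    ss_tot = np.sum((lst - lst.mean()) ** 2)
    best = None
    for deg in degrees:
        coeffs = np.polyfit(dnbr, lst, deg)
        ss_res = np.sum((lst - np.polyval(coeffs, dnbr)) ** 2)
        r2 = 1.0 - ss_res / ss_tot
        if best is None or r2 > best[1]:
            best = (deg, r2, coeffs)
    return best


def dnbr_at_temperature(coeffs, temp, dnbr_range):
    """Solve p(S) = temp for S, keeping the smallest real root in range."""
    shifted = np.array(coeffs, dtype=float)
    shifted[-1] -= temp
    roots = np.roots(shifted)
    real = roots[np.abs(roots.imag) < 1e-9].real
    low, high = dnbr_range
    valid = real[(real >= low) & (real <= high)]
    return float(valid.min()) if valid.size else float("nan")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s = rng.uniform(0.0, 1.0, 50)                # illustrative dNBR values
    y = 20.0 + 15.0 * s + rng.normal(0, 0.5, 50)  # illustrative LST (deg C)
    cuts = temperature_thresholds(y)
    degree, r2, coeffs = best_polynomial(s, y)
    print(degree, round(r2, 3))
    print([round(dnbr_at_temperature(coeffs, t, (0.0, 1.0)), 3) for t in cuts])
```

One caveat of selecting by R² alone is that, for nested least-squares polynomials fitted to the same data, R² cannot decrease as the degree increases, so a higher-degree model is favored by construction; an adjusted R² or a held-out check would be a more conservative selection criterion.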
2.4.3. Validation of UAV-Landsat-Based Burn Severity Mapping
A total of 32,449 pixels from the reference map were used as a benchmark to validate the burn severity map derived from the UAV-Landsat integration approach. Accuracy was assessed using a pixel-based confusion matrix. We calculated the overall accuracy (OA) and the kappa coefficient (K) to evaluate the overall burn severity classification. In addition, precision, recall, and the F1 score were computed to evaluate class-wise performance. The F1 score is often preferred for performance evaluation because it gives weight to both precision and recall. The equations (Equations (6)–(10)) used to calculate these metrics are presented here.
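The metrics named above can be computed from a confusion matrix as in the following sketch. It uses the standard textbook definitions of overall accuracy, Cohen's kappa, precision, recall, and F1, which are assumed to match Equations (6)–(10); rows are taken as reference classes and columns as predicted classes, and the example matrix is invented for illustration.

```python
import numpy as np


def classification_metrics(cm):
    """Overall and class-wise metrics from a square confusion matrix.

    Rows are reference (true) classes; columns are predicted classes.
    """
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    tp = np.diag(cm)                # correctly classified pixels per class
    ref_totals = cm.sum(axis=1)     # pixels per reference class
    pred_totals = cm.sum(axis=0)    # pixels per predicted class

    oa = tp.sum() / n                               # overall accuracy
    pe = np.sum(ref_totals * pred_totals) / n ** 2  # chance agreement
    kappa = (oa - pe) / (1.0 - pe)

    with np.errstate(divide="ignore", invalid="ignore"):
        precision = tp / pred_totals
        recall = tp / ref_totals
        f1 = 2.0 * precision * recall / (precision + recall)

    return {"oa": oa, "kappa": kappa,
            "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Illustrative two-class matrix (not data from this study).
    result = classification_metrics([[50, 10], [5, 35]])
    print(round(result["oa"], 3), round(result["kappa"], 3))
    print(np.round(result["f1"], 3))
```

Classes that are never predicted or never present produce NaN for the affected class-wise metrics instead of raising an error, which makes empty classes easy to spot.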
2.5. Methodology to Impose Burn Severity in the Study Area
Burn severity often correlates with topographic factors such as slope, elevation, and aspect [
38,
39]. To preserve spatial realism when relocating the fire for hydrologic analysis, we used a data-driven approach to project the effects of the fire onto the study area. This approach is consistent with transfer modeling, widely used in ecological and climate change studies, whereby models developed in one region are applied to other areas or environmental conditions [
40]. We used the Random Forest classification [
41] algorithm due to its widespread use in wildfire burn severity prediction and its ability to deliver high classification accuracy [
42]. We trained two separate Random Forest classification models using spatial and topographic features, fire information, land cover, soil, spectral bands, vegetation indices, and fuel conditions as predictors from the Thompson Ridge Fire area. Detailed information on features that were used to train the Random Forest classifier models is presented in
Table 2. The response variable was burn severity, taken from the UAV-Landsat map for one model and from the reference map for the other [
22]. The Thompson Ridge Fire area and the study region are situated within the Wasatch and Uinta Mountains Ecoregion, characterized by similar vegetation types, climate, and elevation ranges, which supports ecological similarity and reduces the probability of environmental extrapolation during severity transfer.
We used the RandomForestClassifier from the scikit-learn library for training, testing, and prediction. The model setup followed the same procedure for both classification models. We calculated the correlation coefficient between each variable and the target variable for both models, and the values ranged from −0.58 to 0.58. To reduce model complexity and computational time, we removed features whose correlation coefficient exceeded 0.7 in absolute value, indicating strong correlation [
45], resulting in a final set of 23 features (latitude, longitude, elevation, Enhanced Vegetation Index, northness, aspect, BAND5, slope, Euclidean distance to drainage, Topographic Wetness Index, ruggedness, curvature (planar), curvature (profile), canopy bulk density, fuel type, fire behavior model, succession class, canopy height, land cover, canopy base height, soil, historical disturbance type, and historical disturbance severity). Although tree-based models like the Random Forest classifier handle multicollinearity well [
46], the dataset reduction step improved efficiency. We split the dataset from the Thompson Ridge Fire area into 80% training and 20% testing subsets, covering all severity classes. We tuned the hyperparameters using GridSearchCV with 3-fold cross-validation on the training dataset to improve model robustness. The tuned hyperparameters included the number of trees in the forest (n_estimators: 5 to 90 in intervals of 5), maximum number of terminal nodes in each tree (max_leaf_nodes: 10, 50, 100, 200), maximum depth of each decision tree (max_depth: 5 to 30 in intervals of 5), minimum number of samples required to split an internal node (min_samples_split: 2, 5, 10, 20, 30), minimum number of samples required at a leaf node (min_samples_leaf: 2, 5, 10, 20, 30), and number of features to consider for the best split (max_features: auto, sqrt, log2). The optimized hyperparameters for the models were as follows:
For the model based on burn severity from the reference map as a target variable, the best parameters were as follows: {'max_depth': 20, 'max_features': 'auto', 'max_leaf_nodes': 200, 'min_samples_leaf': 5, 'min_samples_split': 20, 'n_estimators': 65}.
For the model based on burn severity from the UAV-Landsat map as a target variable, the best parameters were as follows: {'max_depth': 15, 'max_features': 'auto', 'max_leaf_nodes': 200, 'min_samples_leaf': 5, 'min_samples_split': 30, 'n_estimators': 55}.
After training, the models were applied to the Beaver River watershed to produce spatially realistic burn severity maps, simulating fire impacts using data specific to the watershed.
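The tuning workflow described in this section can be sketched with scikit-learn as follows. This is an illustration on synthetic data, not the study's pipeline: the stratified split, the random seed, and the reduced demonstration grid are assumptions, and the helper names are hypothetical. The grid transcribed from the text is included for reference, with one change: recent scikit-learn releases (1.3 and later) no longer accept max_features='auto', which for classifiers was equivalent to 'sqrt', so only 'sqrt' and 'log2' are listed.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Search space transcribed from the text ('auto' omitted, see lead-in).
GRID_FROM_TEXT = {
    "n_estimators": list(range(5, 95, 5)),
    "max_leaf_nodes": [10, 50, 100, 200],
    "max_depth": list(range(5, 35, 5)),
    "min_samples_split": [2, 5, 10, 20, 30],
    "min_samples_leaf": [2, 5, 10, 20, 30],
    "max_features": ["sqrt", "log2"],
}

# Much smaller grid so that the demonstration runs in seconds.
DEMO_GRID = {
    "n_estimators": [10, 20],
    "max_depth": [5, 10],
    "max_features": ["sqrt", "log2"],
}


def make_synthetic_data(n=400, seed=0):
    """Synthetic predictors and four severity classes (illustration only)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 5))
    signal = X[:, 0] + 0.3 * rng.normal(size=n)
    y = np.digitize(signal, [-0.7, 0.0, 0.7])  # classes 0..3
    return X, y


def tune_random_forest(X, y, param_grid, seed=0):
    """80/20 split, 3-fold grid search on the training part, test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    search = GridSearchCV(
        RandomForestClassifier(random_state=seed), param_grid, cv=3
    )
    search.fit(X_train, y_train)
    return search.best_estimator_, search.best_params_, search.score(X_test, y_test)


if __name__ == "__main__":
    X, y = make_synthetic_data()
    model, params, accuracy = tune_random_forest(X, y, DEMO_GRID)
    print(params, round(accuracy, 3))
```

The full grid from the text contains tens of thousands of parameter combinations, each fitted three times, so a randomized search over the same space could be a cheaper alternative when compute is limited.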
2.6. Hydrologic Model Description
The Distributed Hydrology Soil Vegetation Model (DHSVM) [
47] was selected to simulate wildfire effects on watershed hydrology due to its grid-based representation of soil and vegetation for both pre- and post-fire conditions. DHSVM simulates water and energy balance at fine spatial and temporal scales and has been widely applied in forested basins [
48,
49,
50,
51]. The Digital Elevation Model (DEM) grid determines the spatial scale, and each grid cell is assigned a vegetation type, soil type, and soil depth. Details of the DHSVM and its mechanisms can be found in Wigmosta et al. [
47]. The meteorological data required to run the model are air temperature, precipitation, relative humidity, incoming shortwave and longwave radiation, and wind speed, and must be defined for each time step.
2.7. Data for the DHSVM
2.7.1. Spatial Inputs
The DEM representing the watershed was obtained from the United States Geological Survey (USGS) dataset [
43] at a resolution of 30 m. We used the DEM to delineate the watershed and generate a mask file using a flow accumulation raster developed using geographic information system processing. The land cover classification at 30 m resolution was obtained from the National Land Cover Dataset (NLCD) [
29] to classify the land cover types within the watershed. We obtained soil surface texture information from the Gridded National Soil Survey Geographic Database (gNATSGO) [
28].
2.7.2. Meteorological Inputs
Because no meteorological station within the study watershed provides all the datasets required for hydrological modeling, we selected nine representative locations (synthetic stations), considering slope, aspect, and elevation. These points were used to extract gridded meteorological data and to compute radiation inputs for model simulations. The daily minimum and maximum air temperatures, precipitation, and wind speeds for these synthetic stations were obtained from the University of Idaho Gridded Surface Meteorological Dataset (GRIDMET) [
52]. The GRIDMET provides climate data at a resolution of ~4 km, which blends Parameter-elevation Regressions on Independent Slopes Model (PRISM) data [
53] with data from the North American Land Data Assimilation System (NLDAS-2). The GRIDMET data have been used in hydrological modeling studies in the United States [
54]. As a baseline for disaggregating temperature and precipitation data from daily to hourly resolution, we examined hourly precipitation and temperature patterns at two meteorological stations, Beaver station within the watershed and Manderfield station near it, over five years (2019–2023), the only period for which Manderfield data were available [
30]. Two of the nine synthetic stations were positioned near the Manderfield station, and seven near the Beaver station. For simplicity, we assumed that precipitation and temperature at each synthetic station followed the hourly pattern of the nearby observed station. Daily-to-hourly precipitation disaggregation was therefore based on the precipitation patterns of the Manderfield station for two synthetic stations and of the Beaver station for the remaining seven. Similarly, we analyzed hourly temperature patterns to determine when the daily maximum and minimum temperatures occurred. At both stations, the daily minimum temperature occurred at 6 AM and the maximum at 3 PM. Temperature was assumed to increase linearly from 6 AM to 3 PM and decrease linearly from 3 PM to 6 AM. Wind speed was available at daily resolution, and we assigned the daily average wind speed to each hour. The hourly relative humidity was calculated from the hourly dew point temperature and hourly air temperature, using the daily minimum temperature as the dew point temperature. Similar assumptions have been made to disaggregate temperature and precipitation data from daily to sub-daily resolution and to calculate relative humidity in hydrological modeling studies [
55,
56,
57]. However, disaggregating meteorological data from one temporal resolution to another under such assumptions may not accurately represent the true behavior of natural phenomena. Observed hourly incoming shortwave and longwave radiation were not available in the watershed, so both were calculated using empirical approaches. Incoming longwave radiation was calculated using the Stefan-Boltzmann constant, air temperature, fraction of cloudless sky, and atmospheric emissivity [
58]. Incoming shortwave radiation was estimated as a function of irradiance [
59] and transmittance using daily air temperature range [
60].
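The temperature and humidity disaggregation described in this section can be sketched as follows. The piecewise-linear temperature curve follows the text (minimum at 6 AM, maximum at 3 PM). The handling of the falling limb across midnight, which uses the previous day's maximum and the next day's minimum, and the Magnus approximation for saturation vapor pressure are assumptions, since the text does not state them; the function names are hypothetical.

```python
import math


def hourly_temperature(hour, tmin, tmax, prev_tmax, next_tmin):
    """Piecewise-linear air temperature (deg C) for an hour in 0..23.

    Minimum at 06:00 and maximum at 15:00. The falling limb lasts 15 h,
    from 15:00 to 06:00 on the following day.
    """
    if hour < 6:    # falling limb that began at 15:00 on the previous day
        return prev_tmax + (tmin - prev_tmax) * (hour + 9) / 15.0
    if hour <= 15:  # rising limb from 06:00 to 15:00
        return tmin + (tmax - tmin) * (hour - 6) / 9.0
    # falling limb toward the next day's 06:00 minimum
    return tmax + (next_tmin - tmax) * (hour - 15) / 15.0


def saturation_vapor_pressure_hpa(temp_c):
    """Magnus approximation (an ASSUMED formula, not given in the text)."""
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))


def relative_humidity(temp_c, dewpoint_c):
    """Relative humidity (%) from air and dew point temperature, capped at 100."""
    ratio = saturation_vapor_pressure_hpa(dewpoint_c) / saturation_vapor_pressure_hpa(temp_c)
    return min(100.0 * ratio, 100.0)


if __name__ == "__main__":
    tmin, tmax = 5.0, 20.0  # illustrative daily values
    for hour in (0, 6, 12, 15, 23):
        temp = hourly_temperature(hour, tmin, tmax, prev_tmax=18.0, next_tmin=7.0)
        # Dew point is set to the daily minimum temperature, as in the text.
        print(hour, round(temp, 2), round(relative_humidity(temp, tmin), 1))
```

Because the dew point is fixed at the daily minimum temperature, relative humidity reaches 100% at 6 AM and falls to its daily low at 3 PM under this scheme.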
2.8. Calibration and Validation
We calibrated and validated the hydrologic model (DHSVM) manually by trial and error, matching simulated to observed streamflow at the watershed outlet (USGS 10234500) and Snow Water Equivalent (SWE) at the Merchant Valley Snow Telemetry (SNOTEL) site located within the study area [61]. We ran the model at an hourly time step and at the 30 m × 30 m spatial resolution of the DEM. We assumed a linear lapse rate for temperature estimation at different elevations. Because precipitation, unlike temperature, cannot be assumed to follow a linear trend, we used monthly normal precipitation maps (800 m resolution, 30-year monthly normals for 1991–2020) from PRISM to distribute station precipitation data to model grid cells.
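The two distribution steps described above can be illustrated with a short sketch. It assumes that temperature is adjusted by a constant lapse rate times the elevation difference and that station precipitation is scaled by the ratio of the PRISM monthly normal at the grid cell to the normal at the station. The lapse rate of −6.5 °C per km is a placeholder, because the calibrated value is not given in this section, and the ratio scaling is an assumed reading of how the normals are applied; the function names are hypothetical.

```python
def temperature_at_elevation(t_station_c, z_station_m, z_cell_m,
                             lapse_c_per_m=-0.0065):
    """Adjust station temperature to a cell elevation with a linear lapse rate.

    The default lapse rate is a PLACEHOLDER value, not the study's value.
    """
    return t_station_c + lapse_c_per_m * (z_cell_m - z_station_m)


def precipitation_at_cell(p_station_mm, prism_station_mm, prism_cell_mm):
    """Scale station precipitation by the ratio of PRISM monthly normals."""
    if prism_station_mm <= 0:
        return 0.0  # no climatological precipitation to scale from
    return p_station_mm * prism_cell_mm / prism_station_mm


if __name__ == "__main__":
    # Illustrative cell 1000 m above its station, with a wetter monthly normal.
    print(temperature_at_elevation(10.0, 2000.0, 3000.0))
    print(precipitation_at_cell(2.0, prism_station_mm=50.0, prism_cell_mm=75.0))
```

Ratio scaling preserves the timing of station precipitation while letting the long-term spatial pattern in the normals set the relative amounts across the watershed.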
Initial model calibration parameter values were obtained from literature [
49,
55,
62,
63,
64,
65,
66,
67]. The sensitive parameters during model calibration and validation were selected based on Du et al. [
62]. Sensitive parameters were adjusted manually until the simulated streamflow and the SWE matched the observations.
Table 3 summarizes the key soil, vegetation, and model constant parameters and their values used in the hydrologic model to control streamflow and SWE. The detailed descriptions of parameters presented in
Table 3 are provided in Wigmosta et al. [
47].
Overall, the calibrated parameters remained within physically plausible ranges, based on similar mountainous, forested watersheds. However, some adjustments were necessary to account for watershed heterogeneity, the availability of coarser soil classification data, and the broad vegetation classification used in this study (such as evergreen, mixed, and deciduous forests, rather than species-based classification), to achieve realistic simulations of SWE and streamflow.
The lateral saturated hydraulic conductivity was within the initial literature-derived range reported for forested mountainous catchments in the western US [
62,
67]. Hydraulic conductivity is highly variable across watersheds due to differences in soil structure, macropore flow, and root density. To capture these dynamics, an exponential decrease parameter, which determines how quickly hydraulic conductivity changes with depth, was used following Hasan et al. [
49] and WPNLLC [
67], ensuring realistic lateral movement of water. Similarly, soil porosity, field capacity, and wilting point values were obtained from Hasan et al. [
49], Du et al. [
62], and WPNLLC [
67]. The final calibrated values were consistent with those typically reported in forested mountainous soils in the western US. These parameters effectively control soil water storage and availability for evapotranspiration processes.
The minimum stomatal resistance was another sensitive vegetation parameter refined during the calibration process. The final calibrated values for the overstory and understory were slightly lower than the literature ranges [
55,
65,
66], while the maximum stomatal resistance for both overstory and understory was within the acceptable limits [
55,
62,
66]. However, the minimum stomatal resistance value for the overstory falls within the broader range (70–500 s/m) reported in various studies, and slight refinement was required for the understory to maintain base flow [
62]. Furthermore, stomatal resistance is difficult to upscale from field-level measurements to the watershed scale, making it difficult to represent accurately in distributed hydrologic models.
The soil moisture threshold defines the soil moisture content above which transpiration is not limited. This value corresponds to approximately 50% of the field capacity value used during the calibration process, which aligns with the range (50–80%) reported by Maidment [
68] as described in the vegetation parameters descriptions document (
www.pnnl.gov). Such a threshold is realistic for semiarid forest soils and effectively represents vegetation response to available soil moisture.
The fractional vegetation coverage typically ranges from 0 to 1, and we selected 0.4 to 0.6 to represent mixed-density vegetation cover, which is typically found in semiarid mountainous watersheds, where vegetation density varies considerably across elevation, aspect, and slope. The snow interception efficiency, which governs the proportion of snowfall intercepted by vegetation cover, was maintained within the range reported for coniferous forests [
66,
67]. The rain and snow temperature thresholds were adopted from Sun et al. [
64], who reported that thresholds of −4 to 1 °C for rain and 0 to 6.5 °C for snow are commonly used in snow process modeling, suggested 4.2 °C as a plausible snow threshold temperature for the Wasatch and Uinta Mountains region, and noted that the rain threshold temperature was not a sensitive parameter. In this study, the snow threshold was slightly higher, reflecting the need to adjust for potential biases in the gridded meteorological input data compared to in situ temperature measurements.
We ran the model for two water years (WYs), 1994 and 1995 (1 October 1993 to 30 September 1995), for spin-up and excluded the simulated streamflow and SWE for this period from the analyses. Spin-up allowed the model to initialize its state variables, which ensured that streamflow simulation started from a representative watershed condition. For the calibration period, we selected five consecutive water years, from WY 1996 (1 October 1995 to 30 September 1996) to WY 2000 (1 October 1999 to 30 September 2000), which included at least one year with average annual precipitation, one year with below-average annual precipitation, and one year with above-average annual precipitation [
27]. For the validation period, we selected the five consecutive water years following the calibration period, WY 2001 (1 October 2000 to 30 September 2001) to WY 2005 (1 October 2004 to 30 September 2005). The simulated streamflow and SWE were assessed against the observed data using performance metrics (
Section 2.9).
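The split into spin-up, calibration, and validation periods can be expressed compactly in code. The sketch below is illustrative only; the function and dictionary names are hypothetical, and the only facts taken from the text are the water-year definition (1 October to 30 September) and the three year ranges.

```python
from datetime import date


def water_year(d):
    """Return the water year of a date (1 October to 30 September, labeled by the ending year)."""
    return d.year + 1 if d.month >= 10 else d.year


# Year ranges taken from the text; range() excludes its upper bound.
PERIODS = {
    "spin-up": range(1994, 1996),      # WY 1994 to WY 1995
    "calibration": range(1996, 2001),  # WY 1996 to WY 2000
    "validation": range(2001, 2006),   # WY 2001 to WY 2005
}


def period_of(d):
    """Return the simulation period a date belongs to, or None if outside all periods."""
    wy = water_year(d)
    for name, years in PERIODS.items():
        if wy in years:
            return name
    return None
```

With these definitions, 30 September 1995 falls in the spin-up period and 1 October 1995 is the first day of the calibration period.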
We calibrated and validated the hydrologic model using data from WY 1996 to WY 2005. Our primary objective was to evaluate the effectiveness of an integrated UAV-Landsat burn severity mapping approach in capturing post-fire streamflow responses, relative to a reference burn severity map. Assessing burn severity impacts over this period and location does not compromise the validity of our analyses, because the focus was on comparing the derived map’s ability to represent streamflow impacts rather than on reproducing a specific historical event.
2.9. Model Evaluation Criteria
The simulated hourly streamflow at the watershed outlet was averaged to obtain daily and monthly mean flows for each water year during the calibration and validation periods. Similarly, the simulated hourly SWE was averaged to obtain daily mean SWE. Three statistical indicators, Nash–Sutcliffe Efficiency (NSE), Kling–Gupta Efficiency (KGE), and Percent Bias (PBIAS) (Equations (11)–(13)), were used to evaluate model performance in simulating streamflow and SWE. These criteria were used to minimize the error between simulated and observed streamflow and SWE.
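The aggregation of hourly model output to daily and monthly means can be sketched as follows. This is a minimal illustration using only the standard library; the function names and the input layout (a list of timestamp and value pairs) are assumptions, not the study's actual processing code.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def mean_by_key(hourly, key):
    """Average hourly (timestamp, value) pairs within groups defined by key(timestamp)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, value in hourly:
        k = key(timestamp)
        sums[k] += value
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}


def daily_means(hourly):
    """Return {date: mean of that day's hourly values}."""
    return mean_by_key(hourly, lambda t: t.date())


def monthly_means(hourly):
    """Return {(year, month): mean of that month's hourly values}."""
    return mean_by_key(hourly, lambda t: (t.year, t.month))


# Two synthetic days of hourly flow: 1.0 on the first day, 3.0 on the second.
start = datetime(1995, 10, 1)
example = [(start + timedelta(hours=h), 1.0 if h < 24 else 3.0) for h in range(48)]
```

Here the monthly mean is computed directly from the hourly values rather than from the daily means; the two agree when every day has a complete set of 24 values.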
2.9.1. Nash–Sutcliffe Efficiency (NSE)
In our evaluation, we used the NSE, a metric proposed by Nash et al. [
69], which measures model quality in terms of how well the variance of the observations is represented. An NSE of 1 indicates a perfect fit between simulated and observed values, whereas a negative value indicates that the model performs worse than the mean of the observations used as a predictor.
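$$\mathrm{NSE} = 1 - \frac{\sum_{i}(o_i - s_i)^2}{\sum_{i}(o_i - \bar{o})^2} \tag{11}$$
where $o_i$ are observed values, $s_i$ are simulated values at the $i$th time step, and $\bar{o}$ is the mean observed value.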
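As a minimal sketch, the NSE can be computed from paired observed and simulated series as below. The function name and the use of plain Python lists are illustrative choices.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 minus squared error relative to variance about the observed mean."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance_term = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / variance_term
```

A simulation that always returns the observed mean gives an NSE of exactly 0, which is why negative values indicate performance worse than the mean.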
2.9.2. Kling–Gupta Efficiency (KGE)
The KGE is an expression of Euclidean distance described by bias, variability, and correlation [
70]. A KGE of 1 indicates a perfect match between simulated and observed values, and a value greater than −0.41 indicates better performance than the mean of the observations [
71].
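$$\mathrm{KGE} = 1 - \sqrt{(r - 1)^2 + (\alpha - 1)^2 + (\beta - 1)^2} \tag{12}$$
where $\alpha$ is the variability ratio, $\beta$ is the bias, and $r$ is the correlation.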
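A minimal sketch of the KGE calculation is given below. It assumes the commonly used form in which the variability ratio is the ratio of simulated to observed standard deviations and the bias is the ratio of simulated to observed means; the text does not spell out these definitions, so they are assumptions here, as are the function and variable names.

```python
import math


def kge(observed, simulated):
    """Kling-Gupta Efficiency from correlation, variability ratio, and bias ratio.

    Assumes variability ratio = std(sim) / std(obs) and bias = mean(sim) / mean(obs).
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in observed) / n)
    std_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in simulated) / n)
    covariance = sum((o - mean_obs) * (s - mean_sim)
                     for o, s in zip(observed, simulated)) / n
    correlation = covariance / (std_obs * std_sim)
    variability_ratio = std_sim / std_obs
    bias_ratio = mean_sim / mean_obs
    return 1.0 - math.sqrt((correlation - 1.0) ** 2
                           + (variability_ratio - 1.0) ** 2
                           + (bias_ratio - 1.0) ** 2)
```

The function is undefined when either series is constant (zero standard deviation) or when the observed mean is zero; a production version would need to handle those cases explicitly.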
2.9.3. Percent Bias (PBIAS)
The PBIAS indicates whether, on average, the simulated values are greater or smaller than the observed values [
72]. A negative value indicates overestimation bias, meaning that the simulated values are on average higher than the observed values, whereas a positive value indicates underestimation bias, meaning that the simulated values are on average lower than the observed values. A PBIAS of zero indicates no average bias between simulated and observed values.
$$\mathrm{PBIAS} = 100 \times \frac{\sum_{i}(o_i - s_i)}{\sum_{i} o_i} \tag{13}$$
where $o_i$ are observed values and $s_i$ are simulated values at the $i$th time step.
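The sign convention described above (negative for overestimation, positive for underestimation) corresponds to summing observed minus simulated values. A minimal sketch, with an illustrative function name:

```python
def pbias(observed, simulated):
    """Percent bias: negative when simulated values exceed observed values on average."""
    total_difference = sum(o - s for o, s in zip(observed, simulated))
    return 100.0 * total_difference / sum(observed)
```

For instance, simulated flows that are uniformly 2 units above observed flows of 10, 20, and 30 give a PBIAS of −10%, flagging overestimation.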
2.10. Post-Fire Vegetation and Soil Parameters Setup
The influences of canopy on snow interception and release, transpiration, snow accumulation and melt rate, root cohesion, and radiation attenuation control the changes in hydrologic processes in post-fire conditions. Similarly, the influences of soil infiltration on water storage, evaporation from the soil surface, and surface runoff differ from pre-fire conditions.
Lacking detailed information on post-fire changes in soil and vegetation properties for our study location, we adopted a post-fire modeling approach based on published studies. We simulated streamflow in post-fire conditions by adjusting vegetation and soil properties according to burn severity. This approach allowed us to account for the effect of burn severity on vegetation and soil, reflecting potential outcomes observed in the literature. The modifications to soil and vegetation parameters for different severities are presented in
Table 4. Model grids unaffected by wildfire were not adjusted from the calibration and had the same characteristics as the pre-fire condition. We adjusted the monthly leaf area index (LAI), fractional coverage, and snow interception efficiency for different vegetation and burn severity types. These changes to post-fire vegetation were obtained from a previous study [
48]. We modeled the effects of fire on the soil by adjusting the maximum infiltration rate. The adjustments to the infiltration rate were based on assumptions from a study of wildfire impacts on conifer forests, as our study watershed is predominantly coniferous. Regardless of vegetation species, we applied assumptions for the maximum soil infiltration rate consistent with those used by Lanini et al. [
73].
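The severity-dependent adjustment described above can be sketched as a lookup of scaling factors applied to the calibrated (pre-fire) parameters of each grid cell. The factor values below are placeholders chosen only to make the example run; they are not the values in Table 4, and the parameter names and function name are hypothetical.

```python
# Placeholder multipliers for illustration only; NOT the values reported in Table 4.
PLACEHOLDER_FACTORS = {
    "unburned": {},
    "low": {"lai": 0.75, "fractional_cover": 0.75,
            "snow_interception_eff": 0.75, "max_infiltration": 0.75},
    "moderate": {"lai": 0.5, "fractional_cover": 0.5,
                 "snow_interception_eff": 0.5, "max_infiltration": 0.5},
    "high": {"lai": 0.25, "fractional_cover": 0.25,
             "snow_interception_eff": 0.25, "max_infiltration": 0.25},
}


def post_fire_parameters(pre_fire, severity, factors=PLACEHOLDER_FACTORS):
    """Scale selected pre-fire parameters by severity; all other parameters are unchanged."""
    scale = factors[severity]
    return {name: value * scale.get(name, 1.0) for name, value in pre_fire.items()}
```

Unburned cells keep their calibrated values, which mirrors the statement that grids unaffected by wildfire were not adjusted; parameters absent from the factor table (for example porosity) pass through unchanged.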
4. Discussion
4.1. Integration of UAV-Landsat Imagery for Burn Severity Mapping
The integration of UAV-based thermal imagery with the Landsat-based dNBR index offers a unique method for burn severity mapping. Our results showed that land surface temperatures were higher in burned areas than in unburned areas. Ash deposited on the soil surface, which was used as an indicator of fire impacts, absorbs more sunlight and likely contributes to the elevated temperatures of burned areas. This finding is consistent with a reported increase in land surface temperature of 9 K one year after fire [
91], and an increase of up to 8.6 K due to high-severity fire [
92].
Importantly, we did not use land surface temperature as a spectral measurement derived from different wavelengths, nor did we analyze the influence of temperature on the wavelengths used in the spectral index. Our objective was to explore how the high spatial resolution of UAV-derived land surface temperature can reveal thermal patterns corresponding to burn severity. To accomplish this, we established a relationship between UAV-derived land surface temperature and Landsat-derived dNBR, which allowed us to define thresholds for burn severity. This relationship also allowed us to examine how thermally distinct areas, captured only through fine-resolution UAV-based observations, relate to spectral indicators of fire impact. Combining thermal and spectral observations in this way strengthens the interpretation of post-fire landscape responses.
The UAV-Landsat integration approach can be important for post-fire management, hazard mitigation, and watershed restoration planning. High-resolution UAV observations enable the fine-scale detection of burn severity, which can be scaled to satellite imagery, allowing for rapid assessment in landscapes where BAER maps or other burn severity products are unavailable or delayed. This method is applicable in similar regions and has the potential to be transferred to other ecoregions; however, transfer requires calibration of severity thresholds, as different regions may have distinct soil and vegetation characteristics. Furthermore, it requires careful consideration of the timing of UAV acquisition to minimize variability in thermal measurements. Addressing uncertainties, such as UAV sample size and topographic and environmental influences on thermal measurements and spectral indices, and calibrating burn severity based on site-specific conditions are crucial for applying the integration method to other regions and for operational use.
4.2. Streamflow Response to Burn Severity
We observed that streamflow increased in the post-fire environment compared to pre-fire conditions. The increase in flows might be explained by reduced evapotranspiration losses following the decrease in vegetation cover [
89]. With less canopy, more precipitation accumulated on the ground, and direct exposure of the snowpack to sunlight caused earlier melt, making water available earlier [
62]. In addition, the decrease in soil water infiltration allowed faster surface runoff. Because snowpack accumulation on the ground increased and the snowpack melted earlier and faster than usual as canopy cover decreased, flows were higher [
88]. The post-fire streamflow response in this study represents the first year after the wildfire; soil and vegetation parameters may differ as the time between the fire year and the year of analysis increases.
4.3. Scalability, Transferability, and Applicability
Our study demonstrated the scalability of the UAV-Landsat burn severity mapping method by generating severity thresholds from limited UAV- and satellite-based observations and applying them to the entire burned area. This process shows how a relatively small amount of localized information, derived by combining high-resolution UAV observations with satellite imagery of larger coverage, can be scaled to a larger domain, enabling burn severity mapping across large fire-impacted landscapes. Furthermore, our study demonstrated the potential of transferring burn severity derived from the integration approach to simulate post-fire watershed response in hydrologic models. However, the transferability approach should be implemented with caution, as applying it across areas with differing ecological and topographical characteristics may lead to inappropriate environmental projections. Moreover, the results highlighted the applicability of our study to real-world management and planning. For instance, our UAV-Landsat integration method provides a practical alternative for generating burn severity maps suitable for hydrologic modeling, even where burn severity maps are unavailable. The burn severity transferability framework enables scenario testing of potential wildfire impacts in unburned watersheds or within the same watershed, particularly if the burned area is small, providing valuable information for water budget management, forest restoration, and watershed recovery planning. Beyond mapping severity, the framework can inform erosion control, reforestation, and water management, and can be extended to assess wildfire impacts under projected climate change and vegetation recovery, supporting adaptive and wildfire-climate-resilient watershed management.
Our study demonstrates scalability, transferability, and applicability together, providing a practical tool for predicting watershed response to wildfire impacts.
4.4. Limitations
Our study highlights the potential of integrating UAV-based thermal imagery with satellite-derived dNBR for mapping burn severity, which could be used for post-fire hydrologic assessments even in unburned watersheds. However, limitations related to spatial access, data processing, and hydrologic modeling suggest the need for further refinement.
Land surface temperatures were collected within a fixed three-day window, without accounting for the effects of meteorological factors such as incoming solar radiation, humidity, and wind speed, or of soil composition, all of which could influence the temperature. Furthermore, we extracted land surface temperatures from 1 m × 1 m plots within 30 m × 30 m Landsat pixels by manually selecting areas that avoided canopy shadows. Although this minimized shadow effects, post-fire winds and rainfall might have transported ash from burned to unburned regions, potentially leading to a misrepresentation of temperature estimates derived from areas identified as representative of burned conditions.
Additionally, we resampled the thermal images to match the resolution of the multispectral images, because MicaSense Altum thermal images preprocessed to the spatial resolution of the multispectral images in Metashape Professional software provided reliable temperature estimates in previous studies [
32,
33]. While resampling the original pixels might influence temperature estimates, prior validation studies indicate that uncalibrated MicaSense Altum measurements have root mean squared error (RMSE) values mostly within the sensor’s thermal accuracy of ±5 °C [
36], where Niwa [
32] and Tunca et al. [
33] reported RMSE values of 0.93–6.56 °C and 4.23 °C, respectively. However, how Agisoft Metashape Professional software downscales the pixels of the UAV-based thermal imagery during processing is not made explicit.
Furthermore, we acknowledge that using percentile-based temperature thresholds has limitations. The temperature differences between plots may depend not only on burn severity but also on topography (such as slope, aspect, and elevation) and residual vegetation heterogeneity. The percentile-based approach may mix severity with background variability. Moreover, the thresholds derived in this study are site-specific, based on the local temperature distribution of 150 plots considered for analysis, and therefore may not be directly transferable to other burned areas without recalibration.
The scope of our study was further constrained by limited access to diverse soil and vegetation types, affecting our ability to generalize findings. Due to road closures and a lack of permission from the Forest Service, we were restricted to conducting the UAV flights in limited areas, primarily in areas dominated by shrub and evergreen forest. Our access was limited to areas with gravelly loam and cobbly loam, with the majority of coverage in gravelly loam. This limitation emphasizes the importance of collecting temperature data from different locations, since soil and vegetation types can affect soil temperature.
We disaggregated daily meteorological data into hourly values using observed patterns and linear interpolation. This could smooth out hourly fluctuations and may not fully capture short-term meteorological variability, which could influence hydrologic processes. However, because our analysis focuses on daily mean SWE and on daily and monthly mean streamflow, this limitation is unlikely to substantially affect the results. We calibrated the hydrologic model against observed SWE and streamflow using parameters obtained from previous studies, refining highly variable values such as minimum stomatal resistance beyond literature ranges to achieve realistic baseflow simulations. Furthermore, post-fire effects on streamflow were simulated using generalized assumptions from previous studies and changes to only a few soil and vegetation parameters, which may not accurately reflect site-specific conditions; other parameters not considered in this study might also have been altered by fire and could influence post-fire streamflow.
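The smoothing effect of interpolation-based disaggregation can be seen in a minimal sketch. The version below covers only the linear interpolation step and anchors each daily value at hour 0 of its day; both the anchoring choice and the function name are assumptions, and the observed-pattern component of the study's procedure is not represented.

```python
def interpolate_daily_to_hourly(daily_values):
    """Linearly interpolate between consecutive daily values, each anchored at hour 0."""
    hourly = []
    for today, tomorrow in zip(daily_values, daily_values[1:]):
        for hour in range(24):
            hourly.append(today + (tomorrow - today) * hour / 24.0)
    hourly.append(daily_values[-1])  # final anchor point
    return hourly
```

Because every hourly value lies on a straight line between two daily anchors, any sub-daily peak or trough in the true signal is absent from the output, which is the smoothing limitation noted above.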
4.5. Recommendations
Future studies should address the identified limitations to improve the precision and applicability of the UAV-Landsat integration approach for burn severity mapping and post-fire hydrologic modeling. Expanding UAV data acquisition across diverse vegetation types, soil textures, and topographic conditions would improve the generalization of temperature-severity relationships. Future studies should consider larger land surface areas for temperature analyses and validate land surface temperature estimates using field-based temperature measurements within the fire perimeter to quantify thermal accuracy under varying environmental conditions.
Moreover, the burn severity thresholds derived in this study should be validated against independent field measurements of fuel loss and topographic effects. Testing alternative methods that represent burn severity without relying solely on percentile-based classification could further improve classification accuracy. Using higher-accuracy sensors for image collection could enhance the robustness of temperature threshold classifications, and consequently, the reliability of burn severity classification. Increasing the number of Landsat pixels used in regression analyses could also strengthen the determination of temperature-severity thresholds. While Random Forest was used in this study for burn severity transfer modeling due to its robustness and widespread application, we recommend testing alternative statistical and machine learning methods that may provide improved accuracy. Future studies should also investigate how various individual features and combinations of features impact model performance and assess potential fitting issues.
In hydrologic modeling applications, we recommend careful consideration for future studies that aim to analyze pre-fire and post-fire dynamics at an hourly scale, where accurately capturing peak and low flows would require higher-resolution meteorological inputs. Future studies should incorporate field measurements of sensitive parameters for both pre-fire and post-fire scenarios to accurately model hydrologic processes. Moreover, post-fire streamflow was modeled in a different watershed, and therefore, we could not directly evaluate whether the UAV-Landsat-derived burn severity map performed better than the reference map in simulating streamflow in the original burned watershed. While we validated the UAV-Landsat-based burn severity map against the reference map in terms of pixel-wise burn severity classification, its effectiveness in driving post-fire streamflow remains untested. Future studies should prioritize local field-based parameterization and direct validation within the fire-affected watershed to enhance the reliability of burn severity mapping using the UAV-Landsat integration approach employed in this study, before testing the burn severity transferability approach in unburned areas.
5. Conclusions
This study advances post-fire hydrologic modeling by demonstrating that UAV-based thermal imagery, when integrated with Landsat data, can generate burn severity maps of comparable accuracy to BAER products without field calibration, thus extending hydrologic analysis capabilities to regions where field-based data are unavailable. The UAV-Landsat integration approach achieved a burn-severity classification accuracy of ~73% (Kappa of ~0.62), comparable to the BAER fire product. The transferability modeling test confirmed its applicability in neighboring watersheds, with annual average streamflow being 3.69% to 5.12% lower than that from BAER-based hydrologic simulations.
Beyond mapping burn severity, the UAV-Landsat integration method enables predictive scenario testing in partially burned or unburned watersheds, allowing managers to simulate hypothetical wildfire scenarios and evaluate the potential impacts of wildfire on snowmelt, runoff, streamflow, and water budgets. The framework supports rapid post-fire assessments, providing a scalable solution for agencies responsible for managing hydrologic risk and water budgets in remote or data-limited mountainous watersheds.
As wildfires become more frequent and intense under climate change, strengthening watershed resilience is increasingly essential. This study lays the groundwork for future research to refine and expand the methodology and to establish robust, UAV-supported strategies for protecting water resources in fire-prone regions. By coupling machine learning with distributed hydrologic modeling, this approach enables data-driven predictions of wildfire-induced watershed changes, providing actionable insights for post-fire water management, restoration, and climate-resilient planning.
Future studies should validate this integration across diverse ecosystems, incorporate near-real-time UAV and satellite data fusion, and expand the approach to include multi-temporal monitoring of vegetation recovery and hydrologic response. Coupling this framework with climate projections and real-time forecasting systems will further enhance its potential for supporting adaptive watershed management under future wildfire regimes.