Modelling, Characterizing, and Monitoring Boreal Forest Wetland Bird Habitat with RADARSAT-2 and Landsat-8 Data

: Earth observation technologies have strong potential to help map and monitor wildlife habitats. Yellow Rail, a rare wetland obligate bird species, is a species of concern in Canada and provides an interesting case study for monitoring wetland habitat with Earth observation data. Yellow Rail has highly speciﬁc habitat requirements characterized by shallowly ﬂooded graminoid vegetation, the availability of which varies seasonally and year-to-year. Polarimetric Synthetic Aperture Radar (SAR) in combination with optical data should, in theory, be a great resource for mapping and monitoring these habitats. This study evaluates the use of RADARSAT-2 data and Landsat-8 data to characterize, map, and monitor Yellow Rail habitat in a wetland area within the mineable oil sands region. Speciﬁcally, we investigate: (1) The relative importance of polarimetric SAR and Landsat-8 data for predicting Yellow Rail habitat; (2) characterization of wetland habitat with polarimetric SAR data; (3) yearly trends in available habitat; and (4) predictions of potentially suitable habitat across northeastern Alberta. Results show that polarimetric SAR using the Freeman– Durden decomposition and polarization ratios were the most important predictors when modeling the Yellow Rail habitat. These parameters also effectively characterize this habitat based on high congruence with existing descriptions of suitable habitat. Applying the prediction model across all wetland areas showed accurate predictions of occurrence (validated on ﬁeld occurrence data), and high probability habitats were constrained to very speciﬁc wetland areas. Using the RADARSAT-2 data to monitor yearly changes to Yellow Rail habitat was inconclusive, likely due to the different image acquisition times of the 2014 and 2016 images, which may have captured seasonal, rather than inter-annual, wetland dynamics. Polarimetric SAR has proved to be very useful for capturing the speciﬁc hydrology and vegetation structure of the Yellow Rail habitat, which could be a powerful technology for monitoring and conserving wetland species habitat.


Introduction
The recent proliferation of Earth observation (EO) data has facilitated tremendous growth in applications of this technology. EO data have become progressively more used in species modeling, due to easy access to data almost anywhere in the world and the ability to have repeat observations of the same area, i.e., monitoring [1]. Additionally, EO data can provide continuous predictor variables which require no interpolation or sampling biases [1]. These predictor variables are often the most useful in modern presence/absence species distribution models using techniques like Boosted Regression Trees [2]. Wetland species, in particular, can benefit from habitat/species modeling with EO because these habitats are typically remote/inaccessible (boreal wetlands especially), and they are also very dynamic in both space and time. Wetland mapping and monitoring with EO sensors, such as optical, Synthetic Aperture Radar (SAR), and Light Detection and Ranging have been well studied [3][4][5][6][7], but the application of these technologies to characterize species habitat within wetlands is the next logical step which can help habitat monitoring and conservation efforts.
The species of interest for this study is the Yellow Rail (YERA-Coturnicops noveboracensis), which is an obligate wetland species that are typically very secretive. The majority of its known habitat is in the boreal forest zone [8], and it is thought to have a small and declining global population (ca. 10,000-25,000 individuals [9]). However, the species is also listed as data deficient because of the challenges associated with sampling and mapping the habitat this species requires. This led to it being listed as a species of special concern under Canada's Species at Risk Act. Most descriptions characterize its breeding habitat as graminoid wetlands with shallow standing water, and an understory layer of senescent vegetation [10,11]. The Yellow Rail's suitable habitat is expected to be highly specific, dynamic, and sensitive to human hydrological disturbance making it an ideal test case for EO species habitat modeling/monitoring applications.
Most previous studies using EO data to study wetlands have focused on wetland mapping and monitoring, rather than directly modeling wildlife habitat. The mapping of wetland areas versus other landcover can typically be accomplished with either optical data [12][13][14] to identify wetland vegetation or with a Digital Elevation Model to identify topographically wet areas [5,15]. The seasonal and dynamic nature of wetlands is typically captured with SAR backscatter, due to its ability to see through clouds and its strong ability to detect surface water/flooded vegetation [3]. This hybrid data approach can typically achieve very high accuracies for wetland, wetland class, and landcover maps [5,7,16], but, in the end, this categorical data may not be very useful for characterizing meaningful habitat for wetland species. Using these EO sources directly to model the species' habitat may be a better method [16]. This method can directly correlate a species' habitat to EO data without the intermediate step of running a species model on top of the landcover classification models, which may compound errors. These categorical landcover classifications are often not updated regularly, making species monitoring difficult when using such approaches [1]. Further, EO data can be specifically chosen to detect habitat features of interest to a given species. In the case of YERA, we would want EO data to detect and monitor the changing hydrology and capture the specific vegetation structure rather than a coarser wetland type designation. SAR, and specifically polarimetric SAR, including both fully polarimetric and compact polarimetric configurations, have shown a strong ability to detect water/vegetation interactions and characterize wetland vegetation structure [17,18]. Certain decompositions from SAR, such as the Freeman-Durden decomposition [19], can separate SAR backscatter into useful parameters for characterizing wetland habitat: Double bounce-typically associated with flooded vegetation in wetlands; volume scattering-higher values associated with higher biomass, and rough surface-rougher surface at the scale of the wavelength. Moderate-resolution optical data should also theoretically detect graminoid vegetation, but may be limited for characterizing vegetation/water interactions. A DEM may not be of interest for YERA modeling as DEMs are typically used to distinguish wetland from upland areas, and we would expect habitat to only occur in flat wetland areas.
While species habitat modeling with EO is less common than landcover mapping with EO, there have been numerous studies that use EO for species conservation purposes. Prior studies have used optical data for species distribution models of the bird's breeding habitat [20][21][22] or used SAR/LiDAR to model specific habitat vegetation structures for bird species [23,24]. Other studies have used SAR to map and monitor Panda habitat [25], and to characterize wetland habitat in the Brazilian Pantanal [26]. Similar polarimetric SAR and bird habitat studies have been done with Black Tern habitat in the Great Lakes region [27].
In a related study to ours, [16] used Sentinel-1, Sentinel-2 (multi-spectral optical), and DEM data to predict YERA species abundance. Species habitat monitoring is potentially a very powerful and under-utilized application of this technology, especially when monitoring dynamic habitats.
This study aims to build on the work from [16,28] by investigating the importance of polarimetric SAR from RADARSAT-2 for developing more accurate predictive models of the YERA habitat. We also investigate yearly and seasonal habitat dynamics with RADARSAT-2 and Landsat-8 data to determine if we can track changes in habitat suitability in time. The objectives of this study are divided into four parts: (1) Investigate the relative importance of polarimetric SAR for predicting YERA wetland habitat, (2) characterize YERA habitat based on the EO variables (3) explore the ability of EO data to detect changes in YERA habitat year-to-year, and (4) assess the ability of polarimetric SAR and optical data to predict suitable YERA habitat across larger areas.

Study Area
The area of interest for this study is a 3480 km 2 region in northeastern Alberta, Canada ( Figure 1). This is in the Boreal Plains Ecozone/Boreal Forest Natural Region, and elevation in this area is relatively flat from 200-600 m above sea level. Vegetation in this area is mainly conifer, mixed and deciduous forests, along with large peatland complexes in low-lying areas [29]. As seen in panel (a) of Figure 1, the study area is the main mining area in the Athabasca oil sands mineable area. The area has numerous open mines with associated tailing ponds and resource roads. The current extent of wetlands in the area is located around the blue field points seen in Figure 1. According to DeLancey, Simms, Mahdianpari, Brisco, Mahoney and Kariyeva [7], 30% of the study area is wetland habitat, with most of this being treed, shrub, and graminoid fens.

Yellow Rail Abundance Data
The YERA abundance data comes from previous studies in [17,29]. This data comes from autonomous recording units during field campaigns in 2013-2019. There were 203 recording units placed within wetland sites in the study area. The presence/absence of YERA at each site was determined using automated processing and human interpretation. First, a recognizer was used to filter out potential YERA calls, then, a human interpreter would validate this call. Prior work has shown that the effective detection radius for these recording units is about 250 m [30], and therefore, these sites were buffered by 250 m. These sites were then clipped to include only wetland areas using the ABMI Wetland Inventory [7], which has been suggested as best practice in Thuiller, et al. [31].

Earth Observation Data
This study used RADARSAT-2 and Landsat-8 data from 2014 and 2016 listed in Table  1. RADARSAT-2 Fine Quad-Pol data was provided by the Canadian Centre for Mapping and Earth Observation. The 2016 data consisted of two ascending orbit images, while the 2014 data was one ascending orbit image. The images were processed in PCI Geomatica with the following steps: 9 × 9 boxcar filter, matrix symmetrization, Freeman-Durden decomposition [19], and finally ortho-rectification and terrain correction using the ALOS digital elevation model. Parameters extracted from this data for YERA habitat modeling were the three components of the Freeman-Durden decomposition (volume scattering, double bounce, and rough surface; see [17] for a more detailed physical description of these parameters), Vertical-Vertical (VV), Horizontal-Horizontal (HH), and Horizontal-Vertical (HV) backscatter, cross-pol ratio (HH/VV), and co-pol ratio (HH/HV) (see Table 2). The Freeman-Durden decomposition is a technique that uses the phase and amplitude part of the SAR wave to analyze the scattering characteristics of polarimetric SAR data. The volume scattering component typically represents random scattering of the SAR wave off targets like tree canopies, double bounce represents a SAR wave which bounces off two reflectors and back to the sensor (like water then vegetation), and rough surface represents scattering off of rough surfaces at the scale of the wavelength. The other parameters used from SAR are the amplitude of the SAR return in different send and receive configurations (HH, HV, VV) or ratios between their parameters (cross-pol ratio = HH/VV).
Landsat-8surface reflectance data were acquired and processed in Google Earth Engine [32]. Cloud free Landsat-8 images were chosen as close to the RADARSAT-2 dates as possible (see Table 1). From the Landsat-8 images, the Normalized Difference Vegetation Index (NDVI), Normalized Difference Water Index (NDWI), and Normalized Burn Ratio (NBR) were calculated. Parameters extracted and downloaded from this data for YERA habitat modeling were Bands 2-7, NDVI, NDWI, and NBR (see Table 2).

Habitat Modeling, Prediction, and Monitoring
The YERA abundance data were reduced to presence/absence for modeling by collapsing all 2013-2019 records. If YERA were detected at all during the 2013-2019 period, the site was considered occupied (presence), and if they were never detected, the site was considered absent. In the end, the presence/absence data were buffered polygons clipped to wetland boundaries with an associated presence/absence value. For each of these 208 polygons, the mean value for each of the RADARSAT-2 and Landsat-8 parameters were extracted. With these data, a Boosted Regression Tree model [33] was run with similar tuning parameters as in McLeod et al. (2021) [16]: Tree complexity = 5, learning rate = 0.03, and bag fraction of 0.5. Boosted Regression Trees is an ensemble modeling technique that uses multiple, averaged classification trees using boosting. These models are typically good for species modeling because they try to reduce overfitting and can model nonlinear responses and complex interactions. Outputs of this model were used to produce model statistics, variable importance charts (importance of RADARSAT-2 versus Landsat-8 for characterizing habitat), and response curves (specific YERA habitat characterization). The model was also run two additional times with RADARSAT-2 inputs only and Landsat-8 inputs only to further assess the relative importance of SAR versus optical. For each model variation, the Area Under the receiver operating characteristic curve (AUC) model statistic was compared. The receiver operating characteristic curve is a plot of the true positive rate and the false positive rate at different thresholds, and thus, the AUC is the calculated area of that curve. AUCs close to 1 are considered great classifiers, while anything near 0.5 is considered poor.
To predict YERA habitat across the study area, a hexagonal grid was generated with individual hexagons approximating the area of a 250 m radius circle. These hexagons were then clipped to wetland areas, and polygons were removed under 17,671 m 2 (5 × 5 Landsat-8 pixels). The 208 sites were split into 70% training and 30% test data to assess the prediction accuracy. Input variables into the Boosted Regression Tree model were reduced to exclude redundant inputs. The final list of input variables were: Volume scattering, double bounce, rough surface, cross-polarization ratio, and co-polarization ratio from RADARSAT-2 and NDVI, NDWI, B3, B5, B6, and B2 from Landsat-8. These variables were then used to predict the probability of YERA occupancy in the hexagonal grid. We assessed the accuracy of this prediction by predicting the probability of occurrence in the 30% test data. To assess the ability of EO data to monitor changes in habitat sites were split into three categories: (1) Loss of YERA occupancy from 2014 to 2016, (2) no change in occupancy, and (3) gain in occupancy. From this, the yearly trend in the EO inputs was charted to see if loss of occupancy was associated with any loss/gain in the EO indicator.

Results
An initial exploratory analysis of the inputs in relation to occupancy of YERA showed noticeable differences between presence/absence sites in some of the inputs (Figure 2). The rough surface component of the Freeman-Durden decomposition showed a large difference between occupied and unoccupied sites, with YERA preferring lower surface roughness (Figure 2     Response of predicted YERA occupancy probability to all EO inputs was generated to assess ideal habitat. Figure 4 shows some of the more interesting/important response curves. CrossPolRatio (Figure 4b) and DoubleBounce_FD (Figure 4c) were generally positively correlated with occupancy probability. RoughSurface_FD (Figure 4a) was strongly negatively associated with occupancy. VolumeScattering_FD (Figure 4d) showed a more complex relationship with occupancy with probability peaking at low to moderate amounts of volume scattering. Landsat-8 inputs, such as B3 and B6 (Figure 4e,f), showed higher predicted probability associated with higher reflectance values.
The two-year trend data can be seen in Figure 5. All site types (loss, gain, no change) had the same positive/negative change between the years, but the slope of the change did differ between site types. Sites with a loss of YERA occupancy saw a stronger dip in the cross-pol ratio than sites with no change or gain. Double bounce values increased less for sites that gained YERA while sites that lost YERA were associated with lower NDVI slope. Predicted YERA occupancy can be seen in Figure 6. Most wetland areas were found to not be suitable YERA habitat (darker colors). The majority of high probability habitat is localized to a few small areas in the fen immediately west and east of McClelland Lake. Other prime habitat areas were small hexagons near the edges of lakes or streams. All treed wetland areas were shown to have near-zero YERA occupancy probability. Overall, this predicts YERA occupancy with an 82% accuracy when compared to the 30% test field sites. This prediction had a near-equal number of commission and omission errors, indicating a relatively useful prediction.  Hexagonal YERA occupancy probability for the study area. Yellows represent high YERA occupancy probability, and darker colors represent low probabilities. Insets on the right zoom in on areas with suitable YERA habitat (center and graminoid areas of wetland habitat) and poor habitat (edge/treed wetland).

Relative Importance of EO Inputs
Figures 2 and 3 revealed that polarimetric SAR is much more important for predicting YERA wetland habitat than Landsat-8 optical data. Polarization ratios or decomposition parameters from SAR were shown to have a much larger numerical difference between occupied and unoccupied sites and were much more important in the model. The prediction models were shown to be much more powerful with RADARSAT-2 data included. In fact, it appears that Landsat-8 variables did not improve model performance as a RADARSAT-2-only model generates the same AUC as the RADARSAT-2 + Landsat-8 models. This highlights a limitation of optical data in that it does not penetrate to the vegetation/water surface like SAR. Polarimetric SAR can detect important vegetation flooding and hydrology, due to the 180-degree phase shift of a double bounce SAR return [3,18]. This information is expressed in the cross-pol ratio and double bounce parameters. In addition to the relative importance of RADARSAT-2 data for habitat modeling, the parameters from certain SAR decompositions, such as the Freeman-Durden, can be intuitively related back to on-theground habitat characterization, which can be important for conservation efforts. Figure 4 can facilitate the interpretation of YERA habitat from EO indices. The relationship between YERA occupancy probability and cross-pol ratio/double bounce shows YERA prefer habitat with flooded vegetation (i.e., there is water/veg interaction in the SAR return). Figure 4a shows a preference for smooth habitat, both vegetation, and topography-wise. Panel (d) of Figure 4 shows an interesting result with YERA preferring habitat with low to moderate amounts of volume scattering. This is most likely associated with graminoid habitat interspersed with tree and shrub habitat that would register high values of volume scattering and open water/barren areas, which would register very low volume scattering values. Landsat-8 index responses to occupancy are harder to interpret, but most bands show higher occupancy at high reflectance. This could be an association with more senescent vegetation (i.e., green vegetation is darker with low reflectance). Figure 7 shows an example of one of these YERA-occupied (verified in the field) habitats which demonstrated one of the highest predicted probability of occupancy hexagons seen in Figure 6. All the wetland habitat associations detected by the EO data are visible: Flooded vegetation, low to moderate biomass, smooth surface, and more senescent vegetation. These interpretations of EO data seem to be consistent with existing descriptions of YERA habitat based on on-the-ground data [10,11], which is a positive finding as EO data does not always match with field observations.

Monitoring Trends
The yearly trends in YERA habitat from 2014 to 2016 show inconclusive results. All sites, no matter whether the change in occupancy was negative or positive, had the same slope direction. This may indicate that the seasonal changes from August to June produce a stronger signal than the changes from 2014 to 2016. The relative change from 2014 to 2016 may tell us something about changing YERA habitat between years. For example, sites that lost YERA from 2014 to 2016 had a larger drop in cross-pol ratio than sites with no change or gain. This may indicate that sites that lost YERA dried out, corresponding with a lower flooded vegetation signal in 2016; however, this data does not provide conclusive evidence. It is likely that future studies monitoring yearly changes in wetland habitat will need to have consistent image acquisitions dates from year-to-year to capture wetlands in similar seasonal states. Optical data, such as Sentinel-2, may can accomplish this with a 5 day repeat imagery, although, consistent cloud or haze may make this difficult. RADARSAT Constellation Mission (RCM) data and future SAR missions may be a solution as it can provide a rich time series data stack that allows for more control on the observation of seasonal wetland dynamics. Ideally, rather than monitoring indices, multi-year/dynamic occupancy models [34] could be run for YERA or other species based on yearly EO data. This could better capture dynamic wetland habitats and estimate the extinction and colonization rates of many sites. With the increasing availability of high temporal resolution optical and SAR EO data, this will likely be achievable with the right multi-year field data.

Habitat Prediction
The prediction of YERA occupancy probability showed good results, with most sites accurately predicting YERA habitat (82%). This prediction came with a 75% true-positive rate and an 86% true-negative rate. We believe this model/similar models should be generalizable to larger areas in the Boreal regions of Canada/North America, as seen in McLeod et al. (2021) [16]. Figure 6 shows that most wetland areas are not suitable habitat highlighting that YERA have very specific habitat needs. All treed wetland areas show near-zero occupancy probability, and most high probability areas were constrained to graminoid wetland areas with a flooded vegetation signature. The prediction results may be biased towards larger YERA habitat patches, due to the large size of the prediction hexagons. Many hexagons can have both treed and graminoid habitat, and therefore, the SAR response for the whole hexagon may not show ideal habitat even though much of the hexagon could be good habitat. A more sensitive model (i.e., fewer false negatives with more false positives) may be more appropriate in many cases for species modeling, especially in areas with development like this study. In the end, the sensitivity can be altered by end-users of these models to shift towards a more desired true positive rate.
Ideally, this prediction map could be updated yearly with new input RADARSAT-2 and Landsat-8 data every year to monitor seasonal changes in the YERA wetland habitat. This study, unfortunately, was data-limited for RADARSAT-2 data (only two available images during the growing season) and could not accomplish yearly monitoring, but moving forward, the RCM data can provide similar data with a dense time series for most areas in Canada.

Conclusions
The application of polarimetric RADARSAT-2 data proved to be very powerful for characterizing and predicting YERA habitat. Although optical data from Landsat-8 is likely useful for these applications, it is somewhat limited for detecting flooded vegetation and capturing specific vegetation structures that may be important for wetland-associated wildlife species. Parameters from the RADARSAT-2 Freeman-Durden decomposition provided somewhat intuitive interpretations of YERA habitat, which did match with onthe-ground expected habitat (flooded vegetation, with graminoid vegetation, and more senescent vegetation). Predicting this habitat across a larger area was found to be successful as high probability YERA habitat was constrained to wetland areas corresponding closely to existing descriptions of habitat from the literature (i.e., graminoid, shallow water, senescent vegetation). Results from 2014-2016 habitat trends proved to be somewhat inconclusive. This may be because the magnitude of seasonal change in wetlands (from June to Aug) may have been greater than the magnitude of year-to-year change. This highlights the need for more temporally consistent data with a better temporal resolution for wetland habitat monitoring. Fortunately, this data is now becoming available with the RADARSAT Constellation Mission, which should provide consistent polarimetric SAR data for capturing intra-and inter-year wetland variability. This data can then be a powerful tool for wetland habitat monitoring/conservation efforts. Funding: This work was funded by the Alberta Biodiversity Monitoring Institute (ABMI). Funding in support of this work was received from Alberta Environment and Parks. RADARSAT-2 data was kindly provided by the Canadian Centre for Mapping and Earth Observation.