Applying Machine Learning and Time-Series Analysis on Sentinel-1A SAR/InSAR for Characterizing Arctic Tundra Hydro-Ecological Conditions

Synthetic aperture radar (SAR) is a widely used tool for Earth observation activities. It is particularly effective during times of persistent cloud cover, low light conditions, or where in situ measurements are challenging. The intensity measured by a polarimetric SAR has proven effective for characterizing Arctic tundra landscapes due to the unique backscattering signatures associated with different cover types. However, recently, there has been increased interest in exploiting novel interferometric SAR (InSAR) techniques that rely on both the amplitude and absolute phase of a pair of acquisitions to produce coherence measurements, although the simultaneous use of both intensity and interferometric coherence in Arctic tundra image classification has not been widely tested. In this study, a time series of dual-polarimetric (VV, VH) Sentinel-1 SAR/InSAR data collected over one growing season, in addition to a digital elevation model (DEM), was used to characterize an Arctic tundra study site spanning a hydrologically dynamic coastal delta, open tundra, and high topographic relief from mountainous terrain. SAR intensity and coherence patterns based on repeat-pass interferometry were analyzed in terms of ecological structure (i.e., graminoid, or woody) and hydrology (i.e., wet, or dry) using machine learning methods. Six hydro-ecological cover types were delineated using time-series statistical descriptors (i.e., mean, standard deviation, etc.) as model inputs. Model evaluations indicated SAR intensity to have better predictive power than coherence, especially for wet landcover classes due to temporal decorrelation. However, accuracies improved when both intensity and coherence were used, highlighting the complementarity of these two measures. Combining time-series SAR/InSAR data with terrain derivatives resulted in the highest per-class F1 score values, ranging from 0.682 to 0.955. 
The developed methodology is independent of atmospheric conditions (i.e., cloud cover or sunlight) as it does not rely on optical information, and thus can be regularly updated over forthcoming seasons or annually to support ecosystem monitoring.


Introduction
The Arctic tundra biome is among the most vulnerable landscapes on Earth, undergoing dramatic changes to vegetation, water, and soil surface properties over recent decades. These widespread physical ecosystem changes are driven by rising concentrations of greenhouse gases [1], resulting in fundamental consequences to wildlife [2] and human populations [3]. Despite being one of the coldest biomes on Earth, this water-rich region is warming twice as fast as the global average, a phenomenon known as arctic amplification [4]. Accelerated warming of the climate is having profound impacts on [...]
Remote Sens. 2022, 14, x FOR PEER REVIEW 3 of 26
Despite the promising results and diverse applications of InSAR, there is limited research on this technique's ability to characterize Arctic tundra ecosystems. Most research employing SAR for mapping high-latitude landcovers has used only backscatter intensity [18,20,21] or decomposition products [15][16][17][19] from a fully polarimetric SAR. Thus, the primary aim of this study was to fill this notable gap in the literature by investigating the potential of both dual-polarimetric SAR intensity and coherence products for the characterization of highly transient Arctic tundra systems. In particular, our study's objectives were:
1. To analyze the temporal signatures of intensity and coherence measurements from Sentinel-1A C-band SAR data in relation to environmental conditions, thus providing insight on their utility for landcover characterization;
2. To develop a machine learning methodology capable of identifying the hydro-ecological state (e.g., wet or dry, and general vegetation structure) of Arctic tundra landcovers using a time series of SAR/InSAR data and terrain metrics;
3. To provide recommendations on the efficacy of each input data source for the development of baseline landcover data.

Study Area
The study area for this mapping experiment was the Mackenzie Delta and surrounding region (Figure 1), as this area represents a variety of Arctic tundra landcovers. The Mackenzie Delta is a post-glacial low-lying alluvial plain located in Canada's western Arctic, in the Northwest Territories. Inuvik and Aklavik are the principal settlements in the region. For reference, located 50 km north of Inuvik is the Trail Valley Creek Research Station (68°44′25″N 133°29′36″W), which has been a hub for vegetation, permafrost, snow, and other land and ecosystem change research [47][48][49]. The delta is part of the Mackenzie River Basin (MRB), which is the second largest river basin in North America, occupying 20% of Canada's landmass. This area is mainly drained by the Mackenzie River, Canada's longest river, which flows northwest from Great Slave Lake through the delta and into the Beaufort Sea, Arctic Ocean. It is also the largest riverine source of organic carbon and sediment to the Arctic Ocean [50]. Recent studies have shown an increase in discharge in the Mackenzie River, suggesting a northern response to changing climatic conditions [51].
The Mackenzie Delta is an incredibly productive, sensitive, and dynamic ecosystem. With an area of 13,000 km², it is the largest arctic delta in North America and the world's second largest [52]. The delta provides important habitat for mammals, fish, and migratory birds, which is recognized through the establishment of the Kendall Island Bird Sanctuary. Surrounding the delta to the east are dry, hilly uplands of the Tuktoyaktuk Coastlands consisting of various permafrost features such as polygonal terrain, ice wedges, and pingos. To the west are the Richardson Mountains, part of the northernmost ranges of the Cordillera, which parallel the boundary between the Northwest Territories and the Yukon.
The Mackenzie Delta is considered part of the discontinuous permafrost zone largely due to shifting river channels and the presence of thousands of lakes and wetlands, whereas the surrounding tundra uplands contain ice-rich continuous permafrost [53]. The dynamic hydrology of the delta has been shown to result in 95% surface water coverage at flood peak [54], making it an excellent area to assess the efficacy of Earth observation satellite data for hydrological mapping and monitoring. Moreover, peak flooding often occurs following winter ice break-up and during the summer after heavy rainfalls [55].

Vegetation of the Mackenzie Delta and Hydro-Ecological Classes of Interest
The Mackenzie Delta transitions from boreal forest in the south to low-shrub tundra in the north, a result of the region's climatic gradient and traversing of the treeline. The northern tundra of the delta is dominated by sedges and dwarf shrubs, with specific successional communities influenced by flooding and sedimentation processes [56]. This includes hydrophilic graminoids in poorly drained areas that transition from open water areas, such as sedges (Carex aquatilis) and emergent horsetail (Equisetum). Willow (Salix spp.) and alder (Alnus crispa) species are also very common, found along frequently flooded lakeshores and levees. Dwarf to low ericaceous shrubs commonly grow on the drier uplands. The central and southern parts of the delta more commonly contain dry white spruce (Picea glauca) forests, tall shrub communities, and open canopy peatlands with stunted woody vegetation and wet organic soils (Sphagnum spp.).
For this experiment, we identified six main semantic classes of interest found in the study area, defined based on dominant vegetation composition and/or hydrological properties. Classes were defined with consideration of the major influences on SAR backscatter, including physical vegetation structure and moisture. Classes included open water, wet graminoid, wet woody, dry woody, tundra, and mountain/unvegetated (Figure 2). Open water areas included lakes, ponds, and linear features such as streams and channels, and could contain aquatic plants (e.g., Nuphar spp.). Wet graminoid areas were poorly drained areas that experienced temporally dynamic flooding and drawdown processes and contained sedges and rushes with little presence of shrubs or trees (<20%). The wet woody class comprised poorly to imperfectly drained areas with wet soils, often visible ponding, and the presence of woody vegetation (>20%). The hydrology of wet woody areas was less dynamic than that of open water and wet graminoid areas, and these areas contained willows, alder, and sometimes an open canopy coverage of standing dead or live spruce species. Woody peatlands were included in this class. Dry woody areas were vegetated uplands with well drained and dry soils, and contained either a very dense coverage of ericaceous shrubs or thick needleleaf or deciduous trees. Tundra areas were elevated regions outside of the delta that were well drained, having continuous permafrost, and mostly contained a coverage of grasses, lichens, and mosses with relatively small areas of trees and shrubs [57]. Mountain/unvegetated areas contained exposed rock or soil with little to no vegetation.

Reference Data
Spatially referenced training and testing polygons were collected and validated using high-resolution multi-spectral WorldView-2 and -3 satellite imagery (Figure 3). Two scenes were acquired for this task, one located in the low-shrub northern region of the delta (imaged 9 July 2020), and a second in the middle-south region of the delta covering a range of ecotypes, including woody wetlands, exposed mountains, and sparsely vegetated tundra (imaged on 17 August 2020). Both scenes were acquired as 11-bit GeoTIFF raster files with a ground sample distance (GSD) of 1.84 m. Spectral bands spanned the visible to near infrared regions of the electromagnetic spectrum, and included coastal blue (400-450 nanometers; nm), blue (450-510 nm), green (510-580 nm), yellow (585-625 nm), red (630-690 nm), red edge (705-745 nm), near infrared-1 (770-895 nm), and near infrared-2 (860-1040 nm) bands. To assist in the photointerpretation process, a normalized difference vegetation index (NDVI [58]) layer was derived.
While the high-resolution imagery and NDVI layer were the primary datasets used for digitizing, topographic information from a digital elevation model (DEM) also supported the interpretation process (e.g., for identifying depressions and low-lying areas). DEM characteristics are described later in Section 2.10. Both WorldView-2 and -3 scenes were fully digitized, resulting in end-to-end reference polygons. Photointerpretation was completed using a consistent scale range of 1:2500 to 1:8000. The collection of polygons resulted in a contiguous area, and each polygon represented a relatively homogeneous landcover type. The final database of reference polygons was split 50% for model training and 50% for model testing.

Sentinel-1 SAR Imagery
SAR imagery used in this study was from the European Space Agency's open-access and polar-orbiting Sentinel-1A satellite. Although Sentinel-1A and 1B satellites are identical, Sentinel-1A was chosen because of its longer historical archive, which offers the potential for greater year-to-year ecosystem monitoring beyond this study. Six Sentinel-1A C-band SAR scenes acquired in the Interferometric Wide (IW) swath mode were downloaded from the Alaska Satellite Facility's distributed archive center (https://search.asf.alaska.edu/, accessed on 24 January 2022). Level-1 Single Look Complex (SLC) products were used rather than Ground Range Detected (GRD) products because SLC retains both amplitude and phase information necessary for InSAR analysis. The IW swath mode collects data using the Terrain Observation with Progressive Scans SAR (TOPSAR) acquisition method. This method results in each IW SLC product containing one image per sub-swath (three total), per polarization (i.e., VV and VH), for a total of six images. Each sub-swath then contains a series of nine bursts. Together, these total a 250 km swath. We used all three sub-swaths (IW1, IW2, and IW3) and all nine bursts available in each IW SLC scene for our analysis.
Multi-temporal analysis was performed on the time series of six Sentinel-1A scenes spanning July 2020 to August 2020. Sentinel-1 has a consistent and high repeat imaging schedule; however, we only used scenes acquired across the approximate short 2020 growing season (i.e., the period during which weather conditions are conducive to plant growth) of the region; winter and fall scenes were excluded due to the sensitivity of SAR measurements to the dielectric properties of snow, ice, and surficial geocryological characteristics [59]. For example, previous research has found SAR scattering mechanisms to vary greatly depending on season and the freeze/thaw state of the ground surface in permafrost regions [60]. Our study also used both the VV and VH polarizations for analysis. While the co-polarized HH channel from a SAR sensor has been shown to be optimal for interferometric coherence and hydrological applications, VV can be considered the next best polarization [61]. Further, while many studies assessing InSAR for hydrological applications use only one polarization (e.g., HH or VV), we also explored including the cross-polarized VH channel. This is because cross-polarized backscatter is known to be sensitive to vegetation canopy volume [16].

SAR Backscatter
A fully polarimetric SAR sensor acquires data using both horizontal (H) and vertical (V) polarizations and can be represented by a 2 × 2 Sinclair scattering matrix (S):
$$S = \begin{bmatrix} S_{HH} & S_{HV} \\ S_{VH} & S_{VV} \end{bmatrix} \tag{1}$$
where $S_{HH}$, $S_{HV}$, $S_{VH}$, and $S_{VV}$ are complex backscattering coefficients for different polarimetric combinations. However, Sentinel-1 is a dual-polarization SAR sensor that collects a fraction (precisely half of the scattering matrix components) of the total polarimetric information and thus S must be modified to the following:
$$S = \begin{bmatrix} S_{VV} & S_{VH} \end{bmatrix} \tag{2}$$
Moreover, it is known that S inadequately represents the scattering characteristics of radar targets [62]. Instead, the 2 × 2 covariance matrix ($C_2$) can be used to represent each SAR pixel at each point in time and can be represented by the following:
$$C_2 = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix} = \begin{bmatrix} \langle |S_{VV}|^2 \rangle & \langle S_{VV} S_{VH}^{*} \rangle \\ \langle S_{VH} S_{VV}^{*} \rangle & \langle |S_{VH}|^2 \rangle \end{bmatrix} \tag{3}$$
where * is the complex conjugate operation and $\langle \cdot \rangle$ denotes spatial averaging. The members of $C_2$ are the second-order scattering information produced from the spatial averaging of the scattering vector $k = [S_{VV}, S_{VH}]^T$ found in (3), where superscript T indicates the matrix transpose. It can be seen from (3) that the diagonal elements of $C_2$ are real and the off-diagonal elements are complex. Thus, $C_{11}$, $C_{12}$, and $C_{22}$ contain all the necessary backscattering information about $C_2$, as $C_2$ is a Hermitian (conjugate-symmetric) matrix.
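As a minimal illustration of how the three independent covariance elements could be formed, the sketch below averages products of complex samples over a small window. The function name `covariance_elements` and the sample values are hypothetical and serve only to illustrate the averaging; they are not the processing chain used in this study.

```python
def covariance_elements(s_vv, s_vh):
    """Return (C11, C12, C22) averaged over a window of complex samples.

    s_vv, s_vh: equal-length sequences of complex VV and VH samples.
    """
    n = len(s_vv)
    # Diagonal terms are mean powers and are therefore real-valued.
    c11 = sum(abs(a) ** 2 for a in s_vv) / n
    c22 = sum(abs(b) ** 2 for b in s_vh) / n
    # The off-diagonal term is the mean of S_VV times the conjugate of S_VH.
    c12 = sum(a * b.conjugate() for a, b in zip(s_vv, s_vh)) / n
    return c11, c12, c22


# Hypothetical four-pixel window of complex samples (illustrative values only).
s_vv = [1 + 1j, 2 + 0j, 0 + 1j, 1 - 1j]
s_vh = [0.5 + 0j, 0 + 0.5j, 0.5 + 0.5j, 0 - 0.5j]
c11, c12, c22 = covariance_elements(s_vv, s_vh)
```

The remaining element C21 is the complex conjugate of C12, so it does not need to be stored separately.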

Interferometric Coherence
The interferometric coherence (also called the complex correlation coefficient) is the normalized complex correlation between two SAR images acquired at different times. It is a measurement of the quality of the interferometric phase, and its magnitude can be expressed as:
$$\gamma = \frac{\left| E\left[ z_1 z_2^{*} \right] \right|}{\sqrt{E\left[ |z_1|^2 \right] E\left[ |z_2|^2 \right]}} \tag{4}$$
where $z_1$ and $z_2$ are two complex and co-registered SAR images, E is the expectation operator, and * is the conjugate operation [63]. The expectation operator in practice is approximated using a sampled average of pixels within a given window [64]. This is frequently referred to as multi-looking. When doing so, Equation (4) then becomes:
$$\gamma = \frac{\left| \sum_{i=1}^{N} z_{1,i} z_{2,i}^{*} \right|}{\sqrt{\sum_{i=1}^{N} |z_{1,i}|^2 \sum_{i=1}^{N} |z_{2,i}|^2}} \tag{5}$$
where N is the number of pixels in the window. γ has a range of values of 0.0 to 1.0 and is a measure of correlation between $z_1$ and $z_2$. It is a fundamental source of information used to exploit SAR interferograms and to assess their quality. Low values of γ indicate decorrelation between $z_1$ and $z_2$, whereas high values indicate image correlation. There are three main factors that cause decorrelation, which are thermal noise, spatial baseline decorrelation, and temporal decorrelation:
$$\gamma = \gamma_{thermal} \cdot \gamma_{spatial} \cdot \gamma_{temporal} \tag{6}$$
where $\gamma_{thermal}$ is associated with the SAR system noise, $\gamma_{spatial}$ is associated with the platform's positioning during image acquisitions, and $\gamma_{temporal}$ is caused by changes in feature scattering between the two SAR acquisitions [43]. Changes to ground ecological or hydrological conditions (e.g., vegetation phenology, soil moisture, and water level) have been found to decrease γ over natural environments, such as wetlands. Estimation of γ is difficult when its value is low, which is indicative of a poor interferogram.
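The windowed estimate described above can be sketched as follows, assuming the pixels of one estimation window have already been extracted from two co-registered complex images. The function name `coherence` and the sample values are illustrative assumptions rather than part of the study's SNAP workflow.

```python
import math


def coherence(z1, z2):
    """Sample coherence magnitude for one window of co-registered complex pixels."""
    # Numerator: magnitude of the summed cross-products z1 * conj(z2).
    cross = sum(a * b.conjugate() for a, b in zip(z1, z2))
    # Denominator: square root of the product of the summed powers.
    power = sum(abs(a) ** 2 for a in z1) * sum(abs(b) ** 2 for b in z2)
    if power == 0:
        return 0.0  # no signal in at least one image; coherence is undefined
    return abs(cross) / math.sqrt(power)


# Hypothetical windows: a constant phase shift leaves the scene fully coherent,
# whereas sign flips between acquisitions cancel in the sum.
z1 = [1 + 0j, 0 + 1j, -1 + 0j]
shifted = [z * 1j for z in z1]
coherent = coherence(z1, shifted)
decorrelated = coherence([1 + 0j, 1 + 0j], [1 + 0j, -1 + 0j])
```

A pure phase offset between the two acquisitions does not reduce the estimate, which is why coherence responds to changes in the scatterers themselves rather than to a uniform path delay.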

Sentinel-1 Image Processing
Backscatter intensity was first derived from each SAR scene using the Sentinel Application Platform (SNAP) toolbox [65]. Steps for deriving backscatter included thermal noise removal, radiometric calibration to sigma-nought (σ⁰), TOPSAR deburst, multi-looking, and geometric terrain correction. To obtain the backscattering coefficient (or normalized radar cross section) expressed in decibels, the following equation was used:
$$\sigma^0_{dB} = 10 \log_{10}(DN) \tag{7}$$
where DN denotes the pixel values of the calibrated Sentinel-1 scene. Multi-looking was applied using a window of 4 × 1 in range and azimuth and terrain correction was completed using the ESA's Copernicus GLO-30 DEM (1.0 arcseconds). Backscatter images were exported at 15 m spatial resolution.
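The decibel conversion can be illustrated with a short sketch, under the assumption that the input is a calibrated, linear-scale backscatter value. The function name `to_decibels` and the handling of non-positive values are choices made for this example only.

```python
import math


def to_decibels(sigma0_linear):
    """Convert a linear-scale backscatter value to decibels."""
    # The logarithm is undefined for zero or negative values (e.g., no-data),
    # so those are returned as NaN in this sketch.
    if sigma0_linear <= 0:
        return float("nan")
    return 10.0 * math.log10(sigma0_linear)


# Illustrative values: a tenfold drop in linear power is a 10 dB decrease.
strong = to_decibels(1.0)
weak = to_decibels(0.1)
```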
Coherence was computed following [66]. Sub-swaths (i.e., IW1, IW2, and IW3) for each Sentinel-1 scene were processed separately, which was performed by splitting (i.e., selecting) the scenes using the S-1 TOPS Split operator in SNAP. Precise orbit information was then applied, and the image pairs closest in acquisition date were co-registered; the earlier image was always designated as the 'master' image, and the later the 'slave'. This produced five co-registered pairs, each with a temporal baseline of 12 days. Co-registration was completed based on satellite orbits along with information from the Copernicus GLO-30 DEM. The quality of the co-registration process was also increased by applying range and azimuth shift corrections using the Enhanced Spectral Diversity operator in SNAP. Interferogram processing for all co-registered pairs was completed using a coherence window of 10 × 3 pixels in the range and azimuth. The resulting coherence products had their burst seamlines corrected and then sub-swaths were merged with the S-1 TOPS Merge operator. Lastly, merged coherence products were multi-looked, terrain corrected, and exported at 15 m spatial resolution to match the intensity products (Figure 4).
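The pairing rule described above (each scene paired with the next acquisition, earlier image first) can be sketched as follows. The acquisition dates are hypothetical: they were chosen only to give six scenes at a 12-day spacing that include the 25 July to 6 August pair discussed later, and should not be read as the actual acquisition list.

```python
from datetime import date, timedelta


def consecutive_pairs(acquisitions):
    """Pair each acquisition with the next one in time (earlier image first)."""
    ordered = sorted(acquisitions)
    return [(ordered[i], ordered[i + 1]) for i in range(len(ordered) - 1)]


# Hypothetical acquisition dates: six scenes at a 12-day repeat interval.
first_scene = date(2020, 7, 1)
scenes = [first_scene + timedelta(days=12 * i) for i in range(6)]
pairs = consecutive_pairs(scenes)
# Temporal baseline of each pair, in days.
baselines = [(later - earlier).days for earlier, later in pairs]
```

Six scenes yield five consecutive pairs, which matches the number of coherence products described in the text.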

Time-Series Statistical Descriptors
After preprocessing the Sentinel-1 images, we calculated several statistical metrics for each pixel in the time-series stack, for each polarization (VV and VH), and for both coherence and intensity. The statistical descriptors chosen largely followed earlier studies that performed similar analysis [30,31,67]. Multi-temporal, pixel-based statistical descriptors included mean, standard deviation, coefficient of variation, median, maximum, and minimum. These descriptors were derived using the raster package of R [68] and were used as machine learning model inputs.
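For a single pixel, the descriptors listed above could be computed as in the following sketch. It uses Python's standard library rather than the R raster package used in this study, and it assumes the sample (n − 1) standard deviation; the function name `descriptors` and the example coherence values are illustrative only.

```python
import statistics


def descriptors(series):
    """Time-series descriptors for one pixel (one polarization, one product)."""
    mean = statistics.fmean(series)
    sd = statistics.stdev(series)  # sample standard deviation (n - 1), an assumption
    return {
        "mean": mean,
        "sd": sd,
        # The coefficient of variation is undefined when the mean is zero.
        "cv": sd / mean if mean != 0 else float("nan"),
        "median": statistics.median(series),
        "max": max(series),
        "min": min(series),
    }


# Hypothetical coherence values for one pixel across five image pairs.
pixel = descriptors([0.2, 0.4, 0.6, 0.4, 0.4])
```

Applying the same function to every pixel, polarization, and product (intensity and coherence) gives the set of layers that would be stacked as model inputs.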

Meteorological and Hydrometric Environmental Data
Historical environmental measurements were obtained across our study's summer 2020 observational period. This included wind speed, precipitation, water level height, and discharge (Figure 5). All measurements were acquired from the Government of Canada's Meteorological Service of Canada [69]. Hydrometric data (Figure 5a) were obtained from the Mackenzie River weather station and meteorological data (Figure 5b) from the Inuvik weather station. These environmental datasets aided in the interpretation of SAR intensity and coherence products and machine learning classification results.

Topographic Data
To aid in the classification of hydro-ecological conditions within our study area, a DEM was acquired. This was the Copernicus GLO-30 DEM [70], a Digital Surface Model (DSM) at 30 m resolution representing all features on the Earth's surface. Several topographic metrics were derived from the DEM, as terrain morphology is a major influence on water flow and pooling across landscapes. In addition to elevation, these metrics included slope, the wetness index [71], and Height Above Nearest Drainage (HAND [72]). All terrain metrics were processed using Python scripting and with the WhiteboxTools geospatial data analysis platform [73]. Terrain metrics were resampled to 15 m and co-registered to the SAR imagery.
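To indicate why a wetness index is informative for water pooling, the sketch below assumes the common topographic wetness index form, ln(a / tan β), where a is the specific catchment area and β is the local slope. This form, the function name `wetness_index`, and the input values are assumptions for illustration; the sketch does not reproduce the WhiteboxTools implementation used in this study.

```python
import math


def wetness_index(specific_catchment_area, slope_degrees):
    """Topographic wetness index, ln(a / tan(beta)), for a single cell."""
    tan_beta = math.tan(math.radians(slope_degrees))
    # The index is undefined on perfectly flat cells or where no upslope
    # area drains into the cell, so NaN is returned in this sketch.
    if specific_catchment_area <= 0 or tan_beta <= 0:
        return float("nan")
    return math.log(specific_catchment_area / tan_beta)


# Hypothetical cells with the same upslope area but different slopes.
gentle = wetness_index(500.0, 1.0)
steep = wetness_index(500.0, 30.0)
```

For the same contributing area, the gentler cell receives the higher index value, consistent with low-gradient delta surfaces being more prone to saturation than steep mountain slopes.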

Random Forest Modelling
A machine learning approach was used to classify the combination of the SAR time-series data and terrain metrics. For this, we chose the Random Forest algorithm [74], which is very popular for remote sensing applications due to reported accuracies [75], ability to handle non-normally distributed datasets with high dimensionality and multicollinearity [76], computational efficiency [77], and insensitivity to overfitting [78].
Random Forest is a robust, non-parametric ensemble learning algorithm that combines multiple decision tree models for problem solving using a bootstrap aggregating (i.e., bagging) method [79]. With bagging, each decision tree in the forest uses a random subset of samples drawn from the dataset with replacement, resulting in each tree being unique. This training process uses approximately two-thirds of the samples, while the remaining one-third is employed to independently cross-validate the model's performance. This one-third of samples is referred to as out of bag (OOB). A final decision is then made by majority voting, whereby the membership class with the most prediction votes is selected. This leads to more accurate and stable classification results while mitigating overfitting [79]. The premise here is that a large ensemble of uncorrelated models will perform better than any single model.
To implement Random Forest, we set two key parameters: the first was the number of randomly sampled variables used to split each node of a decision tree (mtry), and the second was the number of generated decision trees (ntree). Rather than arbitrarily setting these parameters, we applied an algorithm-tuning process to find optimal values. A search function was used for mtry, whereby a minimum improvement in error (5%) was required for the search to continue until an optimal value was found, whereas ntree was assessed in incremental steps of 50 trees. Each parameter was assessed using OOB error. Random Forest tuning and classification were run using the randomForest and caret packages in R [68].
A total of 11 Random Forest models were built in this study, each representing different optimally chosen model parameters (i.e., mtry and ntree) and different input data. These various models were then statistically examined, allowing for an understanding of the unique and combined contributions of SAR/InSAR and topographic data for hydro-ecological condition classification.
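The bagging and voting steps described in this section can be illustrated with a short standard-library sketch. It is not the randomForest implementation used in this study: the function names, the fixed seed, and the sample size are assumptions chosen for the example, and no decision trees are actually trained.

```python
import random
from collections import Counter


def bootstrap_split(n_samples, rng):
    """Draw one bootstrap sample and return (in-bag indices, out-of-bag indices)."""
    # Sampling with replacement: some indices repeat, others are never drawn.
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    oob = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, oob


def majority_vote(tree_predictions):
    """Return the class predicted by the most trees (first seen wins a tie)."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
in_bag, oob = bootstrap_split(10_000, rng)
oob_fraction = len(oob) / 10_000

# Hypothetical predictions from three trees for a single pixel.
label = majority_vote(["wet woody", "tundra", "wet woody"])
```

For large sample sizes, the expected share of samples left out of a bootstrap draw approaches 1/e (about 37%), which corresponds to the roughly one-third OOB share described above.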

Accuracy Assessment
Random Forest model classifications were validated using the independent reference data polygons described in Section 2.3. Validation statistics included overall accuracy and per-class precision, recall, and F1 score. Statistical equations for these metrics are the following:

$$\text{Overall Accuracy} = \frac{\text{Number of Correctly Classified Samples}}{\text{Number of Total Samples}} \tag{8}$$

$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{Recall} = \frac{TP}{TP + FN}$$

$$\text{F1 score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

where TP are true positives (correct hit), FP false positives (false alarm), and FN false negatives (miss). The F1 score is the harmonic mean of precision and recall.
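As a worked illustration of these metrics, the following Python sketch computes overall accuracy and per-class precision, recall, and F1 score from paired reference and predicted labels. The function name and the six toy labels are illustrative assumptions and are not data from this study.

```python
def evaluate(y_true, y_pred):
    """Overall accuracy plus per-class precision, recall, and F1 score."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    report = {"overall_accuracy": correct / len(pairs), "per_class": {}}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)  # correct hits
        fp = sum(t != label and p == label for t, p in pairs)  # false alarms
        fn = sum(t == label and p != label for t, p in pairs)  # misses
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report["per_class"][label] = {
            "precision": precision, "recall": recall, "f1": f1,
        }
    return report


# Toy reference and predicted labels, for illustration only.
reference = ["open water", "open water", "tundra", "tundra", "tundra", "wet woody"]
predicted = ["open water", "tundra", "tundra", "tundra", "wet woody", "wet woody"]

result = evaluate(reference, predicted)
print(result["overall_accuracy"])
print(result["per_class"]["open water"])
```

Classes with no predicted or no reference samples are assigned a score of zero here; that convention is a choice of this sketch rather than something stated in the text.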

Temporal Observations of Coherence and Intensity
Prior to classification tests, the temporal evolution of SAR variables was observed. Figure 6 shows boxplots of time-series coherence for each class and image pair. For both VV and VH, coherence was highest for most classes at the beginning of the ice-free season, when discharge and water levels were greatest from runoff and increased precipitation (Figure 5). Coherence was then visibly lowest from 25 July to 6 August during the peak of the summer growing season. This suggests that flooded vegetation (e.g., wet woody areas) maintained consistent double-bounce scattering during very wet periods [44,80,81], whereas vegetation phenological changes (i.e., green-up) and landscape drying resulted in decorrelation. Moreover, image pairs following this period (i.e., during August) showed a minor increasing trend in coherence during a period of relatively stable flow and water levels. For VV, the mountain/unvegetated class exhibited the highest coherence across the time series, followed by tundra and wet woody areas. Dry woody, wet graminoid, and open water areas all displayed low coherence in both VV and VH, around or below 0.3, regardless of image pair. Open water in particular was very low, with coherence values closer to 0.25 for both VV and VH. This is because smooth water bodies reflect the SAR signal away from the sensor, causing decorrelation. Nevertheless, these coherence values were distinctly lower than those of other classes, producing a unique signal for the open water class. With VH, wet woody areas were the most coherent class, followed by tundra and mountain/unvegetated areas. Wet woody surface types were also far more coherent in VV than wet graminoid surface types. Other studies have noted a similar observation [29,31,82] in that flooded woody vegetation (e.g., in swamps or peatlands) produces a greater return signal than herbaceous vegetation due to double-bounce scattering from trunks, branches, and stems. This allows for maintenance of coherence over longer temporal baselines. Several studies have found the co-polarization channel (i.e., VV or HH) of a SAR to maintain better coherence than the cross-polarization channel (i.e., VH or HV), especially for hydrological applications such as inundation monitoring or wetland mapping [82-84] and regardless of wavelength (e.g., X-, C-, or L-band). This is because of the physics of Fresnel reflection from dielectric surfaces, which produces stronger backscatter in the HH channel than the VV [31]. This observation holds true in the case of this study, whereby all six hydro-ecological classes of interest maintained highest coherence in the co-polarization VV (Figure 6).
Figure 7 shows the time series of backscatter intensities (σ°) for each class of interest. The dry woody class had a relatively strong backscatter time series with VV (−9 dB) and VH (−16 dB) maxima on 6 August and lows on 30 August, hence following the peak and conclusion of the growing season. Evidently, the backscatter is influenced by the amount of leaves and branches (i.e., phenology) that cause volume scattering. Wet woody and mountain/unvegetated areas showed a similar trend, with VV and VH backscatter intensity maxima on 6 August. Tundra areas showed the highest VV (−12 dB) and VH (−17 dB) backscatter intensities in early July, followed by a general and minor decline towards the end of August. Earlier onset of greenness for low shrub tundra landcovers has been observed in other remote sensing research [85]. Open water areas showed a pronounced difference in scattering, with the lowest backscatter intensities of all classes in both SAR polarizations, although open water backscatter intensities were stable in the VH time series and highly variable from date to date in VV. Water bodies typically act as specular reflectors of SAR energy (i.e., forward scattering) due to their smooth surfaces [86]. However, wave development and fetch can disrupt the often flat, smooth target geometry of surface water, creating roughened surfaces that increase diffuse scattering. This was observed in the open water VV time series, in which the backscatter patterns were broadened and intensities were highest on image acquisition dates corresponding with high wind speeds (Figure 5b). The wet graminoid class demonstrated the most signal variability, with a strong incline in intensity that reached a maximum VV backscatter of −8 dB on 6 August and VH backscatter of −18 dB on 30 August. These distinct changes in the intensity time series are attributable to an increase in double-bounce scattering from tall, mature graminoids [87], which become more exposed as their phenology changes and water level heights decrease (Figure 8). Under such conditions, when the vegetation-water interface is pronounced and forms a right angle, deflected backscatter from double-bounce (i.e., dihedral) scattering is high [88].

Feature Space Analysis
The feature space positions of each class were also visually observed, prior to classification, for SAR intensity and coherence variables. Figure 9 shows the feature space positions based on the mean value derived from the SAR time-series stacks. Open water showed a clear separability from all other classes when the VV intensity variable was plotted against VH intensity and VV coherence (Figure 9a,c). Combining co- and cross-polarization VV and VH intensity separated the wet graminoid class well (Figure 9a). The wet woody class generally showed high overlap with several classes, both wet and dry, in all feature space plots, although ellipse centers had less overlap when a combination of coherence and intensity variables was plotted (Figure 9c,d). The dry woody class feature cluster center was best separated using the cross-polarization VH coherence and VH intensity (Figure 9d). Mountain/unvegetated areas showed variable feature space positioning in all graphs, although they were best separated using VV and VH coherence (Figure 9b). This aligns with the distinct and high coherence values seen in Figure 6. Tundra areas had relatively high overlap in all feature space plots, indicating the poorest separability. It is likely that a combination of topographic and SAR data, including both intensity and coherence in the co- and cross-polarization channels, is necessary for accurate classification and separability of these detailed hydro-ecological classes.

Effects of Model Hyperparameter Tuning
OOB data were used to obtain an internal and unbiased running estimate of classification error while incrementally adding trees to the forest [89]. This process allowed each Random Forest model to be fine-tuned based on the input variables. Figure 10a-k shows the results of the hyperparameter tuning process for all 11 Random Forest models. The graphs in Figure 10 show the average OOB model error rate curve along with per-class OOB error and the optimal mtry value. mtry values varied by model and ranged from 2 to 4. It is evident that an increase in model complexity (i.e., input variables) reduced overall model error. In general, VV and VH intensity (Figure 10d,e) produced lower OOB error rates than VV and VH coherence (Figure 10b,c). The synergistic use of coherence and intensity greatly decreased OOB error (Figure 10h), which aligns with the findings of previous studies that demonstrated InSAR and backscatter features to be complementary for capturing hydrological patterns [30,44,61,90,91]. Inclusion of topographic data noticeably improved all Random Forest model performances, and topographic data even performed relatively well in isolation, with an average OOB error rate of 11.85% (Figure 10a). This is because Arctic tundra biotic communities establish along environmental gradients, manifesting in clear areal patterns [92]. For low-lying wet areas, topographic variations create plant zonation patterns in response to flooding frequency and duration, and soil moisture. The wettest areas of the Mackenzie Delta support shallow standing water, wetlands, and lakes, whereas slightly elevated areas are subject to variable flooding or pulsing hydroperiods and the ensuing drainage processes, leading to the establishment of wet-tolerant graminoids, shrubs, or trees, depending on elevation (i.e., wet graminoid or wet woody land covers) [93]. The tussock-forming tundra areas are elevated higher outside the delta, with drier conditions that create their own micro-uplands that are distinct from wet areas. The Random Forest model using all intensity, coherence, and topographic variables produced the lowest OOB error of 5.69% (Figure 10k). In several Random Forest models, the wet woody class had the highest per-class OOB error, whereas open water was often the lowest.
Table 1 presents the independent per-class accuracy assessments for all 11 Random Forest model scenarios, which varied based on input predictor variables. Overall, these independent assessments relate closely to the internal OOB model estimates presented in Figure 10. It is apparent from these statistical results that the combined use of intensity, coherence, and topography is required to accurately discriminate the complex hydro-ecological classes of the Mackenzie Delta and surrounding region. Each set of predictor variables offers differing characterization capabilities depending on its sensitivity to hydrological or ecological features.

Classification Accuracy Assessments
For the wet classes (i.e., open water, wet graminoid, and wet woody), the co-polarization VV intensity (model 4) identified open water areas more accurately than coherence or topography, with an F1 score of 0.860. In most cases, the open water class had the highest F1 score of the three wet classes, with model 11 achieving the highest F1 score of 0.955. This was followed by the wet graminoid class; wet graminoid areas were mostly incoherent (Figure 6) and thus VV or VH coherence could not identify them accurately. However, their intensity time series was considerably distinct (Figure 7), especially in VV, where double-bounce scattering from flooded vegetation is more prevalent [94,95] depending on phenological stage. This was reflected in the VV intensity model (model 4), with an F1 score of 0.590. The highest F1 score for the wet graminoid class, 0.921, was achieved with model 11. Overall, wet woody surface types were the most difficult class to classify. This is despite the relatively high coherence of this class (Figure 6). It was only once topographic data were combined with intensity or coherence data (models 9 and 10) that this class could be classified accurately. Model 11 achieved the highest F1 score of 0.682 for the wet woody class.
Most coherence-only or intensity-only SAR scenarios (models 2 to 7) had difficulty classifying dry land cover types (i.e., dry woody, tundra, and mountain/unvegetated). The dry upland classes contained various combinations of short-statured shrubs, trees, or herbaceous vegetation, often producing minor differences in scattering mechanisms and lacking the strong and separable double-bounce signal more commonly associated with flooded states. Ullmann et al. [19] also noted the difficulty in using the VV/VH dual-polarization mode associated with Sentinel-1 for mapping of Tuktoyaktuk's Arctic tundra landcover types; this earlier study suggested that a multi-frequency, multi-polarization, or multi-sensor approach is necessary for such applications. Nonetheless, the combination of both VV and VH coherence and intensity inputs (model 8) did result in relatively adequate F1 scores for the dry upland classes, ranging from 0.402 to 0.747. A multi-source approach with the inclusion of topographic information significantly reduced upland classification confusion, with model 11 resulting in F1 scores of 0.815-0.826 for these landcover types.
Figure 11 presents the overall accuracy statistics for all 11 Random Forest models. SAR intensity models (models 4 and 5) produced higher overall accuracies than SAR coherence models (models 2 and 3). In both cases, combined dual-polarimetric information (i.e., VV and VH; models 6 and 7) from intensity or coherence performed better than use of only one SAR channel (i.e., VV or VH; models 2 to 5). Several previous studies examining InSAR coherence for hydrological applications have included only the co-polarization channel of a SAR (i.e., VV or HH) with the assumption that this channel is more sensitive to surface water and flooded conditions [29-31]. While this is true in many instances, our study highlights the collective contributions of co- and cross-polarization SAR data for hydro-ecological characterization. This aspect of our study is important and can be attributed to the sensitivity of VH to vegetation canopy structures and volume scattering [95,96]. Merging intensity and coherence data (model 8) produced a relatively high overall accuracy of 64%. This SAR-only result is encouraging when considering the complexity of the hydro-ecological landcover classes identified, the spatial extent of the Sentinel-1A scene, and that high-latitude regions are often cloudy with low-light conditions, which limits the use of optical sensors. Mapping of Arctic tundra ecosystems therefore demands the use of SAR methods, which allow for better sampling due to cloud independence [97]. Nevertheless, inclusion of topographic data significantly improved classification results, with model 11 achieving the highest overall accuracy of 84%.
Figure 11. Overall accuracy values for each Random Forest model scenario (see Table 1 for model inputs). Overall accuracy was assessed using independent testing data.

Variable Importance
A significant by-product of Random Forest calculations is the measure of feature importance [79]. This algorithm characteristic is important for understanding parameter predictive power. Model 11 (Figure 12), which used all topographic and SAR time-series predictor variables, was selected for variable importance analysis since it resulted in the highest overall accuracy (84%). For this study, we analyzed predictor power based on the distribution of the average minimal depth for each input variable [98]. The concept of minimal depth provides a measure of the distance of a variable to the root of the tree, thus allowing for an understanding of a variable's role in the model structure and prediction. This is because at each node in the model, a random subset of predictor variables is used to make a split in the data; the most strongly associated variable is the one used to make the split. This indicates that variables closer to the root have stronger predictor power, are more important, and are most strongly associated with the dependent variables (i.e., output classes). Figure 13 displays the top 20 variables from model 11 calculated using top trees. The smaller the mean minimal depth, the more important the predictor variable is. All four topographic variables (elevation, HAND, slope, and the wetness index) appeared among the top 20 variables, further demonstrating the role topography plays in this Arctic tundra landscape [92]. The most important predictor variable was the mean VV intensity, which is understandable due to VV's sensitivity to hydrological conditions including moisture, flooding, and the accompanying double-bounce scattering mechanism. The mean VV coherence was also ranked very high at number three, furthering this understanding. Thus, the temporal signatures (i.e., those that are stable, and those that are dynamic) of VV coherence (e.g., Figure 6) and VV intensity (Figure 7) were captured well within the time-series statistical descriptors, resulting in strong classification predictive power. Nine of the top 20 variables were SAR intensity variables, indicating that SAR intensity provides greater predictor power in this landscape than SAR coherence.
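The minimal depth concept can be illustrated with a short Python sketch. The nested-dictionary tree representation, the two toy trees, and the variable names are hypothetical; they are not the trees or results of model 11. The sketch also assumes that the mean is taken only over trees in which a variable is actually used for splitting, which is one of several possible conventions.

```python
from collections import deque


def minimal_depths(tree):
    """Depth of the shallowest split on each variable (root depth is 0).

    Internal nodes are dicts with "var", "left", and "right" keys;
    leaves are plain class labels.
    """
    depths = {}
    queue = deque([(tree, 0)])
    while queue:
        node, depth = queue.popleft()
        if not isinstance(node, dict):
            continue  # leaf: no split variable
        # Breadth-first order means the first visit is the shallowest one.
        depths.setdefault(node["var"], depth)
        queue.append((node["left"], depth + 1))
        queue.append((node["right"], depth + 1))
    return depths


def mean_minimal_depth(forest):
    """Average minimal depth per variable over the trees that use it."""
    totals, counts = {}, {}
    for tree in forest:
        for var, depth in minimal_depths(tree).items():
            totals[var] = totals.get(var, 0) + depth
            counts[var] = counts.get(var, 0) + 1
    return {var: totals[var] / counts[var] for var in totals}


# Two toy trees for illustration only.
tree_a = {
    "var": "mean_VV_intensity",
    "left": "open water",
    "right": {"var": "elevation", "left": "wet graminoid", "right": "tundra"},
}
tree_b = {
    "var": "elevation",
    "left": {"var": "mean_VV_intensity", "left": "open water", "right": "wet woody"},
    "right": {"var": "mean_VV_coherence", "left": "tundra",
              "right": "mountain/unvegetated"},
}

depth_summary = mean_minimal_depth([tree_a, tree_b])
print(depth_summary)
```

Variables that split at or near the root in many trees receive a small mean minimal depth, which is the sense in which smaller values indicate greater importance.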

Figure 13. The distribution of the minimal depth among trees of the forest for the top 20 predictor variables from model 11. Minimal depth is represented by distinct colors. The mean minimal depth is depicted by the black vertical line and the value is labelled. The X axis ranges from 0 to the maximum number of trees in which any predictor variable was used for splitting.

Limitations and Future Analysis
Supervised data classification using machine learning algorithms such as Random Forest is a convenient and accurate means of land cover determination and delineation. However, there are various limitations that must be considered, such as the potential for overfitting, difficulty in transferability from one site to another, run-time performance, and the requirement for time-consuming and sometimes costly training and testing data preparation [79], whether by in situ field methods, which are considered the "gold standard", or by photointerpretation. The former is incredibly challenging in remote Arctic tundra environments, and the latter can be prone to human error and noise, even for an experienced image analyst. Regardless, quality training data of sufficient size (i.e., number of samples) is a requirement for machine learning algorithms such as Random Forest [99,100]. Without such a dataset, acceptable classification accuracies are difficult to achieve, giving rise to the problematic computational threat of "garbage-in-and-garbage-out" [101]. Our study presents a workflow that identifies the hydro-ecological state of Arctic tundra land covers with both quantitatively and qualitatively (i.e., through visual inspection) acceptable results; however, it is recommended that future research assess unsupervised clustering algorithms. Labelled sample data are sometimes not available, and thus unsupervised learning may present an alternative approach to thematic map generation. For example, in a recent study, Minotti et al. [102] used a Self-Organizing Map (SOM) neural network to cluster InSAR data from Sentinel-1 and examine wetland hydroperiod patterns.
Our mapping approach was also applied using pixel-based image analysis (PBIA) which, when applied to heterogeneous areas such as coastal deltas or Arctic tundra terrain, has limitations. These include the heterogeneity of spatially adjacent pixels, the occurrence of mixed pixels, and the effects of hydrological and ecological differences within a single class, all of which may result in speckled noise [103]. Thus, it is suggested that future work apply object-based image analysis (OBIA) to address the heterogeneity of this dynamic landscape, as OBIA has been shown to increase classification accuracies in many previous studies [75,104].
The availability of SAR satellites with open data policies and short revisit times, such as Sentinel-1A/1B, has made InSAR analysis more realizable for the geospatial community. This has resulted in many recent studies demonstrating the efficacy of time-series InSAR products over a variety of environments. Despite these promising results, readily available coherence products are still limited. For example, the popular cloud-computing platform Google Earth Engine contains only Ground Range Detected (GRD) Sentinel-1 data, meaning the phase information necessary for coherence is not available [105]. As remote sensing analysis moves further away from local desktop processing, widespread adoption of both SAR intensity and coherence may be challenging, despite demonstrated applications. Moreover, the learning curve for InSAR processing may be steep for users more accustomed to conventional intensity products. Fortunately, there are some recent options designed to address the underutilization of InSAR products. One of these is the European Space Agency's Geohazards Thematic Exploitation Platform (GEP), an R&D activity designed for large-scale Earth observation data processing. Millard et al. [67] used this platform for Sentinel-1 InSAR processing and peatland mapping, although they noted that processing options were limited in comparison to dedicated InSAR processing software (e.g., SNAP). Piter et al. [106] discuss other cloud-based platforms, including CODE-DE and the Alaska Satellite Facility's OpenSARLab, and present their advantages and limitations. Improving the adoption of coherence measurements depends on the remote sensing community embracing InSAR cloud computing techniques, and thus future analysis should assess these resources accordingly. Such efforts would relieve SAR/InSAR users of big data downloads and processing time.
Lastly, while Sentinel-1 offers arguably the most consistent and reliable open-access time-series SAR data, its dual-polarimetric channels (i.e., VV and VH) and C-band wavelength are somewhat limiting characteristics. Of the co-polarized SAR channels, HH is favored because of its sensitivity to surface water and flooded conditions [95,107]. As a result, the HH polarization has been shown to produce higher coherence than VV over wet environments such as coastal deltas [83]. The medium-wavelength microwaves of Sentinel-1 (i.e., C-band) are also more sensitive to surface features, which can mask significant coherence information [29]. In contrast, longer-wavelength L-band SAR data, being less sensitive to surface roughness due to canopy penetration capabilities, has proven fruitful for hydrological applications [44]. Future research should emphasize a combined multi-frequency SAR approach for hydro-ecological Arctic tundra mapping. Upcoming L-band SAR missions such as ALOS-4 and the NASA-ISRO SAR (NISAR) offer promising opportunities for this [108].

Conclusions
In this study, we presented a machine learning workflow and preceding analysis using SAR/InSAR time-series products derived from Sentinel-1A for high-latitude hydro-ecological landcover characterization over one growing season. To our knowledge, very little previous research has been dedicated to this over Arctic tundra environments. Moreover, while knowledge of temporal landscape wetness is important for hydrological analysis, this information lacks the additional vegetation detail necessary for accurate greenhouse gas emission modelling, such as carbon and methane. Our study established a methodology capable of deriving this critical hydro-ecological information, which, considering northern ecosystems and their sensitivity to current climate warming, will be important for updating over forthcoming years as permafrost thaw continues to alter Arctic tundra conditions. Key findings from our study included the following:
1. Wet woody, tundra, and mountain/unvegetated landcovers maintained the highest coherence over this study's observation period, whereas wet graminoid, dry woody, and open water landcovers showed the lowest coherence.
2. Coherence was generally highest at the beginning of this study's observation period, when water levels and discharge were high, whereas decorrelation occurred from phenological changes and landscape drying.
3. Open water and wet graminoid landcovers demonstrated the most variability in backscatter intensity.
4. SAR backscatter intensity was able to classify hydro-ecological classes more accurately than InSAR coherence.
5. When intensity and coherence were combined, overall classification accuracies and per-class F1 score values were improved, suggesting that these SAR/InSAR variables are complementary.
6. Inclusion of topographic variables improved all machine learning model outcomes, a result of topography's control on Arctic tundra biotic communities.
7. A combination of coherence, intensity, and topographic variables resulted in the highest overall classification accuracy of 84%.
8. The co-polarized VV channel demonstrated stronger predictor power than the cross-polarized VH.
The Arctic tundra plays a significant role in global climate regulation, thus making the mapping and monitoring of these sensitive environments and their structure and function a significant task that is crucial for human adaptation. Our findings will help advance knowledge around these sensitive ecosystems, providing a means for status and trends updates at a suitable spatial and temporal detail.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.