Article

A Large-Scale Inter-Comparison and Evaluation of Spatial Feature Engineering Strategies for Forest Aboveground Biomass Estimation Using Landsat Satellite Imagery

by John B. Kilbride * and Robert E. Kennedy
Department of Geography, College of Earth, Ocean, and Atmospheric Sciences, Oregon State University, Ocean Administration Building, 104, 101 SW 26th St., Corvallis, OR 97331, USA
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(23), 4586; https://doi.org/10.3390/rs16234586
Submission received: 16 August 2024 / Revised: 28 October 2024 / Accepted: 3 December 2024 / Published: 6 December 2024
(This article belongs to the Section Biogeosciences Remote Sensing)

Abstract
Aboveground biomass (AGB) estimates derived from Landsat’s spectral bands are limited by spectral saturation when AGB densities exceed 150–300 Mg ha⁻¹. Statistical features that characterize image texture have been proposed as a means to alleviate spectral saturation. However, apart from Gray Level Co-occurrence Matrix (GLCM) statistics, many spatial feature engineering techniques (e.g., morphological operations or edge detectors) have not been evaluated in the context of forest AGB estimation. Moreover, many prior investigations have been constrained by limited geographic domains and sample sizes. We utilize 176 lidar-derived AGB maps covering ∼9.3 million ha of forests in the Pacific Northwest of the United States to construct an expansive AGB modeling dataset that spans numerous biophysical gradients and contains AGB densities exceeding 1000 Mg ha⁻¹. We conduct a large-scale inter-comparison of multiple spatial feature engineering techniques, including GLCMs, edge detectors, morphological operations, spatial buffers, neighborhood vectorization, and neighborhood similarity features. Our numerical experiments indicate that statistical features derived from GLCMs and spatial buffers yield the greatest improvement in AGB model performance out of the spatial feature engineering strategies considered. Including spatial features in Random Forest AGB models reduces the root mean squared error (RMSE) by 9.97 Mg ha⁻¹. We contextualize this improvement in model performance by comparing to AGB models developed with multi-temporal features derived from the LandTrendr and Continuous Change Detection and Classification algorithms. The inclusion of temporal features reduces the model RMSE by 18.41 Mg ha⁻¹. When spatial and temporal features are both included in the model’s feature set, the RMSE decreases by 21.71 Mg ha⁻¹. We conclude that spatial feature engineering strategies can yield nominal gains in model performance. However, this improvement comes at the cost of increased model prediction bias.

1. Introduction

Forests are a foundational component of the carbon cycle and support crucial ecosystem services [1,2]. Most terrestrial carbon is stored in forests, but precisely quantifying aboveground biomass (AGB) densities is challenging due to the scarcity of direct biomass measurements and the expense of acquiring ground truth data [3,4]. Remote sensing technologies can leverage sparse reference datasets to generate mapped estimates of AGB [5,6,7]. Among these technologies, lidar (light detection and ranging) stands out for its ability to precisely characterize forest height, making it a valuable tool for modeling and mapping AGB density [8,9,10]. However, lidar data are expensive to acquire and cannot feasibly be used for the continual monitoring of large areas or for a comparison of historical to present conditions.
Landsat satellite imagery, with its broad spatial coverage and historical depth, has emerged as a fundamental tool for monitoring forest ecosystems [5,11,12,13,14]. However, the relationship between moderate resolution spectral bands/indices and AGB saturates as biomass density increases, leading to a loss in response as AGB densities approach 200–350 Mg ha⁻¹ [15,16,17]. To alleviate spectral saturation, new statistical features are engineered. Feature engineering is a process through which the original feature data (i.e., the spectral bands) are mathematically transformed and/or combined into statistical features. Multi-temporal features that quantify the timing, severity, and recovery associated with previous disturbance events have been extensively evaluated and found to improve the accuracy of AGB models, particularly in high biomass forests dominated by stand-replacement disturbances [5,18,19].
Spatial features, mathematical descriptors of the spatial arrangement of values within a digital image, have been used to alleviate spectral saturation [16,20,21,22]. Gray Level Co-Occurrence Matrices (GLCMs) are one of the oldest and most ubiquitous methodologies for extracting spatial features from satellite imagery [23]. GLCM features provide statistical descriptions of a joint distribution of pixel values, quantifying how frequently particular digital image values are observed adjacent to other values within a spatial neighborhood [23]. Previous research suggests that GLCM texture features improve the correlation between the model predictions derived from moderate resolution multispectral data and field measurements of AGB [20,21,22,24,25,26,27].
Despite the reported success of GLCM features, there has been little research investigating alternative or combined spatial feature engineering techniques. For instance, our search of the literature failed to identify any studies evaluating the use of morphological operations or edge detectors for estimating AGB with Landsat satellite imagery. Many of the biophysical interpretations attributed to GLCM features could be readily applied to edge detectors, morphological operators, and other methodologies that summarize information about a local neighborhood of values. For example, Lu [21] noted a correspondence between GLCM-derived entropy (the degree of randomness in spatial co-occurrence of different values) and the ability to distinguish between successional and mature tropical forest stands by capturing information about canopy shadowing. Many of the aforementioned alternatives to GLCM features also compute features that characterize local variance in a digital image’s values, and thus could potentially yield features that capture information about forest stand development.
Multi-temporal features that characterize past disturbance events and the subsequent recovery of live AGB have been extensively explored using Landsat satellite imagery [7,18,19,28]. The long history of the Landsat program makes it the only dataset with a sufficient length and spatial resolution to characterize changes in forest ecosystems [14]. Present-day AGB densities often reflect the severity and timing of past disturbance events [5,18,19]. Two algorithms, LandTrendr [29,30] and the Continuous Change Detection and Classification (CCDC) algorithm [31,32], have been widely used to characterize forest disturbances and derive covariates for modeling forest biophysical characteristics [19,33,34,35]. Disturbance and recovery information derived from these algorithms are particularly effective in areas with large AGB densities and where the disturbance regime is dominated by stand-replacing disturbances [18,19]. We contextualize the improvements in AGB model performance when spatial features are included in the feature set by developing models that incorporate temporal features and combinations of spatial and temporal features.
The Random Forest (RF) algorithm is ubiquitous in remote sensing analyses due to its lack of distributional assumptions, ability to identify non-linear interactions among features, and relative computational efficiency [36,37]. However, like most machine learning algorithms, RF possesses hyperparameters (i.e., model parameters that are not learned during training) that must be optimized to maximize model performance. In remote sensing studies, RF is typically assumed to be robust to hyperparameter selection [5,18,19,25,28,38,39,40]. Nevertheless, as we demonstrate, this assumption can lead to decreased model performance and introduce an uncontrolled source of variation into comparisons.
When quantifying the performance of remote sensing models, the reference dataset should encompass the complete range of possible values that the biophysical parameter can assume and span the relevant climatological and topographic gradients. We utilize a database of 176 lidar-derived AGB maps distributed throughout the Northwest United States [6,41]. Using these lidar-derived AGB maps, we construct a large modeling dataset (n = 307,500) to explore spatial feature engineering strategies for AGB modeling. The study area of this analysis covers numerous biophysical gradients and includes dense, high-productivity coniferous forests in the Coast Range and Cascade Mountains, as well as sparse, fire-adapted forest ecosystems.
Our primary objective is to clarify the utility of spatial feature engineering methodologies when estimating forest AGB. First, we examine different methods for extracting spatial features to identify which feature engineering strategies yield the greatest improvement for AGB models. Second, we develop models using combinations of spectral, spatial, temporal, and topographic features to determine which feature engineering strategies produce the greatest reduction in AGB model error. Third, we evaluate the impact of hyperparameter optimization on RF AGB model development. Specifically, we answer the following research questions:
  • Which spatial feature engineering strategies yield the greatest improvement in AGB model performance?
  • Does combining multiple feature engineering strategies result in significantly better performance than utilizing a single strategy?
  • How do spatial statistical features compare with multi-temporal features?
  • Does hyperparameter optimization significantly impact RF AGB model performance?
Given the extensive body of literature utilizing GLCM features, we hypothesize that GLCM features will prove to be the most effective spatial descriptors of AGB.

2. Methods

2.1. Study Area

The study area for this analysis consists of forested landscapes in Oregon, Washington, Idaho, and Eastern Montana (Figure 1). Due to orographic effects, there is a pronounced east–west precipitation gradient. Forests in the Coast Range and on the western slopes of the Cascades receive 800–3000 mm of precipitation annually [42,43], while forests in Idaho and Montana receive considerably less precipitation (<400 mm a year) [6]. The Douglas-fir (Pseudotsuga menziesii) and western hemlock (Tsuga heterophylla) forests on the Olympic Peninsula in Washington, in the Oregon and Washington Coast Range, and in the Oregon and Washington Cascades are among the most productive forests on Earth. The AGB densities in these forests can exceed 1200 Mg ha⁻¹ [44]. These high-biomass forests contrast strongly with the open, fire-adapted lodgepole pine (Pinus contorta) and ponderosa pine (Pinus ponderosa) forests found in the rain shadow of the Cascades.

2.2. LiDAR AGB Estimates

A set of 176 lidar-derived AGB maps, distributed throughout Oregon, Washington, Idaho, and Montana, was used to construct the AGB reference dataset (Figure 1) [6,41]. These lidar acquisitions were obtained between 2002 and 2016, and the derived AGB maps have a spatial resolution of 30 m. The lidar AGB maps were developed using a landscape-scale RF AGB model that related plot-level AGB densities (n = 3805; Fekety et al. [45]) to lidar height features, DEM features, and climate features. We used these lidar-derived AGB maps because they enabled us to extract a large sample of reference values across the numerous environmental gradients in the study region. We post-processed the lidar AGB maps to isolate only high-quality observations. We computed binary forest cover maps for 2000–2016 from the Landscape Change Monitoring System (LCMS) [46] (v2021.7) dataset. Pixels in the LCMS dataset with a forest probability <0.5 were labeled as non-forest. The non-forested areas were masked from the AGB maps using the binary forest cover map from the same year as the lidar acquisition. Lastly, a 30 m building/no-building mask of Oregon, Washington, Idaho, and Montana was derived from the Microsoft Building Footprints dataset and applied to each of the AGB maps [47].

2.3. Reference Dataset

We sampled reference AGB values for our numerical experiments from the pre-processed lidar AGB maps. We then used these reference values to develop Landsat-based AGB models. We utilized a training/development/testing split to construct and validate the Landsat-based AGB models. The development set was used to assess model performance during hyperparameter tuning. The testing set was used to produce a final estimate of the model’s error. First, a single AGB layer was produced by compositing the 176 lidar maps, selecting at each XY location the most recent AGB estimate. The composited AGB map covered 9,361,622 ha of forested land. Second, the AGB layer was stratified into 30 bins using 25 Mg ha⁻¹ increments. Next, the testing set was constructed by drawing 250 samples from each stratum (n = 7500). Then, the training and development samples were drawn. Training and development samples were not drawn within 500 m of any test-set samples to ensure our estimate of the models’ generalization error was not impacted by spatial autocorrelation (Figure 2). A threshold of 500 m was selected because previous analyses concluded that the structure of Pacific Northwest forests varies at spatial lags <500 m [48,49,50]. Finally, 10,000 samples were drawn from each of the 30 strata (n = 300,000) and were allocated, by stratum, to the training set (n = 240,000) and the development set (n = 60,000) using an 80–20% split.
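For illustration, this sampling workflow can be sketched in Python; the snippet below is a minimal sketch that uses synthetic stand-ins for the composited AGB layer (the array names and synthetic values are hypothetical, not the actual dataset):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(42)

# Hypothetical stand-ins for the composited lidar AGB map: per-pixel AGB (Mg/ha)
# and projected x/y coordinates in meters.
agb = rng.uniform(0, 750, size=500_000)
xy = rng.uniform(0, 300_000, size=(agb.size, 2))

strata = np.minimum((agb // 25).astype(int), 29)      # 30 strata of 25 Mg/ha

# 1) Test set: 250 samples per stratum (n = 7,500).
test_idx = np.concatenate([
    rng.choice(np.flatnonzero(strata == s), size=250, replace=False)
    for s in range(30)
])

# 2) Exclude every pixel within 500 m of a test sample to limit the effect of
#    spatial autocorrelation on the generalization-error estimate.
dist_to_test, _ = cKDTree(xy[test_idx]).query(xy, k=1)
eligible = dist_to_test > 500.0

# 3) Training/development sets: 10,000 samples per stratum, split 80/20.
train_idx, dev_idx = [], []
for s in range(30):
    pool = rng.permutation(np.flatnonzero(eligible & (strata == s)))[:10_000]
    train_idx.append(pool[:8_000])
    dev_idx.append(pool[8_000:])
train_idx, dev_idx = np.concatenate(train_idx), np.concatenate(dev_idx)
```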

2.4. Landsat Satellite Imagery

Satellite imagery was acquired from the United States Geological Survey’s (USGS) Landsat Collection 2 Tier 1 Surface Reflectance (C2SR) dataset. All scenes acquired between 1990 and 2022 that intersected the study area were selected for analysis. Three time series of C2SR scenes were derived: (1) a time series of annual, medoid-composited images [51], (2) a time series of all Landsat scenes, and (3) a time series of all Landsat scenes obtained between 1 May and 30 November. The medoid composites were produced using imagery obtained from the period of maximum phenological activity (1 June to 1 September). The medoid composite time series was used to derive the spatial statistical features and the LandTrendr-derived temporal features. The two time series consisting of non-composited imagery were used to compute temporal features using the CCDC algorithm. The Google Earth Engine (GEE) cloud computing platform was used to perform the image processing and the subsequent derivation of statistical features [52].
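The medoid compositing step can be sketched with the GEE Python API. The snippet below uses a common approximation that keeps, for each pixel, the observation closest (in squared Euclidean distance across bands) to the per-band median; the collection ID, single year, and omission of cloud masking are illustrative simplifications:

```python
import ee
ee.Initialize()

BANDS = ['SR_B2', 'SR_B3', 'SR_B4', 'SR_B5', 'SR_B6', 'SR_B7']  # Landsat 8/9 C2SR bands

def medoid_composite(collection, bands=BANDS):
    """Approximate medoid composite: per pixel, keep the observation whose band
    values are closest (squared Euclidean distance) to the per-band median."""
    median = collection.select(bands).median()

    def score(img):
        dist = img.select(bands).subtract(median).pow(2).reduce(ee.Reducer.sum())
        # qualityMosaic() keeps the observation maximizing a band, so negate the distance.
        return img.select(bands).addBands(dist.multiply(-1).rename('neg_dist'))

    return collection.map(score).qualityMosaic('neg_dist').select(bands)

# Usage: one growing-season composite for a single year (1 June-1 September window).
l8 = (ee.ImageCollection('LANDSAT/LC08/C02/T1_L2')
        .filterDate('2016-06-01', '2016-09-01'))
composite_2016 = medoid_composite(l8)
```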

2.5. Feature Engineering

2.5.1. Overview

The primary goal of this analysis was to determine which spatial feature engineering strategies yielded the greatest improvement in the accuracy of AGB models. To accomplish this, we developed an extensive set of predictive features comprising spectral (n = 14), spatial (n = 420; excluding duplicates in the neighborhood vectorization group), temporal (n = 152), and topographic (n = 5) features (Figure 3). We additionally compared how combining multiple feature engineering strategies impacted model performance. The spectral, spatial, and temporal features were all derived from Landsat satellite imagery. Topographic features were derived from a digital elevation model (DEM). A time series (2002–2016) of statistical features (described in Section 2.5.2 through Section 2.5.9) was generated over the study domain. Features were extracted at each location in the training, development, and testing sets by sampling from the features that are temporally concurrent with the lidar AGB map from which the samples were drawn. All statistical features used in this analysis are described in Appendix A.

2.5.2. Baseline Features

To assess the impact of spatial feature engineering and temporal feature engineering, we first developed a set of baseline model features. These baseline features represent spectral bands/indices and topographic features that almost all Landsat-based AGB models integrate [7,18,38,53]. The spectral features were derived from LandTrendr-fitted imagery [54,55,56]. Topographic features were derived from the USGS 3D Elevation Program’s (3DEP) 1/3 arc-second (10 m) digital elevation model (DEM) [57]. In total, 19 baseline features were computed (Table A1).
LandTrendr is a disturbance detection algorithm designed to operate on time series of annual Landsat imagery [17,29]. LandTrendr models the time series of spectral reflectance values at each pixel location using a piecewise linear model (see Kennedy et al. [29] for details). LandTrendr was run using the Normalized Burn Ratio (NBR; Key and Benson [58]). From the LandTrendr outputs, we derived linearly interpolated (i.e., fitted) values for the six spectral bands (blue, green, red, near-infrared, and shortwave-infrared 1 and 2) (see Hopkins et al. [56] or Johnston et al. [50] for details). Fitted values smooth ephemeral variations in the time series of annual composites introduced by phenomena such as variations in image acquisition timing. From the time series of fitted images, we derived the spectral and spatial features.
Eight spectral indices/transformations were derived from the time series of LandTrendr-fitted imagery. The Tasseled Cap transformation was used to derive tasseled cap brightness (TCB), greenness (TCG), and wetness (TCW) [59]. Additionally, we computed the Tasseled Cap Angle (TCA), a combination of TCB and TCG that characterizes the proportion of vegetated to non-vegetated area in a pixel [38]. We computed four vegetation indices: the Normalized Difference Vegetation Index (NDVI), the Enhanced Vegetation Index (EVI), the Normalized Difference Moisture Index (NDMI), and NBR [58,60,61,62]. In total, 14 spectral features were computed (Table A1).
Topographic features were derived from the USGS 3DEP 10 m DEM. The 10 m DEM was resampled to a 30 m resolution using bilinear interpolation. The 3DEP DEM is principally derived from aerial lidar acquisitions with gaps filled with Shuttle Radar Topography Mission data [63]. From the 3DEP DEM, we derived 5 features characterizing topography: elevation, slope, cosine embedding of slope aspect, sine embedding of slope aspect, and a hillshade layer (Table A1). The sine and cosine embeddings are a decomposition of the slope aspect given by
$\mathrm{Aspect}_{\sin} = \sin\left( \mathrm{Aspect} \times \frac{\pi}{180} \right)$
$\mathrm{Aspect}_{\cos} = \cos\left( \mathrm{Aspect} \times \frac{\pi}{180} \right)$
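For completeness, this decomposition corresponds to the following NumPy sketch (the input aspect raster is assumed to be in degrees):

```python
import numpy as np

def aspect_embedding(aspect_deg: np.ndarray):
    """Decompose slope aspect (degrees) into sine and cosine components so the
    circular variable (0 deg == 360 deg) is represented continuously."""
    radians = np.deg2rad(aspect_deg)
    return np.sin(radians), np.cos(radians)

aspect_sin, aspect_cos = aspect_embedding(np.array([0.0, 90.0, 180.0, 359.0]))
```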

2.5.3. Buffer Features

A simple technique for characterizing local variation in a digital image is to use buffers that summarize basic statistics about the distribution of values [56,64]. By using circular kernels of different sizes, true buffers can be efficiently approximated and used to produce maps of the features for inference. We derived the local mean and standard deviation of the LandTrendr-fitted TCB, TCG, and TCW medoids using 3 × 3, 7 × 7, 11 × 11, and 15 × 15 circular kernels, which correspond, approximately, to buffers with radii of 45 m, 105 m, 165 m, and 225 m, respectively. In total, 24 buffer features were computed (Table A2).
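A minimal sketch of these buffer features with the GEE Python API, assuming `tc_img` is an image whose bands are the LandTrendr-fitted TCB, TCG, and TCW values (the band names and the kernel radii in pixels are assumptions based on the 30 m grid):

```python
import ee
ee.Initialize()

def buffer_features(tc_img, radii_px=(1.5, 3.5, 5.5, 7.5)):
    """Approximate circular buffers with circular kernels and summarize the local
    mean and standard deviation of each tasseled cap band."""
    reducer = ee.Reducer.mean().combine(ee.Reducer.stdDev(), sharedInputs=True)
    out = []
    for r in radii_px:  # 1.5 px ~= 45 m ... 7.5 px ~= 225 m at 30 m resolution
        kernel = ee.Kernel.circle(radius=r, units='pixels')
        out.append(tc_img.reduceNeighborhood(reducer=reducer, kernel=kernel))
    return ee.Image.cat(out)

# tc_img is assumed to contain bands named 'TCB', 'TCG', and 'TCW'.
# buffered = buffer_features(tc_img)
```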

2.5.4. Gray Level Co-Occurrence Matrix Features

GLCM features were derived from the LandTrendr-fitted TCB, TCG, and TCW medoids [23]. GLCM features summarize the joint distribution of adjacent gray-tone values within a moving window (see Haralick et al. [23] for details). The tasseled cap components were quantized to 128 values (0–127) by linearly re-scaling the range of each band using its 1st and 99th percentiles prior to the computation of the GLCM features [65]. We computed 8 GLCM metrics (angular second moment, contrast, correlation, variance, inverse difference moment, sum average, entropy, and inertia) using the default settings for the GLCM function implemented in GEE. These metrics were computed using three moving window sizes (3 × 3, 5 × 5, and 7 × 7). In total, 72 GLCM features were computed (Table A3).
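A sketch of the quantization and GLCM computation with the GEE Python API; `glcmTexture()` is GEE's built-in GLCM implementation, and the mapping of its `size` argument to 3 × 3, 5 × 5, and 7 × 7 windows is an assumption of this sketch:

```python
import ee
ee.Initialize()

def quantize_0_127(img, band, region, scale=30):
    """Linearly rescale a band to 0-127 using its 1st/99th percentiles."""
    pcts = img.select(band).reduceRegion(
        reducer=ee.Reducer.percentile([1, 99]), geometry=region,
        scale=scale, maxPixels=1e10)
    lo = ee.Number(pcts.get(band + '_p1'))
    hi = ee.Number(pcts.get(band + '_p99'))
    return (img.select(band).subtract(lo).divide(hi.subtract(lo))
               .clamp(0, 1).multiply(127).toInt())

def glcm_features(img, bands, region):
    out = []
    for band in bands:
        quantized = quantize_0_127(img, band, region)
        for radius in (1, 2, 3):  # assumed to correspond to 3x3, 5x5, 7x7 windows
            out.append(quantized.glcmTexture(size=radius))
    return ee.Image.cat(out)

# glcm_img = glcm_features(tc_img, ['TCB', 'TCG', 'TCW'], study_area_geometry)
```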

2.5.5. Edge Detector Features

Edge detector features were computed using the LandTrendr-fitted TCB, TCG, and TCW medoids. Four operators that characterize the first spatial derivative of the digital image values, the Roberts operator [66], the Sobel operator [67], the Prewitt operator [68], and the Kirsch operator [69], were used to extract features. Each of these operators was applied twice, once using the default settings and again after rotating the kernels by 90 degrees. Information about the second spatial derivative was obtained by applying a 3 × 3 Laplacian edge detector with 4-connected and 8-connected kernels. A Canny edge detector [70] was applied to the TCB, TCG, and TCW bands, linearly quantized to the range 0–127 in the same manner as for the GLCM metrics, using three thresholds: 16, 32, and 64 units. In total, 39 edge detector features were computed (Table A4).
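These edge-detector features can be sketched with the GEE Python API's built-in kernels and Canny implementation; the image names are placeholders, and the 90-degree rotation is applied with `ee.Kernel.rotate()`:

```python
import ee
ee.Initialize()

def edge_features(band_img, quantized_img):
    """Edge features for a single tasseled cap band. `band_img` is the fitted band;
    `quantized_img` is the same band rescaled to 0-127."""
    feats = []

    # First-derivative operators, applied as-is and rotated by 90 degrees.
    for make_kernel in (ee.Kernel.roberts, ee.Kernel.sobel,
                        ee.Kernel.prewitt, ee.Kernel.kirsch):
        kernel = make_kernel()
        feats.append(band_img.convolve(kernel))
        feats.append(band_img.convolve(kernel.rotate(1)))

    # Second-derivative (Laplacian) operators with 4- and 8-connected kernels.
    feats.append(band_img.convolve(ee.Kernel.laplacian4()))
    feats.append(band_img.convolve(ee.Kernel.laplacian8()))

    # Canny edge detector on the quantized band at three thresholds.
    for threshold in (16, 32, 64):
        feats.append(ee.Algorithms.CannyEdgeDetector(
            image=quantized_img, threshold=threshold, sigma=1))

    return ee.Image.cat(feats)
```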

2.5.6. Morphological Features

Morphological operations are image processing techniques that draw upon mathematical set theory to characterize shapes in digital imagery [71]. Plaza et al. [72] proposed a framework for morphological analysis that generalizes traditional gray-tone or binary morphological image analysis to multichannel imagery. Pixel observations within a moving window are ranked according to a user-specified distance metric. The multichannel image can then be eroded (i.e., the selection of the lowest ranked vector within the spatial neighborhood) and dilated (i.e., the selection of the highest ranked vector within the spatial neighborhood). Additionally, the gradient between the eroded and dilated images can be computed. We processed the LandTrendr-fitted TCW, TCB, and TCG medoids using 3 × 3, 5 × 5, and 7 × 7 moving windows and computed the erosion, dilation, and gradient using four distance metrics: the spectral angle distance [72], spectral information divergence [73], squared Euclidean distance, and Earth mover’s distance. In total, 84 morphological features were computed (Table A5).
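A minimal NumPy sketch of this multichannel erosion/dilation/gradient, ranking the vectors in each window by their cumulative spectral angle distance to the other vectors in that window; it follows the description above rather than any particular library implementation and is written for clarity, not speed:

```python
import numpy as np

def spectral_angle(a, b):
    """Spectral angle distance between two pixel vectors."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return np.arccos(np.clip(cos, -1.0, 1.0))

def morph_window(vectors):
    """Erode/dilate one flattened window of pixel vectors (k x bands)."""
    k = vectors.shape[0]
    cum_dist = np.array([sum(spectral_angle(vectors[i], vectors[j]) for j in range(k))
                         for i in range(k)])
    eroded = vectors[np.argmin(cum_dist)]     # lowest ranked vector
    dilated = vectors[np.argmax(cum_dist)]    # highest ranked vector
    gradient = spectral_angle(eroded, dilated)
    return eroded, dilated, gradient

def multichannel_morphology(image, window=3):
    """Apply windowed erosion/dilation/gradient over a (rows, cols, bands) image.
    Border pixels are left as zeros for simplicity."""
    half = window // 2
    rows, cols, bands = image.shape
    eroded = np.zeros_like(image)
    dilated = np.zeros_like(image)
    gradient = np.zeros((rows, cols))
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            win = image[r - half:r + half + 1, c - half:c + half + 1].reshape(-1, bands)
            eroded[r, c], dilated[r, c], gradient[r, c] = morph_window(win)
    return eroded, dilated, gradient
```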

2.5.7. Neighborhood Vectorization Features

Instead of designing features that summarize the spatial–spectral relationships among neighboring pixels, we can directly include all satellite measurements within a given n × n neighborhood about each reference location. For example, given a 3-channel multispectral satellite image, a set of 27 features can be extracted by flattening a 3 × 3 × 3 neighborhood of multispectral values about each location. While this approach has the benefit of being conceptually simple, it can result in a large number of features for larger window sizes. Additionally, this approach relies entirely upon the model to discover the non-linear interactions from the structured inputs to estimate AGB density. Neighborhood vectorization (NV) features were derived from the fitted TCB, TCG, and TCW values using 3 × 3 (n = 27), 5 × 5 (n = 75), and 7 × 7 neighborhoods (n = 147). In total, 147 unique neighborhood vectorization features were computed (Table A7).
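A NumPy sketch of neighborhood vectorization using `sliding_window_view`; for a 3-band image and a 3 × 3 window this yields the 27 features per pixel noted above (only the valid interior region is returned, with no padding):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def neighborhood_vectorization(image, window=3):
    """Flatten each window x window x bands neighborhood into one feature vector.
    `image` has shape (rows, cols, bands); output has shape
    (rows - window + 1, cols - window + 1, window * window * bands)."""
    patches = sliding_window_view(image, (window, window), axis=(0, 1))
    # patches shape: (rows - w + 1, cols - w + 1, bands, w, w); flatten the last three axes.
    return patches.reshape(patches.shape[0], patches.shape[1], -1)

# Example: a synthetic 3-band image standing in for fitted TCB/TCG/TCW values.
tc = np.random.rand(100, 100, 3)
nv_features = neighborhood_vectorization(tc, window=3)   # shape (98, 98, 27)
```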

2.5.8. Neighborhood Similarity Features

We considered an extension of NV, wherein the spatial neighborhood is summarized by its similarity to the centroid of the neighborhood. First, the observations in an n × n neighborhood were ranked by their Euclidean distance (in multispectral space) from the centroid of the neighborhood. Then, we subtracted the centroid from each of the ranked observations. Finally, we summarized the mean and standard deviation of the top 25%, top 50%, and top 75% of the ranked observations. This produced a set of features that characterizes, across different window sizes, the similarity of nearby observations to the centroid. We derived neighborhood similarity (NS) features from the fitted TCB, TCG, and TCW composites using 5 × 5, 7 × 7, and 11 × 11 neighborhoods. In total, 54 NS features were computed (Table A6).
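A NumPy sketch of the NS features for a single window, assuming the window has been flattened to a (k, bands) array and taking the "centroid" to be the window's center pixel (one plausible reading of the description above); per-band means and standard deviations of the centroid-subtracted, ranked observations are returned:

```python
import numpy as np

def neighborhood_similarity(window_vectors, centroid, fractions=(0.25, 0.50, 0.75)):
    """Summarize how similar a window's pixel vectors are to a reference centroid.

    window_vectors: (k, bands) array of pixel vectors in one n x n window.
    centroid: (bands,) reference vector (assumed here to be the center pixel).
    """
    dists = np.linalg.norm(window_vectors - centroid, axis=1)
    residuals = window_vectors[np.argsort(dists)] - centroid   # most similar first
    feats = []
    for frac in fractions:
        top = residuals[: max(1, int(round(frac * len(residuals))))]
        feats.extend(top.mean(axis=0))   # per-band mean of centroid-subtracted values
        feats.extend(top.std(axis=0))    # per-band standard deviation
    return np.array(feats)               # 3 fractions x 2 stats x bands features

# Example: a 5 x 5 window of 3-band values with the center pixel as the centroid.
win = np.random.rand(5, 5, 3)
ns = neighborhood_similarity(win.reshape(-1, 3), centroid=win[2, 2])
```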

2.5.9. Temporal Features

Temporal features characterizing the duration and severity of past disturbance events and forest recovery are commonly used to improve the accuracy of forest structure and land use/land cover models [7,19,74,75]. In total, 152 temporal features were computed using the LandTrendr and CCDC algorithms.
One set of temporal features was derived from an ensemble of LandTrendr models. Combining multiple LandTrendr models developed using different spectral inputs can produce a more robust characterization of past deforestation [34,35]. We developed six LandTrendr models using three spectral indices (TCA, TCW, and NBR) and two LandTrendr parametrizations. We derived features that quantify the largest gain/recovery segment and the largest disturbance segment from each LandTrendr model. Each of these segments was characterized by its magnitude, duration, pre-disturbance spectral value, and signal-to-noise ratio [18,35]. In total, 48 LandTrendr disturbance and recovery features were computed (Table A9).
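One member of such an ensemble can be sketched with the GEE Python API's built-in LandTrendr implementation; the parameter values shown are illustrative, not the exact parametrizations used:

```python
import ee
ee.Initialize()

def run_landtrendr(annual_collection):
    """Run one LandTrendr segmentation; `annual_collection` is an ee.ImageCollection
    of annual composites whose first band is the index to segment (e.g., NBR)."""
    return ee.Algorithms.TemporalSegmentation.LandTrendr(
        timeSeries=annual_collection,
        maxSegments=6,
        spikeThreshold=0.9,
        vertexCountOvershoot=3,
        preventOneYearRecovery=True,
        recoveryThreshold=0.25,
        pvalThreshold=0.05,
        bestModelProportion=0.75,
        minObservationsNeeded=6)

# The ensemble described above repeats this call for TCA, TCW, and NBR time series
# under two parametrizations, then extracts the largest disturbance and recovery
# segments from each output.
```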
The CCDC algorithm was designed to operate on intra-annual time series of Landsat imagery [31]. The CCDC algorithm models the time series at each pixel location using piecewise harmonic models. Following Arévalo et al. [76], all scenes with more than 80% cloud coverage were removed from the two time series. CCDC was run twice, once using the time series consisting of all available images and again using a time series of Landsat scenes obtained between 1 May and 30 November. We produced two sets of CCDC metrics, as we observed that the presence of snow introduced abrupt discontinuities in the intra-annual surface reflectance trends. This negatively impacted the CCDC algorithm’s ability to capture the harmonic signal associated with vegetation phenology and vegetation type. CCDC was run using blue, green, red, near-infrared, and the shortwave-infrared 1 and 2 bands with the default GEE parameters. Using the CCDC Toolbox [77], we extracted features characterizing the largest disturbance event, the number of years since the largest disturbance, the harmonic coefficients fitted by the CCDC algorithm for the most recent segment (i.e., the year for which the features were being calculated), and the root mean squared error (RMSE) of the CCDC fit. In total, 104 CCDC features were computed (Table A8).

2.6. Random Forest Algorithm

AGB models were developed using the SciKit-Learn library’s RandomForestRegressor implementation of the RF algorithm [78] in the Python programming language (v3.10). Random Forest is a ubiquitous algorithm in the field of remote sensing [36] and can effectively model high-dimensional data with multicollinear features. The SciKit-Learn RF implementation possesses four hyperparameters that most strongly influence the trained model’s structure. The max_depth parameter controls the maximum allowable depth of each decision tree. The min_samples_split parameter determines the minimum number of examples that must exist in a node for a split to be made. The min_samples_leaf parameter restricts the minimum number of samples that must exist in each leaf produced by a candidate split. The max_features parameter determines the proportion of features that are randomly selected as candidates when determining a split. All RF models were developed using 500 decision trees.
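These four hyperparameters map directly onto the `RandomForestRegressor` constructor; a minimal sketch with placeholder values (not the tuned values) is shown below:

```python
from sklearn.ensemble import RandomForestRegressor

# Placeholder hyperparameter values; the tuned values differed by feature group.
rf = RandomForestRegressor(
    n_estimators=500,        # all models in the analysis used 500 trees
    max_depth=25,            # maximum allowable depth of each decision tree
    min_samples_split=5,     # minimum samples in a node for it to be split
    min_samples_leaf=2,      # minimum samples allowed in each leaf
    max_features=0.33,       # fraction of features considered at each split
    n_jobs=-1,
    random_state=42,
)
# rf.fit(X_train, y_train); predictions = rf.predict(X_dev)
```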

2.7. Bayesian Hyperparameter Optimization

Bayesian optimization is a sequential model selection strategy in which successive evaluations of the model are used to identify a hyperparameter set that maximizes model performance according to an objective function [79,80]. Bayesian optimization features two key components: a Gaussian process model and an acquisition function. The Gaussian process model is used to construct a posterior distribution of the objective function over the hyperparameter space (Table 1). The objective function characterizes the performance of the model; here, the mean squared error (MSE) was used. The acquisition function uses the information from the posterior distribution to select the next candidate hyperparameter set to be evaluated (see Frazier [80] for details). During each evaluation, one of three acquisition functions was randomly selected: the lower confidence bound, the probability of improvement, and the expected improvement. This results in a more comprehensive exploration of the hyperparameter space. Hyperparameter optimization was conducted using the SciKit-Optimize Python package’s “gp_minimize” function with default settings and 100 model evaluations [81]. The optimization procedure was initialized using 10 randomly selected hyperparameter configurations.
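A sketch of this procedure with SciKit-Optimize; `gp_minimize` with its default `acq_func='gp_hedge'` randomly alternates among the lower confidence bound, expected improvement, and probability of improvement, which matches the acquisition strategy described above (the search bounds and data arrays are illustrative, not those in Table 1):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from skopt import gp_minimize
from skopt.space import Integer, Real
from skopt.utils import use_named_args

# Illustrative search space; the actual bounds used are listed in Table 1.
space = [
    Integer(5, 50, name='max_depth'),
    Integer(2, 20, name='min_samples_split'),
    Integer(1, 20, name='min_samples_leaf'),
    Real(0.1, 1.0, name='max_features'),
]

# Placeholder arrays standing in for the training/development feature matrices.
X_train, y_train = np.random.rand(1000, 20), np.random.rand(1000) * 750
X_dev, y_dev = np.random.rand(250, 20), np.random.rand(250) * 750

@use_named_args(space)
def objective(**params):
    """Train an RF with the candidate hyperparameters and score it on the development set."""
    rf = RandomForestRegressor(n_estimators=500, n_jobs=-1, random_state=42, **params)
    rf.fit(X_train, y_train)
    return mean_squared_error(y_dev, rf.predict(X_dev))

result = gp_minimize(
    objective, space,
    n_calls=100,            # 100 model evaluations
    n_initial_points=10,    # 10 random initial configurations
    acq_func='gp_hedge',    # randomly alternates LCB / EI / PI
    random_state=42,
)
best_params = dict(zip([dim.name for dim in space], result.x))
```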

2.8. Experiment 1: Comparison of Spatial Feature Engineering Strategies

The objective of our first experiment was to determine which of the spatial feature engineering strategies yielded the greatest improvement in AGB model performance. We developed a baseline RF AGB model that was trained using the spectral and topographic features. Then, we developed additional RF models by adding each of the spatial feature groups (buffer, GLCM, edge detector, morphological, NV, and NS) to the baseline model’s feature set. Model development consisted of two stages. First, Bayesian hyperparameter tuning was used to identify an optimal set of RF hyperparameters for each feature set. The models were trained using the training set and evaluated using the development set. Second, the performance of each feature set was assessed using 250 iterations of repeated hold-out validation. For each iteration, the training set and the development set were combined, and then 67% of observations were sampled using stratified sampling with the same 25 Mg ha⁻¹ bins used when developing the modeling dataset. Using these subsets, RF models were trained with the feature-group-specific hyperparameter configuration and evaluated using the testing set. A final AGB model was developed for each feature group by training an RF model using the entire training and development set. Feature importance scores were extracted from the final model.
The test-set error was summarized using R², root mean squared error (RMSE; $\sqrt{\tfrac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2}$), mean absolute error (MAE; $\tfrac{1}{N}\sum_{i=1}^{N}|y_i - \hat{y}_i|$), and mean error bias (MBE; $\tfrac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)$), where N is the number of elements being scored, and i indexes the reference AGB values ($y_i$) and the associated AGB predictions ($\hat{y}_i$). The error distributions produced by the models developed using spatial features were compared with the baseline model’s error distribution using independent samples t-tests, with an assumption of equal variance and a significance threshold of 0.001. The p-values computed for the t-tests used in experiments 1, 2, and 3 were pooled and adjusted for multiple comparisons using a Bonferroni post hoc correction [82]. Throughout the analysis, we report the adjusted p-values.
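The error metrics and the significance test can be sketched as follows, assuming `y_true` and `y_pred` are arrays of reference and predicted AGB and `rmse_a`/`rmse_b` are the RMSE distributions from the 250 hold-out iterations for two feature sets:

```python
import numpy as np
from scipy import stats

def regression_metrics(y_true, y_pred):
    """R^2, RMSE, MAE, and mean error bias (MBE), as defined above."""
    resid = y_true - y_pred
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        'R2': 1.0 - ss_res / ss_tot,
        'RMSE': np.sqrt(np.mean(resid ** 2)),
        'MAE': np.mean(np.abs(resid)),
        'MBE': np.mean(resid),
    }

def compare_rmse(rmse_a, rmse_b, n_tests, alpha=0.001):
    """Equal-variance independent samples t-test between two RMSE distributions,
    with a Bonferroni adjustment over the pooled family of n_tests comparisons."""
    t_stat, p_val = stats.ttest_ind(rmse_a, rmse_b, equal_var=True)
    p_adj = min(1.0, p_val * n_tests)
    return t_stat, p_adj, p_adj < alpha
```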

2.9. Experiment 2: Inter-Comparison of Feature Engineering Strategies

The objective of our second experiment was to compare models developed using spatial features with AGB models developed using temporal features. We developed three RF models to compare to the baseline RF model (the same as in experiment 1): (1) the baseline + spatial features, (2) the baseline + temporal features, and (3) the baseline + all temporal and spatial features. Here, we included all spatial features except for the NV group. The NV group was excluded because it does not represent a conventional feature engineering strategy and would introduce a very large number of additional, highly correlated variables. As in experiment 1, model development for each feature set consisted of tuning the RF hyperparameters via Bayesian optimization and then performing an accuracy assessment using repeated hold-out validation with 250 stratified subsets (67% of observations) of the combined training and development sets. The models were scored using R², RMSE, MAE, and MBE, and the error distributions were compared to the baseline model using independent samples t-tests with an assumption of equal variance and a significance level of 0.001. Final AGB models were developed for each feature group by training an RF model on the entire training and development set. These models were used to develop the predicted vs. observed plots and to spatially apply the RF models.

2.10. Experiment 3: Assessment of the Bayesian Optimization

Lastly, we quantified the relative improvement resulting from optimizing the hyperparameters of the RF model. To quantify the improvement, we developed RF models for all feature sets assessed in experiments 1 and 2 using the default RF hyperparameters. The default RF parameters were as follows: max_features was set to p, where p is the number of features in the dataset [83]; max_depth was left unspecified (i.e., each tree is grown until its terminal nodes are pure); min_samples_split was set to 1; and min_samples_leaf was set to 2. Each of the RF models was then evaluated using repeated hold-out validation with 250 stratified subsamples (67% of observations) of the combined training and development sets. Then, we compared the RMSE and R² error distributions produced by the default and optimized RF models for each feature group using independent samples t-tests with an assumption of equal variances and a significance level of 0.001.

3. Results

3.1. Experiment 1: Spatial Feature Engineering Comparison

Compared with the baseline model, the GLCM features and buffer features produced the greatest improvement in AGB model performance of all the spatial feature engineering methodologies that were evaluated (Figure 4; Table 2). A comparison of the baseline model and the GLCM model error distributions indicated that the inclusion of GLCM features reduced the RMSE by 9.14 to 9.20 Mg ha⁻¹ (95% confidence interval (CI); t = 642.89; p-adj ≤ 0.0001) and improved the R² score by 4.31 to 4.34% (95% CI; t = −604.06; p-adj ≤ 0.0001). Buffer features improved the RMSE by 8.49 to 8.55 Mg ha⁻¹ (95% CI; t = 634.18; p-adj ≤ 0.0001) and the R² by 4.01% to 4.04% (95% CI; t = −587.48; p-adj ≤ 0.0001). Morphological features and NV features yielded worse performance than GLCM and buffer features. The NS features were the worst performing spatial feature engineering technique. Edge detector features nominally outperformed the NS features.
Ranking the features included in the baseline, GLCM, and buffer models by their permuted RF feature importance scores indicated that similar features were consistently ranked highly in each model (Table 3). For all models, the topographic features were consistently the highest ranked. The most important GLCM-derived features were the TCW-derived GLCM features, with the GLCM variance and sum average (which represents the mean of the values in the GLCM distribution) ranked highest. The most important buffer features were the standard deviation and average of TCW. Thus, the highest ranked spatial features in the GLCM and buffer models both express the center and spread of the distribution of values in the neighborhood. The GLCM and buffer features derived from TCG and TCB were consistently ranked lower than the TCW-derived features.

3.2. Experiment 2: Inter-Comparison of Spatial and Temporal Features

As expected, both spatial and temporal features reduced the RMSE of the RF AGB models (Figure 5; Table 4). The addition of spatial features to the baseline feature set reduced the RMSE of the AGB estimates by 9.91–9.97 Mg ha⁻¹ (95% CI; t = 768.25; p-adj ≤ 0.0001) and increased the R² by 4.62–4.68% (95% CI; t = −714.57; p-adj ≤ 0.0001). Pooling all of the spatial features (excluding the NV features), thus allowing for interactions among the different spatial feature groups, produced only a nominal improvement in AGB model performance relative to solely including the GLCM or buffer features. Adding temporal features derived from the LandTrendr and CCDC algorithms produced a greater improvement in RF model performance than incorporating the spatial features. The addition of temporal features reduced the model’s RMSE by 18.36–18.41 Mg ha⁻¹ (95% CI; t = 1415.1; p-adj ≤ 0.0001) and improved the R² by 8.32–8.34% (95% CI; t = −1283.6; p-adj ≤ 0.0001). Combining both spatial and temporal features yielded the greatest improvement in model accuracy, reducing the RMSE by 21.65–21.71 Mg ha⁻¹ (95% CI; t = 1754.8; p-adj ≤ 0.0001) and improving the R² by 9.67–9.69% (95% CI; t = −1588.9; p-adj ≤ 0.0001). Our results indicate that the inclusion of temporal features resulted in a significantly greater reduction in RMSE compared to the inclusion of spatial features.
Spatial residuals were visualized over four 15 km² subsets of the study region by differencing the reference lidar AGB maps and the RF model predictions (using predictors from the same year as the corresponding lidar acquisition) (Figure 6). The subsets were located in Western Oregon, Eastern Oregon, Central Idaho, and North Central Washington. The Western Oregon site contained wet Douglas-fir/western hemlock forests, with AGB densities exceeding 1000 Mg ha⁻¹. The Eastern Oregon site contained fire-adapted ponderosa pine and lodgepole pine forest stands. The Central Idaho site contained mixed spruce-fir forest stands. The North Central Washington site contained coniferous forests, with lodgepole pine occurring at lower elevations and Douglas-fir occurring at higher elevations.
The inclusion of temporal features reduced the overestimation of AGB in the Idaho and Washington sites. However, no feature engineering strategy fully alleviated spectral saturation. Notably, AGB was systematically underestimated in the Coast Range, which possessed the largest AGB densities in the study domain (Figure 7). The inclusion of spatial features appeared to have little impact on the residual structure in the Coast Range site or in the Eastern Oregon site (Figure 8). The inclusion of spatial features exacerbated the overestimation of AGB densities in the Idaho site (Figure 9) and the North Central Washington site (Figure 10). When both temporal and spatial features were included in the AGB models, the visual appearance of the residuals in the Oregon and Washington sites did not substantially change compared to including just the temporal features. However, the addition of the spatial features to the temporal features appeared to increase the degree of overestimation in the Idaho site.

3.3. Experiment 3: Impact of Hyperparameter Optimization

Bayesian optimization yielded significant improvements in model performance for all feature groups (Table 5). For the models developed in the first experiment, Bayesian optimization reduced the AGB RMSE by 0.66 Mg ha⁻¹ to 5.25 Mg ha⁻¹. Despite the reduction in RMSE, there were only marginal, though significant, improvements in the R² of the RF models. For the RF models developed in the second analysis, the RMSE was reduced by 2 to 3.4 Mg ha⁻¹. Similarly small but significant improvements in R² were observed.

4. Discussion

4.1. Effectiveness of Spatial Features for AGB Modeling

Our comparison of different spatial feature engineering strategies found that GLCM features outperformed the other spatial feature engineering techniques. This confirms the findings of previous analyses that indicated GLCM features can improve AGB estimation [20,22,24,84]. Our analysis also found that buffer features exhibited performance similar to GLCM features. Based on the permuted feature importance scores, the most important GLCM features similarly characterize the center and spread of the values in the co-occurrence matrices. Although the inclusion of these features did improve model RMSE, this came at the cost of increased bias. In experiment 1, integrating GLCM metrics increased bias relative to the baseline model by 2.2 Mg ha⁻¹, and incorporating buffer features increased bias by 2.1 Mg ha⁻¹. This bias does not appear to be evenly distributed in space. The bias of the models developed using spatial features increased by 1 Mg ha⁻¹ relative to the baseline in experiment 2. When the models developed in experiment 2 were applied to spatial subsets with lower AGB densities in Eastern Oregon (Figure 8) and Central Idaho (Figure 9), there was an increase in the underestimation of AGB densities. It is possible that the increase in bias is the result of regional overfitting due to the greater sampling density in high AGB areas of the study domain (i.e., the Coast Range and the Cascades). Incorporating temporal features in experiment 2 produced a 2.22 Mg ha⁻¹ increase in bias relative to the baseline model, giving credence to the idea that imbalanced sampling across different ecoregions might be the cause. To our knowledge, this is the first analysis to evaluate edge detectors and morphological features in the context of AGB modeling with Landsat data. While these features did significantly improve AGB model performance relative to the baseline model, the overall improvement was less than that of the GLCM or buffer features.
This analysis proposed two feature engineering strategies: neighborhood vectorization (NV) and neighborhood similarity (NS) features. Neighborhood vectorization was inspired by the success of deep learning techniques, which are capable of learning hierarchical spatial and space-time features from unstructured (i.e., raw image bands) satellite imagery [85,86,87]. By taking all pixels in a spatial window and vectorizing them, we gave the Random Forest model access to similar information. Random Forest, unlike conventional Convolutional Neural Networks (CNNs) [88,89] or Vision Transformers (ViTs) [90], operates by recursively partitioning the input feature space. This enables it to learn complex interactions between high-dimensional features, but the algorithm does not learn hierarchical features in the same way that deep learning models do. The NS metrics were motivated by the use of buffer features in Hopkins et al. [56]. Given that buffer features are simple means and standard deviations of spectral values, computing similar metrics over subsets of a spatial window ranked by their similarity to the centroid seemed like a way to gain additional information. Both feature engineering strategies significantly improved AGB model performance relative to the baseline AGB model. However, neither was as effective as the GLCM or buffer features, and the NS metrics were the worst performing of all of the metrics evaluated. The limited success of the NV method indicates that decision tree models like RF are not suited to extracting features from raw values in the same manner as a CNN or ViT.
A key finding of our analysis is that temporal features derived from the LandTrendr and CCDC algorithms explain a significantly larger amount of the variation in AGB density than spatial features. This is because present forest AGB densities are often a function of the time elapsed since previous disturbances [18,19]. Statistical features that characterize disturbance and recovery appeared to improve the spatial structure of the residuals across all regions of the analysis. This was not the case for spatial features, which appeared to produce underestimates of AGB in regions with lower AGB densities. Additionally, we observed that the degree of underestimation appeared to correspond with topography. Zhao et al. [22] found that topographic effects confounded AGB models developed with GLCM texture features. While this was not explicitly evaluated, the structure of the spatial residuals (Figure 9 and Figure 10) does appear to correspond to changes in slope aspect and to the presence of ridgelines. This suggests that the inclusion of spatial features may not be appropriate unless a spatial stratification procedure is employed during model development (e.g., Zhao et al. [22]) or topographic corrections are applied to the Landsat surface reflectance values. Our study indicates that, while the inclusion of spatial features may reduce the variance of AGB estimates, these features do not meaningfully alleviate the problem of spectral saturation.

4.2. Optimization of Black Box Algorithms

Our results indicate that optimization of the RF hyperparameters significantly improves AGB model performance. Our comparison was motivated by conversations with colleagues who were surprised that performing hyperparameter tuning with algorithms like Random Forest may offer some benefit. Indeed, many previous studies that utilized Landsat multispectral satellite imagery and the Random Forest algorithm to map AGB did not tune the RF algorithm’s hyperparameters [6,18,19,25,28,38,39]. Tuning machine learning algorithms like RF eliminates a potential source of unwanted variation in statistical comparisons between different model types or in comparisons of feature groups. However, the key takeaway from our results is not that a specific type of optimization should be used (i.e., Bayesian optimization) but rather that some form of hyperparameter tuning should be incorporated into the model development procedure when using machine learning algorithms. There are numerous libraries that enable sequential or parallelized hyperparameter search. In Python, Optuna [91] and Tune [92] are flexible, high-performance hyperparameter tuning libraries. We note that development support for SciKit-Optimize, used in this analysis, has ended.

4.3. Analysis Limitations

A significant caveat to our analysis is that we utilized a modeled AGB product as our reference dataset instead of inventory plot data. However, the reference data product demonstrated a clear monotonic relationship between the observed and predicted AGB densities and lacked significant bias (1.9 Mg ha⁻¹; 0.8% of the mean) [6]. Consequently, we believe the relationships between AGB density and the different feature engineering methodologies identified in this analysis are robust. It is important to note that our ultimate objective was not to obtain area-based estimates of AGB but rather to evaluate different strategies for AGB estimation. As discussed by Hudak et al. [6], Landsat AGB models developed using the lidar AGB maps should ideally be calibrated using an independent forest inventory dataset (e.g., the United States Forest Service’s Forest Inventory and Analysis program’s Phase II forest inventory plot database). Alternatively, area-based estimates of AGB can be developed using a hierarchical modeling approach (e.g., Saarela et al. [93]), which can explicitly carry the error from the initial model that linked plot-level AGB to lidar height features into the development of the Landsat-based model.

4.4. Future Research

We utilized a single, region-wide RF model to derive AGB estimates. Such an approach may have limited model performance. With a region-wide model, RF may split a node containing examples from both Western Oregon and Eastern Montana. Jointly partitioning the variance in these two biophysically distinct regions likely forces the model to identify a split that is suboptimal for either region individually. Hudak et al. [6] developed AGB maps using the same lidar AGB maps and utilized a stratified random sampling scheme. Stratified sampling ensures that all biomass densities are well represented within the modeling dataset. However, it does not ensure that each ecoregion is equally well represented. Modeling frameworks that allow response-covariate relationships to vary at the subregion level, for example, the spatio-temporal ecological model (STEM) framework [94,95] or frameworks that allow for spatially varying coefficients [96,97], may improve AGB model performance.

5. Conclusions

Improving our AGB monitoring capabilities is imperative for planning and prioritizing forest management, clarifying our scientific understanding of the carbon cycle, and addressing climate change. We investigated spatial feature engineering techniques to determine which methodologies yielded the greatest improvement in AGB models derived from multispectral satellite imagery. We found that GLCM texture features produced the greatest improvement in AGB model performance among the spatial techniques, and that simple buffer features, developed using circular kernels, provided a similar level of improvement. The other techniques evaluated, such as edge detectors, morphological features, the neighborhood similarity features proposed in this analysis, and spatial vectorization, failed to produce comparable improvements in AGB model accuracy. More significantly, our results confirmed the value of time series features characterizing past forest disturbance events: temporal features yielded a greater improvement in model performance than spatial features. Lastly, we highlighted the importance of tuning the hyperparameters associated with machine learning models when constructing AGB models that compare different image processing methodologies.

Author Contributions

Conceptualization, J.B.K. and R.E.K.; methodology, J.B.K.; formal analysis, J.B.K.; writing—original draft preparation, J.B.K.; writing—review and editing, J.B.K. and R.E.K.; visualization, J.B.K.; project administration, R.E.K.; supervision, R.E.K.; funding acquisition, R.E.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by a United States Geological Survey research grant (Award G21AC10227).

Data Availability Statement

The modeling dataset used in this analysis is available upon request.

Acknowledgments

We would like to thank Jamon Van Den Hoek, Garrett W. Meigs, Julia Jones, and Mark E. Harmon for their feedback and suggestions on earlier drafts of this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AGB: Aboveground biomass
BUFF: Buffer features
CCDC: Continuous Change Detection and Classification
DEM: Digital elevation model
C2SR: Landsat Collection 2 Tier 1 Level 2 Surface Reflectance
CI: Confidence interval
EVI: Enhanced Vegetation Index
GLCM: Gray Level Co-Occurrence Matrix
MAE: Mean absolute error
MSE: Mean squared error
NDMI: Normalized Difference Moisture Index
NBR: Normalized Burn Ratio
NS: Neighborhood similarity
NV: Neighborhood vectorization
RF: Random Forest
RMSE: Root mean squared error
SV: Spatial vectorization
TCA: Tasseled Cap Angle
TCB: Tasseled Cap Brightness
TCG: Tasseled Cap Greenness
TCW: Tasseled Cap Wetness
USGS: United States Geological Survey

Appendix A. Description of Modeling Features

Table A1. A description of the LandTrendr-fitted spectral features and topographic features that were included in the “baseline” models during our experiments.
Variable Name | Description | Source
SPEC_B1 | Blue reflectance captured by Band 1 on the TM and ETM+ sensors and by Band 2 on the OLI-1 and OLI-2 sensors | N/A
SPEC_B2 | Green reflectance captured by Band 2 on the TM and ETM+ sensors and by Band 3 on the OLI-1 and OLI-2 sensors | N/A
SPEC_B3 | Red reflectance captured by Band 3 on the TM and ETM+ sensors and by Band 4 on the OLI-1 and OLI-2 sensors | N/A
SPEC_B4 | Near-infrared reflectance captured by Band 4 on the TM and ETM+ sensors and by Band 5 on the OLI-1 and OLI-2 sensors | N/A
SPEC_B5 | Shortwave-infrared reflectance captured by Band 5 on the TM and ETM+ sensors and by Band 6 on the OLI-1 and OLI-2 sensors | N/A
SPEC_B7 | Shortwave-infrared reflectance captured by Band 7 on the TM and ETM+ sensors and by Band 7 on the OLI-1 and OLI-2 sensors | N/A
SPEC_EVI | The Enhanced Vegetation Index is a modification of the NDVI formula to improve the linearity of its relationship with biophysical parameters | [62]
SPEC_NBR | The Normalized Burn Ratio was developed to quantify fire-related landscape change | [58]
SPEC_NDMI | The Normalized Difference Moisture Index was developed to monitor harvesting events in regions with partial-disturbance events. | [61]
SPEC_NDVI | A vegetation index which exploits the “red-edge”. | [60]
SPEC_TCA | Relates to the proportion of vegetated and non-vegetated land within a given pixel. | [38]
SPEC_TCB | The first component from a principal components ordination of the Landsat bands performed over agricultural fields. Corresponds to soil brightness. | [59]
SPEC_TCG | The second component from a principal components ordination of the Landsat bands performed over agricultural fields. Corresponds to vegetation density. | [59]
SPEC_TCW | The third component from a principal components ordination of the Landsat bands performed over agricultural fields. Corresponds to moisture content. | [59]
TOPO_ELEVATION | Elevation; derived from the National Elevation Dataset DEM. | [57]
TOPO_HILLSHADE | Hillshade; derived from the National Elevation Dataset DEM. | [57]
TOPO_SLOPE | Slope; derived from the National Elevation Dataset DEM. | [57]
TOPO_ASPECT_COS | The cosine embedding of aspect derived from the National Elevation Dataset DEM. | [57]
TOPO_ASPECT_SIN | The sine embedding of aspect derived from the National Elevation Dataset DEM. | [57]
Table A2. A description of the “buffered” statistical features that were used in the numerical experiments.
Variable Name | Description | Source
BUFF_TCB_mean_3 | The mean TCB within a 3 × 3 circular kernel. | N/A
BUFF_TCB_stdDev_3 | The standard deviation of TCB within a 3 × 3 circular kernel. | N/A
BUFF_TCG_mean_3 | The mean TCG within a 3 × 3 circular kernel. | N/A
BUFF_TCG_stdDev_3 | The standard deviation of TCG within a 3 × 3 circular kernel. | N/A
BUFF_TCW_mean_3 | The mean TCW within a 3 × 3 circular kernel. | N/A
BUFF_TCW_stdDev_3 | The standard deviation of TCW within a 3 × 3 circular kernel. | N/A
BUFF_TCB_mean_7 | The mean TCB within a 7 × 7 circular kernel. | N/A
BUFF_TCB_stdDev_7 | The standard deviation of TCB within a 7 × 7 circular kernel. | N/A
BUFF_TCG_mean_7 | The mean TCG within a 7 × 7 circular kernel. | N/A
BUFF_TCG_stdDev_7 | The standard deviation of TCG within a 7 × 7 circular kernel. | N/A
BUFF_TCW_mean_7 | The mean TCW within a 7 × 7 circular kernel. | N/A
BUFF_TCW_stdDev_7 | The standard deviation of TCW within a 7 × 7 circular kernel. | N/A
BUFF_TCB_mean_11 | The mean TCB within an 11 × 11 circular kernel. | N/A
BUFF_TCB_stdDev_11 | The standard deviation of TCB within an 11 × 11 circular kernel. | N/A
BUFF_TCG_mean_11 | The mean TCG within an 11 × 11 circular kernel. | N/A
BUFF_TCG_stdDev_11 | The standard deviation of TCG within an 11 × 11 circular kernel. | N/A
BUFF_TCW_mean_11 | The mean TCW within an 11 × 11 circular kernel. | N/A
BUFF_TCW_stdDev_11 | The standard deviation of TCW within an 11 × 11 circular kernel. | N/A
BUFF_TCB_mean_17 | The mean TCB within a 17 × 17 circular kernel. | N/A
BUFF_TCB_stdDev_17 | The standard deviation of TCB within a 17 × 17 circular kernel. | N/A
BUFF_TCG_mean_17 | The mean TCG within a 17 × 17 circular kernel. | N/A
BUFF_TCG_stdDev_17 | The standard deviation of TCG within a 17 × 17 circular kernel. | N/A
BUFF_TCW_mean_17 | The mean TCW within a 17 × 17 circular kernel. | N/A
BUFF_TCW_stdDev_17 | The standard deviation of TCW within a 17 × 17 circular kernel. | N/A
Table A3. A description of the Gray Level Co-Occurrence Matrix (GLCM) features used in the numerical experiments.
Variable Name | Description | Source
GLCM_TCB_asm_3 | The angular second moment, a GLCM metric, derived from TCB using a 3 × 3 input window. | [23]
GLCM_TCG_asm_3 | The angular second moment, a GLCM metric, derived from TCG using a 3 × 3 input window. | [23]
GLCM_TCW_asm_3 | The angular second moment, a GLCM metric, derived from TCW using a 3 × 3 input window. | [23]
GLCM_TCB_contrast_3 | The contrast, a GLCM metric, derived from TCB using a 3 × 3 input window. | [23]
GLCM_TCG_contrast_3 | The contrast, a GLCM metric, derived from TCG using a 3 × 3 input window. | [23]
GLCM_TCW_contrast_3 | The contrast, a GLCM metric, derived from TCW using a 3 × 3 input window. | [23]
GLCM_TCB_corr_3 | The correlation, a GLCM metric, derived from TCB using a 3 × 3 input window. | [23]
GLCM_TCG_corr_3 | The correlation, a GLCM metric, derived from TCG using a 3 × 3 input window. | [23]
GLCM_TCW_corr_3 | The correlation, a GLCM metric, derived from TCW using a 3 × 3 input window. | [23]
GLCM_TCB_var_3 | The variance, a GLCM metric, derived from TCB using a 3 × 3 input window. | [23]
GLCM_TCG_var_3 | The variance, a GLCM metric, derived from TCG using a 3 × 3 input window. | [23]
GLCM_TCW_var_3 | The variance, a GLCM metric, derived from TCW using a 3 × 3 input window. | [23]
GLCM_TCB_idm_3 | The inverse difference moment, a GLCM metric, derived from TCB using a 3 × 3 input window. | [23]
GLCM_TCG_idm_3 | The inverse difference moment, a GLCM metric, derived from TCG using a 3 × 3 input window. | [23]
GLCM_TCW_idm_3 | The inverse difference moment, a GLCM metric, derived from TCW using a 3 × 3 input window. | [23]
GLCM_TCB_savg_3 | The sum average, a GLCM metric, derived from TCB using a 3 × 3 input window. | [23]
GLCM_TCG_savg_3 | The sum average, a GLCM metric, derived from TCG using a 3 × 3 input window. | [23]
GLCM_TCW_savg_3 | The sum average, a GLCM metric, derived from TCW using a 3 × 3 input window. | [23]
GLCM_TCB_ent_3 | The entropy, a GLCM metric, derived from TCB using a 3 × 3 input window. | [23]
GLCM_TCG_ent_3 | The entropy, a GLCM metric, derived from TCG using a 3 × 3 input window. | [23]
GLCM_TCW_ent_3 | The entropy, a GLCM metric, derived from TCW using a 3 × 3 input window. | [23]
GLCM_TCB_inertia_3 | The inertia, a GLCM metric, derived from TCB using a 3 × 3 input window. | [98]
GLCM_TCG_inertia_3 | The inertia, a GLCM metric, derived from TCG using a 3 × 3 input window. | [98]
GLCM_TCW_inertia_3 | The inertia, a GLCM metric, derived from TCW using a 3 × 3 input window. | [98]
GLCM_TCB_asm_7 | The angular second moment, a GLCM metric, derived from TCB using a 7 × 7 input window. | [23]
GLCM_TCG_asm_7 | The angular second moment, a GLCM metric, derived from TCG using a 7 × 7 input window. | [23]
GLCM_TCW_asm_7 | The angular second moment, a GLCM metric, derived from TCW using a 7 × 7 input window. | [23]
GLCM_TCB_contrast_7 | The contrast, a GLCM metric, derived from TCB using a 7 × 7 input window. | [23]
GLCM_TCG_contrast_7 | The contrast, a GLCM metric, derived from TCG using a 7 × 7 input window. | [23]
GLCM_TCW_contrast_7 | The contrast, a GLCM metric, derived from TCW using a 7 × 7 input window. | [23]
GLCM_TCB_corr_7 | The correlation, a GLCM metric, derived from TCB using a 7 × 7 input window. | [23]
GLCM_TCG_corr_7 | The correlation, a GLCM metric, derived from TCG using a 7 × 7 input window. | [23]
GLCM_TCW_corr_7 | The correlation, a GLCM metric, derived from TCW using a 7 × 7 input window. | [23]
GLCM_TCB_var_7 | The variance, a GLCM metric, derived from TCB using a 7 × 7 input window. | [23]
GLCM_TCG_var_7 | The variance, a GLCM metric, derived from TCG using a 7 × 7 input window. | [23]
GLCM_TCW_var_7 | The variance, a GLCM metric, derived from TCW using a 7 × 7 input window. | [23]
GLCM_TCB_idm_7 | The inverse difference moment, a GLCM metric, derived from TCB using a 7 × 7 input window. | [23]
GLCM_TCG_idm_7 | The inverse difference moment, a GLCM metric, derived from TCG using a 7 × 7 input window. | [23]
GLCM_TCW_idm_7 | The inverse difference moment, a GLCM metric, derived from TCW using a 7 × 7 input window. | [23]
GLCM_TCB_savg_7 | The sum average, a GLCM metric, derived from TCB using a 7 × 7 input window. | [23]
GLCM_TCG_savg_7 | The sum average, a GLCM metric, derived from TCG using a 7 × 7 input window. | [23]
GLCM_TCW_savg_7 | The sum average, a GLCM metric, derived from TCW using a 7 × 7 input window. | [23]
GLCM_TCB_ent_7 | The entropy, a GLCM metric, derived from TCB using a 7 × 7 input window. | [23]
GLCM_TCG_ent_7 | The entropy, a GLCM metric, derived from TCG using a 7 × 7 input window. | [23]
GLCM_TCW_ent_7 | The entropy, a GLCM metric, derived from TCW using a 7 × 7 input window. | [23]
GLCM_TCB_inertia_7The inertia, a GLCM metric, derived from TCB using a 7 × 7 input window.[98]
GLCM_TCG_inertia_7The inertia, a GLCM metric, derived from TCG using a 7 × 7 input window.[98]
GLCM_TCW_inertia_7The inertia, a GLCM metric, derived from TCW using a 7 × 7 input window.[98]
GLCM_TCB_asm_11The angular second moment, a GLCM metric, derived from TCB using an 11 × 11 input window.[23]
GLCM_TCG_asm_11The angular second moment, a GLCM metric, derived from TCG using an 11 × 11 input window.[23]
GLCM_TCW_asm_11The angular second moment, a GLCM metric, derived from TCW using an 11 × 11 input window.[23]
GLCM_TCB_contrast_11The contrast, a GLCM metric, derived from TCB using an 11 × 11 input window.[23]
GLCM_TCG_contrast_11The contrast, a GLCM metric, derived from TCG using an 11 × 11 input window.[23]
GLCM_TCW_contrast_11The contrast, a GLCM metric, derived from TCW using an 11 × 11 input window.[23]
GLCM_TCB_corr_11The correlation, a GLCM metric, derived from TCB using an 11 × 11 input window.[23]
GLCM_TCG_corr_11The correlation, a GLCM metric, derived from TCG using an 11 × 11 input window.[23]
GLCM_TCW_corr_11The correlation, a GLCM metric, derived from TCW using an 11 × 11 input window.[23]
GLCM_TCB_var_11The variance, a GLCM metric, derived from TCB using an 11 × 11 input window.[23]
GLCM_TCG_var_11The variance, a GLCM metric, derived from TCG using an 11 × 11 input window.[23]
GLCM_TCW_var_11The variance, a GLCM metric, derived from TCW using an 11 × 11 input window.[23]
GLCM_TCB_idm_11The inverse difference moment, a GLCM metric, derived from TCB using an 11 × 11 input window.[23]
GLCM_TCG_idm_11The inverse difference moment, a GLCM metric, derived from TCG using an 11 × 11 input window.[23]
GLCM_TCW_idm_11The inverse difference moment, a GLCM metric, derived from TCW using an 11 × 11 input window.[23]
GLCM_TCB_savg_11The sum average, a GLCM metric, derived from TCB using an 11 × 11 input window.[23]
GLCM_TCG_savg_11The sum average, a GLCM metric, derived from TCG using an 11 × 11 input window.[23]
GLCM_TCW_savg_11The sum average, a GLCM metric, derived from TCW using an 11 × 11 input window.[23]
GLCM_TCB_ent_11The entropy, a GLCM metric, derived from TCB using an 11 × 11 input window.[23]
GLCM_TCG_ent_11The entropy, a GLCM metric, derived from TCG using an 11 × 11 input window.[23]
GLCM_TCW_ent_11The entropy, a GLCM metric, derived from TCW using an 11 × 11 input window.[23]
GLCM_TCB_inertia_11The inertia, a GLCM metric, derived from TCB using an 11 × 11 input window.[98]
GLCM_TCG_inertia_11The inertia, a GLCM metric, derived from TCG using an 11 × 11 input window.[98]
GLCM_TCW_inertia_11The inertia, a GLCM metric, derived from TCW using an 11 × 11 input window.[98]
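For readers who want to reproduce the GLCM statistics offline, the sketch below uses scikit-image (version 0.19 or later for the graycomatrix spelling). The quantization to 32 gray levels, the two co-occurrence angles, and the direct entropy calculation are illustrative assumptions; the study computed these features in its own pipeline.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(window, levels=32):
    """GLCM statistics for a single quantized image patch (e.g., 7 x 7 of TCB)."""
    lo, hi = float(window.min()), float(window.max())
    if hi > lo:
        scaled = ((window - lo) / (hi - lo) * (levels - 1)).astype(np.uint8)
    else:
        scaled = np.zeros_like(window, dtype=np.uint8)
    glcm = graycomatrix(scaled, distances=[1], angles=[0, np.pi / 2],
                        levels=levels, symmetric=True, normed=True)
    p = glcm.mean(axis=(2, 3))                     # average over distances and angles
    entropy = -np.sum(p * np.log2(p + 1e-12))      # graycoprops does not expose entropy
    return {
        "asm": graycoprops(glcm, "ASM").mean(),
        "contrast": graycoprops(glcm, "contrast").mean(),   # equal to inertia in Haralick's formulation
        "corr": graycoprops(glcm, "correlation").mean(),
        "idm": graycoprops(glcm, "homogeneity").mean(),     # inverse difference moment
        "ent": entropy,
    }

patch = np.random.rand(7, 7)   # stand-in for one Tasseled Cap window
print(glcm_features(patch))
```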
Table A4. A description of the edge detector statistical features that were used in the numerical experiments.
Variable Name | Description | Source
EDGE_canny_TCB_lowTCB-derived Canny edge detector features with a low threshold.[70]
EDGE_canny_TCG_lowTCG-derived Canny edge detector features with a low threshold.[70]
EDGE_canny_TCW_lowTCW-derived Canny edge detector features with a low threshold.[70]
EDGE_canny_TCB_mediumTCB-derived Canny edge detector features with a medium threshold.[70]
EDGE_canny_TCG_mediumTCG-derived Canny edge detector features with a medium threshold.[70]
EDGE_canny_TCW_mediumTCW-derived Canny edge detector features with a medium threshold.[70]
EDGE_canny_TCB_highTCB-derived Canny edge detector features with a high threshold.[70]
EDGE_canny_TCG_highTCG-derived Canny edge detector features with a high threshold.[70]
EDGE_canny_TCW_highTCW-derived Canny edge detector features with a high threshold.[70]
EDGE_laplacian_TCB_4TCB convolved using a 3 × 3, 4-connected Laplacian kernel.N/A
EDGE_laplacian_TCG_4TCG convolved using a 3 × 3, 4-connected Laplacian kernel.N/A
EDGE_laplacian_TCW_4TCW convolved using a 3 × 3, 4-connected Laplacian kernel.N/A
EDGE_laplacian_TCB_8TCB convolved using a 3 × 3, 8-connected Laplacian kernel.N/A
EDGE_laplacian_TCG_8TCG convolved using a 3 × 3, 8-connected Laplacian kernel.N/A
EDGE_laplacian_TCW_8TCW convolved using a 3 × 3, 8-connected Laplacian kernel.N/A
EDGE_kirsch_TCB_0degTCB convolved using a Kirsch kernel.[69]
EDGE_kirsch_TCG_0degTCG convolved using a Kirsch kernel.[69]
EDGE_kirsch_TCW_0degTCW convolved using a Kirsch kernel.[69]
EDGE_kirsch_TCB_90degTCB convolved using a Kirsch kernel rotated 90 degrees.[69]
EDGE_kirsch_TCG_90degTCG convolved using a Kirsch kernel rotated 90 degrees.[69]
EDGE_kirsch_TCW_90degTCW convolved using a Kirsch kernel rotated 90 degrees.[69]
EDGE_prewitt_TCB_0degTCB convolved using a Prewitt kernel.[68]
EDGE_prewitt_TCG_0degTCG convolved using a Prewitt kernel.[68]
EDGE_prewitt_TCW_0degTCW convolved using a Prewitt kernel.[68]
EDGE_prewitt_TCB_90degTCB convolved using a Prewitt kernel rotated 90 degrees.[68]
EDGE_prewitt_TCG_90degTCG convolved using a Prewitt kernel rotated 90 degrees.[68]
EDGE_prewitt_TCW_90degTCW convolved using a Prewitt kernel rotated 90 degrees.[68]
EDGE_roberts_TCB_0degTCB convolved using a Roberts kernel.[66]
EDGE_roberts_TCG_0degTCG convolved using a Roberts kernel.[66]
EDGE_roberts_TCW_0degTCW convolved using a Roberts kernel.[66]
EDGE_roberts_TCB_90degTCB convolved using a Roberts kernel rotated 90 degrees.[66]
EDGE_roberts_TCG_90degTCG convolved using a Roberts kernel rotated 90 degrees.[66]
EDGE_roberts_TCW_90degTCW convolved using a Roberts kernel rotated 90 degrees.[66]
EDGE_sobel_TCB_0degTCB convolved using a Sobel kernel.[67]
EDGE_sobel_TCG_0degTCG convolved using a Sobel kernel.[67]
EDGE_sobel_TCW_0degTCW convolved using a Sobel kernel.[67]
EDGE_sobel_TCB_90degTCB convolved using a Sobel kernel rotated 90 degrees.[67]
EDGE_sobel_TCG_90degTCG convolved using a Sobel kernel rotated 90 degrees.[67]
EDGE_sobel_TCW_90degTCW convolved using a Sobel kernel rotated 90 degrees.[67]
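The edge-detector features in Table A4 are convolutions with standard kernels plus the Canny detector. The sketch below uses SciPy and scikit-image with textbook kernel definitions; the Canny thresholds shown are placeholders, not the low/medium/high thresholds used in the study.

```python
import numpy as np
from scipy.ndimage import convolve
from skimage.feature import canny

# Textbook 3 x 3 kernels; the 90-degree variants in Table A4 are the transposes.
SOBEL_0 = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
PREWITT_0 = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], dtype=float)
LAPLACIAN_4 = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)
LAPLACIAN_8 = np.array([[1, 1, 1], [1, -8, 1], [1, 1, 1]], dtype=float)

def edge_features(band):
    """Edge-detector responses for one band (illustrative thresholds)."""
    return {
        "sobel_0deg": convolve(band, SOBEL_0, mode="nearest"),
        "sobel_90deg": convolve(band, SOBEL_0.T, mode="nearest"),
        "prewitt_0deg": convolve(band, PREWITT_0, mode="nearest"),
        "prewitt_90deg": convolve(band, PREWITT_0.T, mode="nearest"),
        "laplacian_4": convolve(band, LAPLACIAN_4, mode="nearest"),
        "laplacian_8": convolve(band, LAPLACIAN_8, mode="nearest"),
        "canny_low": canny(band, sigma=1.0, low_threshold=0.05, high_threshold=0.10),
    }

tcg = np.random.rand(100, 100)
edge_layers = edge_features(tcg)
```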
Table A5. A description of the morphological statistical features that were used in the numerical experiments.
Variable Name | Description | Source
MORPH_TCB_dil_3_emdThe dilation of TCB computed using a 3 × 3 window using Earth mover’s distance.[72]
MORPH_TCB_dil_3_samThe dilation of TCB computed using a 3 × 3 window using spectral angle mapper distance.[72]
MORPH_TCB_dil_3_sedThe dilation of TCB computed using a 3 × 3 window using squared Euclidean distance.[72]
MORPH_TCB_dil_3_sidThe dilation of TCB computed using a 3 × 3 window using spectral information divergence.[72]
MORPH_TCB_dil_5_emdThe dilation of TCB computed using a 5 × 5 window using Earth mover’s distance.[72]
MORPH_TCB_dil_5_samThe dilation of TCB computed using a 5 × 5 window using spectral angle mapper distance.[72]
MORPH_TCB_dil_5_sedThe dilation of TCB computed using a 5 × 5 window using squared Euclidean distance.[72]
MORPH_TCB_dil_5_sidThe dilation of TCB computed using a 5 × 5 window using spectral information divergence.[72]
MORPH_TCB_dil_7_emdThe dilation of TCB computed using a 7 × 7 window using Earth mover’s distance.[72]
MORPH_TCB_dil_7_samThe dilation of TCB computed using a 7 × 7 window using spectral angle mapper distance.[72]
MORPH_TCB_dil_7_sedThe dilation of TCB computed using a 7 × 7 window using squared Euclidean distance.[72]
MORPH_TCB_dil_7_sidThe dilation of TCB computed using a 7 × 7 window using spectral information divergence.[72]
MORPH_TCB_ero_3_emdThe erosion of TCB computed using a 3 × 3 window using Earth mover’s distance.[72]
MORPH_TCB_ero_3_samThe erosion of TCB computed using a 3 × 3 window using spectral angle mapper distance.[72]
MORPH_TCB_ero_3_sedThe erosion of TCB computed using a 3 × 3 window using squared Euclidean distance.[72]
MORPH_TCB_ero_3_sidThe erosion of TCB computed using a 3 × 3 window using spectral information divergence.[72]
MORPH_TCB_ero_5_emdThe erosion of TCB computed using a 5 × 5 window using Earth mover’s distance.[72]
MORPH_TCB_ero_5_samThe erosion of TCB computed using a 5 × 5 window using spectral angle mapper distance.[72]
MORPH_TCB_ero_5_sedThe erosion of TCB computed using a 5 × 5 window using squared Euclidean distance.[72]
MORPH_TCB_ero_5_sidThe erosion of TCB computed using a 5 × 5 window using spectral information divergence.[72]
MORPH_TCB_ero_7_emdThe erosion of TCB computed using a 7 × 7 window using Earth mover’s distance.[72]
MORPH_TCB_ero_7_samThe erosion of TCB computed using a 7 × 7 window using spectral angle mapper distance.[72]
MORPH_TCB_ero_7_sedThe erosion of TCB computed using a 7 × 7 window using squared Euclidean distance.[72]
MORPH_TCB_ero_7_sidThe erosion of TCB computed using a 7 × 7 window using spectral information divergence.[72]
MORPH_TCG_dil_3_emdThe dilation of TCG computed using a 3 × 3 window using Earth mover’s distance.[72]
MORPH_TCG_dil_3_samThe dilation of TCG computed using a 3 × 3 window using spectral angle mapper distance.[72]
MORPH_TCG_dil_3_sedThe dilation of TCG computed using a 3 × 3 window using squared Euclidean distance.[72]
MORPH_TCG_dil_3_sidThe dilation of TCG computed using a 3 × 3 window using spectral information divergence.[72]
MORPH_TCG_dil_5_emdThe dilation of TCG computed using a 5 × 5 window using Earth mover’s distance.[72]
MORPH_TCG_dil_5_samThe dilation of TCG computed using a 5 × 5 window using spectral angle mapper distance.[72]
MORPH_TCG_dil_5_sedThe dilation of TCG computed using a 5 × 5 window using squared Euclidean distance.[72]
MORPH_TCG_dil_5_sidThe dilation of TCG computed using a 5 × 5 window using spectral information divergence.[72]
MORPH_TCG_dil_7_emdThe dilation of TCG computed using a 7 × 7 window using Earth mover’s distance.[72]
MORPH_TCG_dil_7_samThe dilation of TCG computed using a 7 × 7 window using spectral angle mapper distance.[72]
MORPH_TCG_dil_7_sedThe dilation of TCG computed using a 7 × 7 window using squared Euclidean distance.[72]
MORPH_TCG_dil_7_sidThe dilation of TCG computed using a 7 × 7 window using spectral information divergence.[72]
MORPH_TCG_ero_3_emdThe erosion of TCG computed using a 3 × 3 window using Earth mover’s distance.[72]
MORPH_TCG_ero_3_samThe erosion of TCG computed using a 3 × 3 window using spectral angle mapper distance.[72]
MORPH_TCG_ero_3_sedThe erosion of TCG computed using a 3 × 3 window using squared Euclidean distance.[72]
MORPH_TCG_ero_3_sidThe erosion of TCG computed using a 3 × 3 window using spectral information divergence.[72]
MORPH_TCG_ero_5_emdThe erosion of TCG computed using a 5 × 5 window using Earth mover’s distance.[72]
MORPH_TCG_ero_5_samThe erosion of TCG computed using a 5 × 5 window using spectral angle mapper distance.[72]
MORPH_TCG_ero_5_sedThe erosion of TCG computed using a 5 × 5 window using squared Euclidean distance.[72]
MORPH_TCG_ero_5_sidThe erosion of TCG computed using a 5 × 5 window using spectral information divergence.[72]
MORPH_TCG_ero_7_emdThe erosion of TCG computed using a 7 × 7 window using Earth mover’s distance.[72]
MORPH_TCG_ero_7_samThe erosion of TCG computed using a 7 × 7 window using spectral angle mapper distance.[72]
MORPH_TCG_ero_7_sedThe erosion of TCG computed using a 7 × 7 window using squared Euclidean distance.[72]
MORPH_TCG_ero_7_sidThe erosion of TCG computed using a 7 × 7 window using spectral information divergence.[72]
MORPH_TCW_dil_3_emdThe dilation of TCW computed using a 3 × 3 window using Earth mover’s distance.[72]
MORPH_TCW_dil_3_samThe dilation of TCW computed using a 3 × 3 window using spectral angle mapper distance.[72]
MORPH_TCW_dil_3_sedThe dilation of TCW computed using a 3 × 3 window using squared Euclidean distance.[72]
MORPH_TCW_dil_3_sidThe dilation of TCW computed using a 3 × 3 window using spectral information divergence.[72]
MORPH_TCW_dil_5_emdThe dilation of TCW computed using a 5 × 5 window using Earth mover’s distance.[72]
MORPH_TCW_dil_5_samThe dilation of TCW computed using a 5 × 5 window using spectral angle mapper distance.[72]
MORPH_TCW_dil_5_sedThe dilation of TCW computed using a 5 × 5 window using squared Euclidean distance.[72]
MORPH_TCW_dil_5_sidThe dilation of TCW computed using a 5 × 5 window using spectral information divergence.[72]
MORPH_TCW_dil_7_emdThe dilation of TCW computed using a 7 × 7 window using Earth mover’s distance.[72]
MORPH_TCW_dil_7_samThe dilation of TCW computed using a 7 × 7 window using spectral angle mapper distance.[72]
MORPH_TCW_dil_7_sedThe dilation of TCW computed using a 7 × 7 window using squared Euclidean distance.[72]
MORPH_TCW_dil_7_sidThe dilation of TCW computed using a 7 × 7 window using spectral information divergence.[72]
MORPH_TCW_ero_3_emdThe erosion of TCW computed using a 3 × 3 window using Earth mover’s distance.[72]
MORPH_TCW_ero_3_samThe erosion of TCW computed using a 3 × 3 window using spectral angle mapper distance.[72]
MORPH_TCW_ero_3_sedThe erosion of TCW computed using a 3 × 3 window using squared Euclidean distance.[72]
MORPH_TCW_ero_3_sidThe erosion of TCW computed using a 3 × 3 window using spectral information divergence.[72]
MORPH_TCW_ero_5_emdThe erosion of TCW computed using a 5 × 5 window using Earth mover’s distance.[72]
MORPH_TCW_ero_5_samThe erosion of TCW computed using a 5 × 5 window using spectral angle mapper distance.[72]
MORPH_TCW_ero_5_sedThe erosion of TCW computed using a 5 × 5 window using squared Euclidean distance.[72]
MORPH_TCW_ero_5_sidThe erosion of TCW computed using a 5 × 5 window using spectral information divergence.[72]
MORPH_TCW_ero_7_emdThe erosion of TCW computed using a 7 × 7 window using Earth mover’s distance.[72]
MORPH_TCW_ero_7_samThe erosion of TCW computed using a 7 × 7 window using spectral angle mapper distance.[72]
MORPH_TCW_ero_7_sedThe erosion of TCW computed using a 7 × 7 window using squared Euclidean distance.[72]
MORPH_TCW_ero_7_sidThe erosion of TCW computed using a 7 × 7 window using spectral information divergence.[72]
MORPH_gradient_3_emdThe gradient computed using a 3 × 3 window using Earth mover’s distance.[72]
MORPH_gradient_3_samThe gradient computed using a 3 × 3 window using spectral angle mapper distance.[72]
MORPH_gradient_3_sedThe gradient computed using a 3 × 3 window using squared Euclidean distance.[72]
MORPH_gradient_3_sidThe gradient computed using a 3 × 3 window using spectral information divergence distance.[72]
MORPH_gradient_5_emdThe gradient computed using a 5 × 5 window using Earth mover’s distance.[72]
MORPH_gradient_5_samThe gradient computed using a 5 × 5 window using spectral angle mapper distance.[72]
MORPH_gradient_5_sedThe gradient computed using a 5 × 5 window using squared Euclidean distance.[72]
MORPH_gradient_5_sidThe gradient computed using a 5 × 5 window using spectral information divergence distance.[72]
MORPH_gradient_7_emdThe gradient computed using a 7 × 7 window using Earth mover’s distance.[72]
MORPH_gradient_7_samThe gradient computed using a 7 × 7 window using spectral angle mapper distance.[72]
MORPH_gradient_7_sedThe gradient computed using a 7 × 7 window using squared Euclidean distance.[72]
MORPH_gradient_7_sidThe gradient computed using a 7 × 7 window using spectral information divergence distance.[72]
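The operators in Table A5 follow a multi-band morphology in which window pixels are ordered by a spectral distance (EMD, SAM, SED, or SID) before the maximum or minimum is taken [72]. The SciPy sketch below shows only the plain single-band analogue, so the distance-based ordering is not reproduced; it is included to make the dilation/erosion/gradient relationship concrete.

```python
import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion

def morphology_features(band, size):
    """Single-band dilation, erosion, and morphological gradient (gradient = dilation - erosion)."""
    dil = grey_dilation(band, size=(size, size))
    ero = grey_erosion(band, size=(size, size))
    return {"dil": dil, "ero": ero, "gradient": dil - ero}

tcw = np.random.rand(100, 100)
morph_5 = morphology_features(tcw, 5)   # loosely analogous to the 5 x 5 window features
```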
Table A6. A description of the neighborhood similarity features that were used in the numerical experiments.
Variable Name | Description | Source
NS_TOP25_5_TCB_MEANThe mean of TCB of the top 25% most similar pixels to the centroid of a 5 × 5 window.N/A
NS_TOP25_5_TCG_MEANThe mean of TCG of the top 25% most similar pixels to the centroid of a 5 × 5 window.N/A
NS_TOP25_5_TCW_MEANThe mean of TCW of the top 25% most similar pixels to the centroid of a 5 × 5 window.N/A
NS_TOP25_5_TCB_STDDEVThe standard deviation of TCB of the top 25% most similar pixels to the centroid of a 5 × 5 window.N/A
NS_TOP25_5_TCG_STDDEVThe standard deviation of TCG of the top 25% most similar pixels to the centroid of a 5 × 5 window.N/A
NS_TOP25_5_TCW_STDDEVThe standard deviation of TCW of the top 25% most similar pixels to the centroid of a 5 × 5 window.N/A
NS_TOP25_7_TCB_MEANThe mean of TCB of the top 25% most similar pixels to the centroid of a 7 × 7 window.N/A
NS_TOP25_7_TCG_MEANThe mean of TCG of the top 25% most similar pixels to the centroid of a 7 × 7 window.N/A
NS_TOP25_7_TCW_MEANThe mean of TCW of the top 25% most similar pixels to the centroid of a 7 × 7 window.N/A
NS_TOP25_7_TCB_STDDEVThe standard deviation of TCB of the top 25% most similar pixels to the centroid of a 7 × 7 window.N/A
NS_TOP25_7_TCG_STDDEVThe standard deviation of TCG of the top 25% most similar pixels to the centroid of a 7 × 7 window.N/A
NS_TOP25_7_TCW_STDDEVThe standard deviation of TCW of the top 25% most similar pixels to the centroid of a 7 × 7 window.N/A
NS_TOP25_11_TCB_MEANThe mean of TCB of the top 25% most similar pixels to the centroid of an 11 × 11 window.N/A
NS_TOP25_11_TCG_MEANThe mean of TCG of the top 25% most similar pixels to the centroid of an 11 × 11 window.N/A
NS_TOP25_11_TCW_MEANThe mean of TCW of the top 25% most similar pixels to the centroid of an 11 × 11 window.N/A
NS_TOP25_11_TCB_STDDEVThe standard deviation of TCB of the top 25% most similar pixels to the centroid of an 11 × 11 window.N/A
NS_TOP25_11_TCG_STDDEVThe standard deviation of TCG of the top 25% most similar pixels to the centroid of an 11 × 11 window.N/A
NS_TOP25_11_TCW_STDDEVThe standard deviation of TCW of the top 25% most similar pixels to the centroid of an 11 × 11 window.N/A
NS_TOP50_5_TCB_MEANThe mean of TCB of the top 50% most similar pixels to the centroid of a 5 × 5 window.N/A
NS_TOP50_5_TCG_MEANThe mean of TCG of the top 50% most similar pixels to the centroid of a 5 × 5 window.N/A
NS_TOP50_5_TCW_MEANThe mean of TCW of the top 50% most similar pixels to the centroid of a 5 × 5 window.N/A
NS_TOP50_5_TCB_STDDEVThe standard deviation of TCB of the top 50% most similar pixels to the centroid of a 5 × 5 window.N/A
NS_TOP50_5_TCG_STDDEVThe standard deviation of TCG of the top 50% most similar pixels to the centroid of a 5 × 5 window.N/A
NS_TOP50_5_TCW_STDDEVThe standard deviation of TCW of the top 50% most similar pixels to the centroid of a 5 × 5 window.N/A
NS_TOP50_7_TCB_MEANThe mean of TCB of the top 50% most similar pixels to the centroid of a 7 × 7 window.N/A
NS_TOP50_7_TCG_MEANThe mean of TCG of the top 50% most similar pixels to the centroid of a 7 × 7 window.N/A
NS_TOP50_7_TCW_MEANThe mean of TCW of the top 50% most similar pixels to the centroid of a 7 × 7 window.N/A
NS_TOP50_7_TCB_STDDEVThe standard deviation of TCB of the top 50% most similar pixels to the centroid of a 7 × 7 window.N/A
NS_TOP50_7_TCG_STDDEVThe standard deviation of TCG of the top 50% most similar pixels to the centroid of a 7 × 7 window.N/A
NS_TOP50_7_TCW_STDDEVThe standard deviation of TCW of the top 50% most similar pixels to the centroid of a 7 × 7 window.N/A
NS_TOP50_11_TCB_MEANThe mean of TCB of the top 50% most similar pixels to the centroid of an 11 × 11 window.N/A
NS_TOP50_11_TCG_MEANThe mean of TCG of the top 50% most similar pixels to the centroid of an 11 × 11 window.N/A
NS_TOP50_11_TCW_MEANThe mean of TCW of the top 50% most similar pixels to the centroid of an 11 × 11 window.N/A
NS_TOP50_11_TCB_STDDEVThe standard deviation of TCB of the top 50% most similar pixels to the centroid of an 11 × 11 window.N/A
NS_TOP50_11_TCG_STDDEVThe standard deviation of TCG of the top 50% most similar pixels to the centroid of an 11 × 11 window.N/A
NS_TOP50_11_TCW_STDDEVThe standard deviation of TCW of the top 50% most similar pixels to the centroid of an 11 × 11 window.N/A
NS_TOP75_5_TCB_MEANThe mean of TCB of the top 75% most similar pixels to the centroid of a 5 × 5 window.N/A
NS_TOP75_5_TCG_MEANThe mean of TCG of the top 75% most similar pixels to the centroid of a 5 × 5 window.N/A
NS_TOP75_5_TCW_MEANThe mean of TCW of the top 75% most similar pixels to the centroid of a 5 × 5 window.N/A
NS_TOP75_5_TCB_STDDEVThe standard deviation of TCB of the top 75% most similar pixels to the centroid of a 5 × 5 window.N/A
NS_TOP75_5_TCG_STDDEVThe standard deviation of TCG of the top 75% most similar pixels to the centroid of a 5 × 5 window.N/A
NS_TOP75_5_TCW_STDDEVThe standard deviation of TCW of the top 75% most similar pixels to the centroid of a 5 × 5 window.N/A
NS_TOP75_7_TCB_MEANThe mean of TCB of the top 75% most similar pixels to the centroid of a 7 × 7 window.N/A
NS_TOP75_7_TCG_MEANThe mean of TCG of the top 75% most similar pixels to the centroid of a 7 × 7 window.N/A
NS_TOP75_7_TCW_MEANThe mean of TCW of the top 75% most similar pixels to the centroid of a 7 × 7 window.N/A
NS_TOP75_7_TCB_STDDEVThe standard deviation of TCB of the top 75% most similar pixels to the centroid of a 7 × 7 window.N/A
NS_TOP75_7_TCG_STDDEVThe standard deviation of TCG of the top 75% most similar pixels to the centroid of a 7 × 7 window.N/A
NS_TOP75_7_TCW_STDDEVThe standard deviation of TCW of the top 75% most similar pixels to the centroid of a 7 × 7 window.N/A
NS_TOP75_11_TCB_MEANThe mean of TCB of the top 75% most similar pixels to the centroid of an 11 × 11 window.N/A
NS_TOP75_11_TCG_MEANThe mean of TCG of the top 75% most similar pixels to the centroid of an 11 × 11 window.N/A
NS_TOP75_11_TCW_MEANThe mean of TCW of the top 75% most similar pixels to the centroid of an 11 × 11 window.N/A
NS_TOP75_11_TCB_STDDEVThe standard deviation of TCB of the top 75% most similar pixels to the centroid of an 11 × 11 window.N/A
NS_TOP75_11_TCG_STDDEVThe standard deviation of TCG of the top 75% most similar pixels to the centroid of an 11 × 11 window.N/A
NS_TOP75_11_TCW_STDDEVThe standard deviation of TCW of the top 75% most similar pixels to the centroid of an 11 × 11 window.N/A
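A brute-force sketch of the neighborhood similarity statistics in Table A6 follows, assuming the three Tasseled Cap bands are stacked as (bands, rows, cols) and that similarity is Euclidean distance in spectral space to the window's center pixel; the study's exact similarity metric and implementation may differ.

```python
import numpy as np

def neighborhood_similarity(stack, size=5, keep_fraction=0.25):
    """Mean/std of the most spectrally similar neighbors to each center pixel."""
    b, rows, cols = stack.shape
    r = size // 2
    k = max(1, int(round(keep_fraction * size * size)))   # number of neighbors retained
    padded = np.pad(stack, ((0, 0), (r, r), (r, r)), mode="edge")
    mean_out = np.zeros_like(stack)
    std_out = np.zeros_like(stack)
    for i in range(rows):
        for j in range(cols):
            win = padded[:, i:i + size, j:j + size].reshape(b, -1)   # (bands, size * size)
            center = stack[:, i, j][:, None]
            dist = np.sqrt(((win - center) ** 2).sum(axis=0))        # spectral distance to center
            idx = np.argsort(dist)[:k]                               # most similar neighbors
            mean_out[:, i, j] = win[:, idx].mean(axis=1)
            std_out[:, i, j] = win[:, idx].std(axis=1)
    return mean_out, std_out

# Analogue of the NS_TOP25_5_* features for a synthetic TCB/TCG/TCW stack.
tc = np.random.rand(3, 60, 60).astype(np.float32)
ns_mean_25, ns_std_25 = neighborhood_similarity(tc, size=5, keep_fraction=0.25)
```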
Table A7. A description of the spatial vectorization features that were used in the numerical experiments.
Variable Name | Description | Source
SV_3_TCB_-1_-1An element produced by vectorizing a 3 × 3 TCB window.N/A
SV_3_TCB_-1_0An element produced by vectorizing a 3 × 3 TCB window.N/A
SV_3_TCB_-1_1An element produced by vectorizing a 3 × 3 TCB window.N/A
SV_3_TCB_0_-1An element produced by vectorizing a 3 × 3 TCB window.N/A
SV_3_TCB_0_1An element produced by vectorizing a 3 × 3 TCB window.N/A
SV_3_TCB_1_-1An element produced by vectorizing a 3 × 3 TCB window.N/A
SV_3_TCB_1_0An element produced by vectorizing a 3 × 3 TCB window.N/A
SV_3_TCB_1_1An element produced by vectorizing a 3 × 3 TCB window.N/A
SV_3_TCG_-1_-1An element produced by vectorizing a 3 × 3 TCG window.N/A
SV_3_TCG_-1_0An element produced by vectorizing a 3 × 3 TCG window.N/A
SV_3_TCG_-1_1An element produced by vectorizing a 3 × 3 TCG window.N/A
SV_3_TCG_0_-1An element produced by vectorizing a 3 × 3 TCG window.N/A
SV_3_TCG_0_1An element produced by vectorizing a 3 × 3 TCG window.N/A
SV_3_TCG_1_-1An element produced by vectorizing a 3 × 3 TCG window.N/A
SV_3_TCG_1_0An element produced by vectorizing a 3 × 3 TCG window.N/A
SV_3_TCG_1_1An element produced by vectorizing a 3 × 3 TCG window.N/A
SV_3_TCW_-1_-1An element produced by vectorizing a 3 × 3 TCW window.N/A
SV_3_TCW_-1_0An element produced by vectorizing a 3 × 3 TCW window.N/A
SV_3_TCW_-1_1An element produced by vectorizing a 3 × 3 TCW window.N/A
SV_3_TCW_0_-1An element produced by vectorizing a 3 × 3 TCW window.N/A
SV_3_TCW_0_1An element produced by vectorizing a 3 × 3 TCW window.N/A
SV_3_TCW_1_-1An element produced by vectorizing a 3 × 3 TCW window.N/A
SV_3_TCW_1_0An element produced by vectorizing a 3 × 3 TCW window.N/A
SV_3_TCW_1_1An element produced by vectorizing a 3 × 3 TCW window.N/A
SV_5_TCB_-2_-2An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_-2_-1An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_-2_0An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_-2_1An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_-2_2An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_-1_-2An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_-1_-1An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_-1_0An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_-1_1An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_-1_2An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_0_-2An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_0_-1An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_0_1An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_0_2An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_1_-2An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_1_-1An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_1_0An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_1_1An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_1_2An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_2_-2An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_2_-1An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_2_0An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_2_1An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCB_2_2An element produced by vectorizing a 5 × 5 TCB window.N/A
SV_5_TCG_-2_-2An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_-2_-1An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_-2_0An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_-2_1An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_-2_2An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_-1_-2An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_-1_-1An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_-1_0An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_-1_1An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_-1_2An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_0_-2An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_0_-1An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_0_1An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_0_2An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_1_-2An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_1_-1An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_1_0An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_1_1An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_1_2An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_2_-2An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_2_-1An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_2_0An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_2_1An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCG_2_2An element produced by vectorizing a 5 × 5 TCG window.N/A
SV_5_TCW_-2_-2An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_-2_-1An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_-2_0An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_-2_1An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_-2_2An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_-1_-2An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_-1_-1An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_-1_0An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_-1_1An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_-1_2An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_0_-2An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_0_-1An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_0_1An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_0_2An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_1_-2An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_1_-1An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_1_0An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_1_1An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_1_2An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_2_-2An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_2_-1An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_2_0An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_2_1An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_5_TCW_2_2An element produced by vectorizing a 5 × 5 TCW window.N/A
SV_7_TCB_-3_-3An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-3_-2An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-3_-1An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-3_0An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-3_1An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-3_2An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-3_3An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-2_-3An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-2_-2An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-2_-1An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-2_0An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-2_1An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-2_2An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-2_3An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-1_-3An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-1_-2An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-1_-1An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-1_0An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-1_1An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-1_2An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_-1_3An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_0_-3An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_0_-2An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_0_-1An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_0_1An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_0_2An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_0_3An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_1_-3An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_1_-2An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_1_-1An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_1_0An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_1_1An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_1_2An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_1_3An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_2_-3An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_2_-2An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_2_-1An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_2_0An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_2_1An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_2_2An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_2_3An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_3_-3An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_3_-2An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_3_-1An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_3_0An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_3_1An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_3_2An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCB_3_3An element produced by vectorizing a 7 × 7 TCB window.N/A
SV_7_TCG_-3_-3An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-3_-2An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-3_-1An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-3_0An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-3_1An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-3_2An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-3_3An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-2_-3An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-2_-2An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-2_-1An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-2_0An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-2_1An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-2_2An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-2_3An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-1_-3An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-1_-2An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-1_-1An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-1_0An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-1_1An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-1_2An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_-1_3An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_0_-3An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_0_-2An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_0_-1An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_0_1An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_0_2An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_0_3An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_1_-3An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_1_-2An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_1_-1An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_1_0An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_1_1An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_1_2An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_1_3An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_2_-3An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_2_-2An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_2_-1An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_2_0An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_2_1An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_2_2An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_2_3An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_3_-3An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_3_-2An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_3_-1An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_3_0An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_3_1An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_3_2An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCG_3_3An element produced by vectorizing a 7 × 7 TCG window.N/A
SV_7_TCW_-3_-3An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-3_-2An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-3_-1An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-3_0An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-3_1An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-3_2An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-3_3An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-2_-3An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-2_-2An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-2_-1An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-2_0An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-2_1An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-2_2An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-2_3An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-1_-3An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-1_-2An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-1_-1An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-1_0An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-1_1An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-1_2An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_-1_3An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_0_-3An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_0_-2An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_0_-1An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_0_1An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_0_2An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_0_3An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_1_-3An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_1_-2An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_1_-1An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_1_0An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_1_1An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_1_2An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_1_3An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_2_-3An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_2_-2An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_2_-1An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_2_0An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_2_1An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_2_2An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_2_3An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_3_-3An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_3_-2An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_3_-1An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_3_0An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_3_1An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_3_2An element produced by vectorizing a 7 × 7 TCW window.N/A
SV_7_TCW_3_3An element produced by vectorizing a 7 × 7 TCW window.N/A
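The spatial vectorization features in Table A7 simply expose each neighbor's value at a given (row, column) offset as its own predictor. The sketch below performs that unrolling for one band; the (0, 0) offset is skipped, matching the table, because it would duplicate the focal pixel.

```python
import numpy as np

def vectorize_neighborhood(band, size=3):
    """Return one layer per (row_offset, col_offset), mirroring the SV_* naming in Table A7."""
    r = size // 2
    rows, cols = band.shape
    padded = np.pad(band, r, mode="edge")
    layers = {}
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue   # the center offset duplicates the pixel itself
            layers[(dy, dx)] = padded[r + dy:r + dy + rows, r + dx:r + dx + cols]
    return layers

tcb = np.random.rand(50, 50)
sv3 = vectorize_neighborhood(tcb, size=3)   # 8 layers: analogues of SV_3_TCB_-1_-1 ... SV_3_TCB_1_1
```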
Table A8. A description of the Continuous Change Detection and Classification (CCDC) statistical features that were used in the numerical experiments. The all-dates CCDC model, denoted with the suffix “ALL”, was fit using all available imagery in a given year. The no-winter CCDC model, denoted with the suffix “SUMMER”, excluded dates that typically possessed snow cover. Snow cover was found to negatively impact the CCDC fitting in some areas of the study region.
Variable Name | Description | Source
CCDC_B2_SIN_ALLThe sine coefficient of the B2 all-dates CCDC model[77]
CCDC_B3_SIN_ALLThe sine coefficient of the B3 all-dates CCDC model[77]
CCDC_B4_SIN_ALLThe sine coefficient of the B4 all-dates CCDC model[77]
CCDC_B5_SIN_ALLThe sine coefficient of the B5 all-dates CCDC model[77]
CCDC_B7_SIN_ALLThe sine coefficient of the B7 all-dates CCDC model[77]
CCDC_B2_SIN_SUMMERThe sine coefficient of the B2 no-winter CCDC model[77]
CCDC_B3_SIN_SUMMERThe sine coefficient of the B3 no-winter CCDC model[77]
CCDC_B4_SIN_SUMMERThe sine coefficient of the B4 no-winter CCDC model[77]
CCDC_B5_SIN_SUMMERThe sine coefficient of the B5 no-winter CCDC model[77]
CCDC_B7_SIN_SUMMERThe sine coefficient of the B7 no-winter CCDC model[77]
CCDC_B2_COS_ALLThe cosine coefficient of the B2 all-dates CCDC model[77]
CCDC_B3_COS_ALLThe cosine coefficient of the B3 all-dates CCDC model[77]
CCDC_B4_COS_ALLThe cosine coefficient of the B4 all-dates CCDC model[77]
CCDC_B5_COS_ALLThe cosine coefficient of the B5 all-dates CCDC model[77]
CCDC_B7_COS_ALLThe cosine coefficient of the B7 all-dates CCDC model[77]
CCDC_B2_COS_SUMMERThe cosine coefficient of the B2 no-winter CCDC model[77]
CCDC_B3_COS_SUMMERThe cosine coefficient of the B3 no-winter CCDC model[77]
CCDC_B4_COS_SUMMERThe cosine coefficient of the B4 no-winter CCDC model[77]
CCDC_B5_COS_SUMMERThe cosine coefficient of the B5 no-winter CCDC model[77]
CCDC_B7_COS_SUMMERThe cosine coefficient of the B7 no-winter CCDC model[77]
CCDC_B2_SIN2_ALLThe second sine coefficient of the B2 all-dates CCDC model[77]
CCDC_B3_SIN2_ALLThe second sine coefficient of the B3 all-dates CCDC model[77]
CCDC_B4_SIN2_ALLThe second sine coefficient of the B4 all-dates CCDC model[77]
CCDC_B5_SIN2_ALLThe second sine coefficient of the B5 all-dates CCDC model[77]
CCDC_B7_SIN2_ALLThe second sine coefficient of the B7 all-dates CCDC model[77]
CCDC_B2_SIN2_SUMMERThe second sine coefficient of the B2 no-winter CCDC model[77]
CCDC_B3_SIN2_SUMMERThe second sine coefficient of the B3 no-winter CCDC model[77]
CCDC_B4_SIN2_SUMMERThe second sine coefficient of the B4 no-winter CCDC model[77]
CCDC_B5_SIN2_SUMMERThe second sine coefficient of the B5 no-winter CCDC model[77]
CCDC_B7_SIN2_SUMMERThe second sine coefficient of the B7 no-winter CCDC model[77]
CCDC_B2_COS2_ALLThe second cosine coefficient of the B2 all-dates CCDC model[77]
CCDC_B3_COS2_ALLThe second cosine coefficient of the B3 all-dates CCDC model[77]
CCDC_B4_COS2_ALLThe second cosine coefficient of the B4 all-dates CCDC model[77]
CCDC_B5_COS2_ALLThe second cosine coefficient of the B5 all-dates CCDC model[77]
CCDC_B7_COS2_ALLThe second cosine coefficient of the B7 all-dates CCDC model[77]
CCDC_B2_COS2_SUMMERThe second cosine coefficient of the B2 no-winter CCDC model[77]
CCDC_B3_COS2_SUMMERThe second cosine coefficient of the B3 no-winter CCDC model[77]
CCDC_B4_COS2_SUMMERThe second cosine coefficient of the B4 no-winter CCDC model[77]
CCDC_B5_COS2_SUMMERThe second cosine coefficient of the B5 no-winter CCDC model[77]
CCDC_B7_COS2_SUMMERThe second cosine coefficient of the B7 no-winter CCDC model[77]
CCDC_B2_SIN3_ALLThe third sine coefficient of the B2 all-dates CCDC model[77]
CCDC_B3_SIN3_ALLThe third sine coefficient of the B3 all-dates CCDC model[77]
CCDC_B4_SIN3_ALLThe third sine coefficient of the B4 all-dates CCDC model[77]
CCDC_B5_SIN3_ALLThe third sine coefficient of the B5 all-dates CCDC model[77]
CCDC_B7_SIN3_ALLThe third sine coefficient of the B7 all-dates CCDC model[77]
CCDC_B2_SIN3_SUMMERThe third sine coefficient of the B2 no-winter CCDC model[77]
CCDC_B3_SIN3_SUMMERThe third sine coefficient of the B3 no-winter CCDC model[77]
CCDC_B4_SIN3_SUMMERThe third sine coefficient of the B4 no-winter CCDC model[77]
CCDC_B5_SIN3_SUMMERThe third sine coefficient of the B5 no-winter CCDC model[77]
CCDC_B7_SIN3_SUMMERThe third sine coefficient of the B7 no-winter CCDC model[77]
CCDC_B2_COS3_ALLThe third cosine coefficient of the B2 all-dates CCDC model[77]
CCDC_B3_COS3_ALLThe third cosine coefficient of the B3 all-dates CCDC model[77]
CCDC_B4_COS3_ALLThe third cosine coefficient of the B4 all-dates CCDC model[77]
CCDC_B5_COS3_ALLThe third cosine coefficient of the B5 all-dates CCDC model[77]
CCDC_B7_COS3_ALLThe third cosine coefficient of the B7 all-dates CCDC model[77]
CCDC_B2_COS3_SUMMERThe third cosine coefficient of the B2 no-winter CCDC model[77]
CCDC_B3_COS3_SUMMERThe third cosine coefficient of the B3 no-winter CCDC model[77]
CCDC_B4_COS3_SUMMERThe third cosine coefficient of the B4 no-winter CCDC model[77]
CCDC_B5_COS3_SUMMERThe third cosine coefficient of the B5 no-winter CCDC model[77]
CCDC_B7_COS3_SUMMERThe third cosine coefficient of the B7 no-winter CCDC model[77]
CCDC_B2_INTP_ALLThe intercept coefficient of the B2 all-dates CCDC model[77]
CCDC_B3_INTP_ALLThe intercept coefficient of the B3 all-dates CCDC model[77]
CCDC_B4_INTP_ALLThe intercept coefficient of the B4 all-dates CCDC model[77]
CCDC_B5_INTP_ALLThe intercept coefficient of the B5 all-dates CCDC model[77]
CCDC_B7_INTP_ALLThe intercept coefficient of the B7 all-dates CCDC model[77]
CCDC_B2_INTP_SUMMERThe intercept coefficient of the B2 no-winter CCDC model[77]
CCDC_B3_INTP_SUMMERThe intercept coefficient of the B3 no-winter CCDC model[77]
CCDC_B4_INTP_SUMMERThe intercept coefficient of the B4 no-winter CCDC model[77]
CCDC_B5_INTP_SUMMERThe intercept coefficient of the B5 no-winter CCDC model[77]
CCDC_B7_INTP_SUMMERThe intercept coefficient of the B7 no-winter CCDC model[77]
CCDC_B2_SLP_ALLThe slope coefficient of the B2 all-dates CCDC model[77]
CCDC_B3_SLP_ALLThe slope coefficient of the B3 all-dates CCDC model[77]
CCDC_B4_SLP_ALLThe slope coefficient of the B4 all-dates CCDC model[77]
CCDC_B5_SLP_ALLThe slope coefficient of the B5 all-dates CCDC model[77]
CCDC_B7_SLP_ALLThe slope coefficient of the B7 all-dates CCDC model[77]
CCDC_B2_SLP_SUMMERThe slope coefficient of the B2 no-winter CCDC model[77]
CCDC_B3_SLP_SUMMERThe slope coefficient of the B3 no-winter CCDC model[77]
CCDC_B4_SLP_SUMMERThe slope coefficient of the B4 no-winter CCDC model[77]
CCDC_B5_SLP_SUMMERThe slope coefficient of the B5 no-winter CCDC model[77]
CCDC_B7_SLP_SUMMERThe slope coefficient of the B7 no-winter CCDC model[77]
CCDC_B2_RMSE_ALLThe RMSE of the B2 all-dates CCDC model[77]
CCDC_B3_RMSE_ALLThe RMSE of the B3 all-dates CCDC model[77]
CCDC_B4_RMSE_ALLThe RMSE of the B4 all-dates CCDC model[77]
CCDC_B5_RMSE_ALLThe RMSE of the B5 all-dates CCDC model[77]
CCDC_B7_RMSE_ALLThe RMSE of the B7 all-dates CCDC model[77]
CCDC_B2_RMSE_SUMMERThe RMSE of the B2 no-winter CCDC model[77]
CCDC_B3_RMSE_SUMMERThe RMSE of the B3 no-winter CCDC model[77]
CCDC_B4_RMSE_SUMMERThe RMSE of the B4 no-winter CCDC model[77]
CCDC_B5_RMSE_SUMMERThe RMSE of the B5 no-winter CCDC model[77]
CCDC_B7_RMSE_SUMMERThe RMSE of the B7 no-winter CCDC model[77]
CCDC_B2_MAG_ALLThe magnitude of the GMD for B2 from the all-dates CCDC model[77]
CCDC_B3_MAG_ALLThe magnitude of the GMD for B3 from the all-dates CCDC model[77]
CCDC_B4_MAG_ALLThe magnitude of the GMD for B4 from the all-dates CCDC model[77]
CCDC_B5_MAG_ALLThe magnitude of the GMD for B5 from the all-dates CCDC model[77]
CCDC_B7_MAG_ALLThe magnitude of the GMD for B7 from the all-dates CCDC model[77]
CCDC_B2_MAG_SUMMERThe magnitude of the GMD for B2 from the no-winter CCDC model[77]
CCDC_B3_MAG_SUMMERThe magnitude of the GMD for B3 from the no-winter CCDC model[77]
CCDC_B4_MAG_SUMMERThe magnitude of the GMD for B4 from the no-winter CCDC model[77]
CCDC_B5_MAG_SUMMERThe magnitude of the GMD for B5 from the no-winter CCDC model[77]
CCDC_B7_MAG_SUMMERThe magnitude of the GMD for B7 from the no-winter CCDC model[77]
CCDC_NUMTBREAK_ALLThe number of breakpoints in the all-dates CCDC model[77]
CCDC_NUMTBREAK_SUMMERThe number of breakpoints in the no-winter CCDC model[77]
CCDC_TBREAK_ALLThe number of years since the GMD in the all-dates CCDC model[77]
CCDC_TBREAK_SUMMERThe number of years since the GMD in the no-winter CCDC model[77]
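The coefficients in Table A8 come from CCDC's harmonic regression, which fits an intercept, a linear trend, and three sine/cosine pairs per band [77]. The least-squares sketch below reproduces those coefficient definitions (and the per-band RMSE) for a single pixel; it omits CCDC's change detection and segmentation, and the synthetic time series is purely illustrative.

```python
import numpy as np

def ccdc_harmonic_fit(t_days, y, n_harmonics=3):
    """Fit intercept, slope, and sine/cosine coefficients to one band's time series."""
    t = np.asarray(t_days, dtype=float)
    y = np.asarray(y, dtype=float)
    cols = [np.ones_like(t), t]                      # INTP, SLP columns
    for k in range(1, n_harmonics + 1):
        omega = 2.0 * np.pi * k * t / 365.25         # annual frequency and its harmonics
        cols += [np.sin(omega), np.cos(omega)]
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rmse = float(np.sqrt(np.mean((X @ coef - y) ** 2)))
    names = ["INTP", "SLP"] + [f"{p}{k if k > 1 else ''}"
                               for k in range(1, n_harmonics + 1)
                               for p in ("SIN", "COS")]
    return dict(zip(names, coef)), rmse

# Synthetic example: three years of 16-day observations for one band.
t = np.arange(0, 3 * 365, 16)
y = 0.3 + 1e-5 * t + 0.05 * np.sin(2 * np.pi * t / 365.25) + 0.01 * np.random.randn(t.size)
coefs, rmse = ccdc_harmonic_fit(t, y)
```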
Table A9. A description of the LandTrendr-derived time series features that were used in the numerical experiments.
Variable Name | Description | Source
LandTendr_TCW_1_gmd_magThe TCW magnitude of the GMD from the more conservative parameterization[18]
LandTendr_TCW_2_gmd_magThe TCW magnitude of the GMD from the more liberal parameterization[18]
LandTendr_TCA_1_gmd_magThe TCA magnitude of the GMD from the more conservative parameterization[18]
LandTendr_TCA_2_gmd_magThe TCA magnitude of the GMD from the more liberal parameterization[18]
LandTendr_NBR_1_gmd_magThe NBR magnitude of the GMD from the more conservative parameterization[18]
LandTendr_NBR_2_gmd_magThe NBR magnitude of the GMD from the more liberal parameterization[18]
LandTendr_TCW_1_gmd_durThe TCW duration of the GMD from the more conservative parameterization[18]
LandTendr_TCW_2_gmd_durThe TCW duration of the GMD from the more liberal parameterization[18]
LandTendr_TCA_1_gmd_durThe TCA duration of the GMD from the more conservative parameterization[18]
LandTendr_TCA_2_gmd_durThe TCA duration of the GMD from the more liberal parameterization[18]
LandTendr_NBR_1_gmd_durThe NBR duration of the GMD from the more conservative parameterization[18]
LandTendr_NBR_2_gmd_durThe NBR duration of the GMD from the more liberal parameterization[18]
LandTendr_TCW_1_gmd_prevalThe TCW pre-disturbance value of the GMD from the more conservative parameterization[18]
LandTendr_TCW_2_gmd_prevalThe TCW pre-disturbance value of the GMD from the more liberal parameterization[18]
LandTendr_TCA_1_gmd_prevalThe TCA pre-disturbance value of the GMD from the more conservative parameterization[18]
LandTendr_TCA_2_gmd_prevalThe TCA pre-disturbance value of the GMD from the more liberal parameterization[18]
LandTendr_NBR_1_gmd_prevalThe NBR pre-disturbance value of the GMD from the more conservative parameterization[18]
LandTendr_NBR_2_gmd_prevalThe NBR pre-disturbance value of the GMD from the more liberal parameterization[18]
LandTendr_TCW_1_gmd_dsnrThe TCW DSNR value of the GMD from the more conservative parameterization[35]
LandTendr_TCW_2_gmd_dsnrThe TCW DSNR value of the GMD from the more liberal parameterization[35]
LandTendr_TCA_1_gmd_dsnrThe TCA DSNR value of the GMD from the more conservative parameterization[35]
LandTendr_TCA_2_gmd_dsnrThe TCA DSNR value of the GMD from the more liberal parameterization[35]
LandTendr_NBR_1_gmd_dsnrThe NBR DSNR value of the GMD from the more conservative parameterization[35]
LandTendr_NBR_2_gmd_dsnrThe NBR DSNR value of the GMD from the more liberal parameterization[35]
LandTendr_TCW_1_gain_magThe TCW magnitude of the greatest gain/recovery segment from the more conservative parameterization[18]
LandTendr_TCW_2_gain_magThe TCW magnitude of the greatest gain/recovery segment from the more liberal parameterization[18]
LandTendr_TCA_1_gain_magThe TCA magnitude of the greatest gain/recovery segment from the more conservative parameterization[18]
LandTendr_TCA_2_gain_magThe TCA magnitude of the greatest gain/recovery segment from the more liberal parameterization[18]
LandTendr_NBR_1_gain_magThe NBR magnitude of the greatest gain/recovery segment from the more conservative parameterization[18]
LandTendr_NBR_2_gain_magThe NBR magnitude of the greatest gain/recovery segment from the more liberal parameterization[18]
LandTendr_TCW_1_gain_durThe TCW duration of the greatest gain/recovery segment from the more conservative parameterization[18]
LandTendr_TCW_2_gain_durThe TCW duration of the greatest gain/recovery segment from the more liberal parameterization[18]
LandTendr_TCA_1_gain_durThe TCA duration of the greatest gain/recovery segment from the more conservative parameterization[18]
LandTendr_TCA_2_gain_durThe TCA duration of the greatest gain/recovery segment from the more liberal parameterization[18]
LandTendr_NBR_1_gain_durThe NBR duration of the greatest gain/recovery segment from the more conservative parameterization[18]
LandTendr_NBR_2_gain_durThe NBR duration of the greatest gain/recovery segment from the more liberal parameterization[18]
LandTendr_TCW_1_gain_prevalThe TCW pre-disturbance value of the greatest gain/recovery segment from the more conservative parameterization[18]
LandTendr_TCW_2_gain_prevalThe TCW pre-disturbance value of the greatest gain/recovery segment from the more liberal parameterization[18]
LandTendr_TCA_1_gain_prevalThe TCA pre-disturbance value of the greatest gain/recovery segment from the more conservative parameterization[18]
LandTendr_TCA_2_gain_prevalThe TCA pre-disturbance value of the greatest gain/recovery segment from the more liberal parameterization[18]
LandTendr_NBR_1_gain_prevalThe NBR pre-disturbance value of the greatest gain/recovery segment from the more conservative parameterization[18]
LandTendr_NBR_2_gain_prevalThe NBR pre-disturbance value of the greatest gain/recovery segment from the more liberal parameterization[18]
LandTendr_TCW_1_gain_dsnrThe TCW DSNR value of the greatest gain/recovery segment from the more conservative parameterization[35]
LandTendr_TCW_2_gain_dsnrThe TCW DSNR value of the greatest gain/recovery segment from the more liberal parameterization[35]
LandTendr_TCA_1_gain_dsnrThe TCA DSNR value of the greatest gain/recovery segment from the more conservative parameterization[35]
LandTendr_TCA_2_gain_dsnrThe TCA DSNR value of the greatest gain/recovery segment from the more liberal parameterization[35]
LandTendr_NBR_1_gain_dsnrThe NBR DSNR value of the greatest gain/recovery segment from the more conservative parameterization[35]
LandTendr_NBR_2_gain_dsnrThe NBR DNSR value of the greatest gain/recovery segment from the more liberal parameterization[35]
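
For readers who wish to reproduce features of this kind, the sketch below illustrates how segment-level metrics such as the magnitude, duration, pre-disturbance value, and DSNR of the greatest disturbance (GMD) and greatest gain/recovery segments could be derived from the fitted vertices that LandTrendr produces for a single pixel. It is a minimal example, not the pipeline used in this study; the function name, the input arrays, and the DSNR formulation (absolute segment magnitude scaled by the temporal fit RMSE, in the spirit of [35]) are assumptions.

```python
# Minimal sketch (not the authors' code) of segment-level LandTrendr features for one
# pixel and one spectral index (e.g., NBR or TCW). `vertex_years` and `vertex_values`
# describe the fitted trajectory; `fit_rmse` is the RMSE of the temporal fit.
import numpy as np

def segment_features(vertex_years, vertex_values, fit_rmse):
    years = np.asarray(vertex_years, dtype=float)
    values = np.asarray(vertex_values, dtype=float)

    # Per-segment change statistics between consecutive vertices.
    magnitudes = np.diff(values)   # negative change = loss for indices such as NBR/TCW
    durations = np.diff(years)
    pre_values = values[:-1]       # value at the start (pre-change) of each segment

    # Greatest disturbance (most negative change) and greatest gain (most positive change).
    gmd_idx = int(np.argmin(magnitudes))
    gain_idx = int(np.argmax(magnitudes))

    def dsnr(mag):
        # Assumed DSNR form: absolute segment magnitude scaled by the fit RMSE.
        return abs(mag) / fit_rmse if fit_rmse > 0 else 0.0

    return {
        "gmd_mag": magnitudes[gmd_idx],
        "gmd_dur": durations[gmd_idx],
        "gmd_preval": pre_values[gmd_idx],
        "gmd_dsnr": dsnr(magnitudes[gmd_idx]),
        "gain_mag": magnitudes[gain_idx],
        "gain_dur": durations[gain_idx],
        "gain_preval": pre_values[gain_idx],
        "gain_dsnr": dsnr(magnitudes[gain_idx]),
    }
```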

References

  1. Pan, Y.; Birdsey, R.A.; Fang, J.; Houghton, R.; Kauppi, P.E.; Kurz, W.A.; Phillips, O.L.; Shvidenko, A.; Lewis, S.L.; Canadell, J.G.; et al. A large and persistent carbon sink in the world’s forests. Science 2011, 333, 988–993. [Google Scholar] [CrossRef] [PubMed]
  2. Heinrich, V.H.; Vancutsem, C.; Dalagnol, R.; Rosan, T.M.; Fawcett, D.; Silva-Junior, C.H.; Cassol, H.L.; Achard, F.; Jucker, T.; Silva, C.A.; et al. The carbon sink of secondary and degraded humid tropical forests. Nature 2023, 615, 436–442. [Google Scholar] [CrossRef] [PubMed]
  3. Bar-On, Y.M.; Phillips, R.; Milo, R. The biomass distribution on Earth. Proc. Natl. Acad. Sci. USA 2018, 115, 6506–6511. [Google Scholar] [CrossRef] [PubMed]
  4. Erb, K.H.; Kastner, T.; Plutzar, C.; Bais, A.L.S.; Carvalhais, N.; Fetzel, T.; Gingrich, S.; Haberl, H.; Lauk, C.; Niedertscheider, M.; et al. Unexpectedly large impact of forest management and grazing on global vegetation biomass. Nature 2018, 553, 73–76. [Google Scholar] [CrossRef] [PubMed]
  5. Matasci, G.; Hermosilla, T.; Wulder, M.A.; White, J.C.; Coops, N.C.; Hobart, G.W.; Zald, H.S. Large-area mapping of Canadian boreal forest cover, height, biomass and other structural attributes using Landsat composites and lidar plots. Remote Sens. Environ. 2018, 209, 90–106. [Google Scholar] [CrossRef]
  6. Hudak, A.T.; Fekety, P.A.; Kane, V.R.; Kennedy, R.E.; Filippelli, S.K.; Falkowski, M.J.; Tinkham, W.T.; Smith, A.M.; Crookston, N.L.; Domke, G.M.; et al. A carbon monitoring system for mapping regional, annual aboveground biomass across the northwestern USA. Environ. Res. Lett. 2020, 15, 095003. [Google Scholar] [CrossRef]
  7. Arévalo, P.; Baccini, A.; Woodcock, C.E.; Olofsson, P.; Walker, W.S. Continuous mapping of aboveground biomass using Landsat time series. Remote Sens. Environ. 2023, 288, 113483. [Google Scholar] [CrossRef]
  8. Lefsky, M.A.; Cohen, W.B.; Parker, G.G.; Harding, D.J. Lidar Remote Sensing for Ecosystem Studies: Lidar, an emerging remote sensing technology that directly measures the three-dimensional distribution of plant canopies, can accurately estimate vegetation structural attributes and should be of particular interest to forest, landscape, and global ecologists. Bioscience 2002, 52, 19–30. [Google Scholar] [CrossRef]
  9. Zolkos, S.G.; Goetz, S.J.; Dubayah, R. A meta-analysis of terrestrial aboveground biomass estimation using lidar remote sensing. Remote Sens. Environ. 2013, 128, 289–298. [Google Scholar] [CrossRef]
  10. Coops, N.C.; Tompalski, P.; Goodbody, T.R.; Queinnec, M.; Luther, J.E.; Bolton, D.K.; White, J.C.; Wulder, M.A.; van Lier, O.R.; Hermosilla, T. Modelling lidar-derived estimates of forest attributes over space and time: A review of approaches and future trends. Remote Sens. Environ. 2021, 260, 112477. [Google Scholar] [CrossRef]
  11. Hansen, M.C.; Krylov, A.; Tyukavina, A.; Potapov, P.V.; Turubanova, S.; Zutta, B.; Ifo, S.; Margono, B.; Stolle, F.; Moore, R. Humid tropical forest disturbance alerts using Landsat data. Environ. Res. Lett. 2016, 11, 034008. [Google Scholar] [CrossRef]
  12. Hansen, M.C.; Potapov, P.V.; Moore, R.; Hancher, M.; Turubanova, S.A.; Tyukavina, A.; Thau, D.; Stehman, S.; Goetz, S.J.; Loveland, T.R.; et al. High-resolution global maps of 21st-century forest cover change. Science 2013, 342, 850–853. [Google Scholar] [CrossRef] [PubMed]
  13. Goetz, S.; Dubayah, R. Advances in remote sensing technology and implications for measuring and monitoring forest carbon stocks and change. Carbon Manag. 2011, 2, 231–244. [Google Scholar] [CrossRef]
  14. Kennedy, R.E.; Andréfouët, S.; Cohen, W.B.; Gómez, C.; Griffiths, P.; Hais, M.; Healey, S.P.; Helmer, E.H.; Hostert, P.; Lyons, M.B.; et al. Bringing an ecological view of change to Landsat-based remote sensing. Front. Ecol. Environ. 2014, 12, 339–346. [Google Scholar] [CrossRef]
  15. Lu, D.; Chen, Q.; Wang, G.; Moran, E.; Batistella, M.; Zhang, M.; Laurin, G.V.; Saah, D. Aboveground forest biomass estimation with Landsat and LiDAR data and uncertainty analysis of the estimates. Int. J. For. Res. 2012, 2012, 436537. [Google Scholar] [CrossRef]
  16. Dube, T.; Mutanga, O. Evaluating the utility of the medium-spatial resolution Landsat 8 multispectral sensor in quantifying aboveground biomass in uMgeni catchment, South Africa. ISPRS J. Photogramm. Remote Sens. 2015, 101, 36–46. [Google Scholar] [CrossRef]
  17. Kennedy, R.E.; Ohmann, J.; Gregory, M.; Roberts, H.; Yang, Z.; Bell, D.M.; Kane, V.; Hughes, M.J.; Cohen, W.B.; Powell, S.; et al. An empirical, integrated forest biomass monitoring system. Environ. Res. Lett. 2018, 13, 025004. [Google Scholar] [CrossRef]
  18. Pflugmacher, D.; Cohen, W.B.; Kennedy, R.E. Using Landsat-derived disturbance history (1972–2010) to predict current forest structure. Remote Sens. Environ. 2012, 122, 146–165. [Google Scholar] [CrossRef]
  19. Pflugmacher, D.; Cohen, W.B.; Kennedy, R.E.; Yang, Z. Using Landsat-derived disturbance and recovery history and lidar to map forest biomass dynamics. Remote Sens. Environ. 2014, 151, 124–137. [Google Scholar] [CrossRef]
  20. Lu, D.; Batistella, M. Exploring TM image texture and its relationships with biomass estimation in Rondonia, Brazilian Amazon. Acta Amaz. 2005, 35, 249–257. [Google Scholar] [CrossRef]
  21. Lu, D. Aboveground biomass estimation using Landsat TM data in the Brazilian Amazon. Int. J. Remote Sens. 2005, 26, 2509–2525. [Google Scholar] [CrossRef]
  22. Zhao, P.; Lu, D.; Wang, G.; Wu, C.; Huang, Y.; Yu, S. Examining spectral reflectance saturation in Landsat imagery and corresponding solutions to improve forest aboveground biomass estimation. Remote Sens. 2016, 8, 469. [Google Scholar] [CrossRef]
  23. Haralick, R.M.; Shanmugam, K.; Dinstein, I.H. Textural features for image classification. IEEE Trans. Syst. Man Cybern. 1973, 3, 610–621. [Google Scholar] [CrossRef]
  24. Kelsey, K.C.; Neff, J.C. Estimates of aboveground biomass from texture analysis of Landsat imagery. Remote Sens. 2014, 6, 6407–6422. [Google Scholar] [CrossRef]
  25. Karlson, M.; Ostwald, M.; Reese, H.; Sanou, J.; Tankoano, B.; Mattsson, E. Mapping tree canopy cover and aboveground biomass in Sudano-Sahelian woodlands using Landsat 8 and random forest. Remote Sens. 2015, 7, 10017–10041. [Google Scholar] [CrossRef]
  26. Dube, T.; Mutanga, O. Investigating the robustness of the new Landsat-8 Operational Land Imager derived texture metrics in estimating plantation forest aboveground biomass in resource constrained areas. ISPRS J. Photogramm. Remote Sens. 2015, 108, 12–32. [Google Scholar] [CrossRef]
  27. Sanchez-Ruiz, S.; Moreno-Martinez, A.; Izquierdo-Verdiguier, E.; Chiesi, M.; Maselli, F.; Gilabert, M.A. Growing stock volume from multi-temporal landsat imagery through google earth engine. Int. J. Appl. Earth Obs. Geoinf. 2019, 83, 101913. [Google Scholar] [CrossRef]
  28. Frazier, R.J.; Coops, N.C.; Wulder, M.A.; Kennedy, R. Characterization of aboveground biomass in an unmanaged boreal forest using Landsat temporal segmentation metrics. ISPRS J. Photogramm. Remote Sens. 2014, 92, 137–146. [Google Scholar] [CrossRef]
  29. Kennedy, R.E.; Yang, Z.; Cohen, W.B. Detecting trends in forest disturbance and recovery using yearly Landsat time series: 1. LandTrendr — Temporal segmentation algorithms. Remote Sens. Environ. 2010, 114, 2897–2910. [Google Scholar] [CrossRef]
  30. Kennedy, R.E.; Yang, Z.; Gorelick, N.; Braaten, J.; Cavalcante, L.; Cohen, W.B.; Healey, S. Implementation of the LandTrendr algorithm on google earth engine. Remote Sens. 2018, 10, 691. [Google Scholar] [CrossRef]
  31. Zhu, Z.; Woodcock, C.E. Continuous change detection and classification of land cover using all available Landsat data. Remote Sens. Environ. 2014, 144, 152–171. [Google Scholar] [CrossRef]
  32. Pasquarella, V.J.; Arévalo, P.; Bratley, K.H.; Bullock, E.L.; Gorelick, N.; Yang, Z.; Kennedy, R.E. Demystifying LandTrendr and CCDC temporal segmentation. Int. J. Appl. Earth Obs. Geoinf. 2022, 110, 102806. [Google Scholar] [CrossRef]
  33. Myroniuk, V.; Bell, D.M.; Gregory, M.J.; Vasylyshyn, R.; Bilous, A. Uncovering forest dynamics using historical forest inventory data and Landsat time series. For. Ecol. Manag. 2022, 513, 120184. [Google Scholar] [CrossRef]
  34. Healey, S.P.; Cohen, W.B.; Yang, Z.; Brewer, C.K.; Brooks, E.B.; Gorelick, N.; Hernandez, A.J.; Huang, C.; Hughes, M.J.; Kennedy, R.E. Mapping forest change using stacked generalization: An ensemble approach. Remote Sens. Environ. 2018, 204, 717–728. [Google Scholar] [CrossRef]
  35. Cohen, W.B.; Yang, Z.; Healey, S.P.; Kennedy, R.E.; Gorelick, N. A LandTrendr multispectral ensemble for forest disturbance detection. Remote Sens. Environ. 2018, 205, 131–140. [Google Scholar] [CrossRef]
  36. Belgiu, M.; Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31. [Google Scholar] [CrossRef]
  37. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  38. Powell, S.L.; Cohen, W.B.; Healey, S.P.; Kennedy, R.E.; Moisen, G.G.; Pierce, K.B.; Ohmann, J.L. Quantification of live aboveground forest biomass dynamics with Landsat time-series and field inventory data: A comparison of empirical modeling approaches. Remote Sens. Environ. 2010, 114, 1053–1068. [Google Scholar] [CrossRef]
  39. Powell, S.L.; Cohen, W.B.; Kennedy, R.E.; Healey, S.P.; Huang, C. Observation of trends in biomass loss as a result of disturbance in the conterminous US: 1986–2004. Ecosystems 2014, 17, 142–157. [Google Scholar] [CrossRef]
  40. Hermosilla, T.; Wulder, M.A.; White, J.C.; Coops, N.C.; Hobart, G.W. Regional detection, characterization, and attribution of annual forest change from 1984 to 2012 using Landsat-derived time-series metrics. Remote Sens. Environ. 2015, 170, 121–132. [Google Scholar] [CrossRef]
  41. Fekety, P.; Hudak, A. LiDAR-Derived Forest Aboveground Biomass Maps, Northwestern USA, 2002–2016; ORNL DAAC: Oak Ridge, TN, USA, 2020. [Google Scholar] [CrossRef]
  42. Long, C.J.; Whitlock, C. Fire and vegetation history from the coastal rain forest of the western Oregon Coast Range. Quat. Res. 2002, 58, 215–225. [Google Scholar] [CrossRef]
  43. Spies, T.A.; Franklin, J.F. The structure of natural young, mature, and old-growth Douglas-fir forests in Oregon and Washington. Wildl. Veg. Unmanaged-Douglas-Fir For. 1991, 1, 91–109. [Google Scholar]
  44. Lefsky, M.A.; Cohen, W.; Acker, S.; Parker, G.G.; Spies, T.; Harding, D. Lidar remote sensing of the canopy structure and biophysical properties of Douglas-fir western hemlock forests. Remote Sens. Environ. 1999, 70, 339–361. [Google Scholar] [CrossRef]
  45. Fekety, P.A.; Hudak, A.T.; Bright, B.C. Field Observations for “A Carbon Monitoring System for Mapping Regional, Annual Aboveground Biomass Across the Northwestern USA”; Forest Service Research Data Archive: Fort Collins, CO, USA, 2020. [CrossRef]
  46. Housman, I.; Campbell, L.; Goetz, W.; Finco, M.; Pugh, N.; Megown, K. US Forest Service Landscape Change Monitoring System Methods; U.S. Department of Agriculture, Forest Service, Geospatial Technology and Applications Center: Salt Lake City, UT, USA, 2021. [Google Scholar]
  47. Microsoft Team. Computer Generated Building Footprints for the United States. 2018. Available online: https://github.com/microsoft/USBuildingFootprints (accessed on 15 August 2024).
  48. Milne, B.T.; Cohen, W.B. Multiscale assessment of binary and continuous landcover variables for MODIS validation, mapping, and modeling applications. Remote Sens. Environ. 1999, 70, 82–98. [Google Scholar] [CrossRef]
  49. Hudak, A.T.; Lefsky, M.A.; Cohen, W.B.; Berterretche, M. Integration of lidar and Landsat ETM+ data for estimating and mapping forest canopy height. Remote Sens. Environ. 2002, 82, 397–416. [Google Scholar] [CrossRef]
  50. Johnston, J.D.; Kilbride, J.B.; Meigs, G.W.; Dunn, C.J.; Kennedy, R.E. Does conserving roadless wildland increase wildfire activity in western US national forests? Environ. Res. Lett. 2021, 16, 084040. [Google Scholar] [CrossRef]
  51. Flood, N. Seasonal composite Landsat TM/ETM images using the Medoid (a multi-dimensional median). Remote Sens. 2013, 5, 6481–6500. [Google Scholar] [CrossRef]
  52. Gorelick, N.; Hancher, M.; Dixon, M.; Ilyushchenko, S.; Thau, D.; Moore, R. Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sens. Environ. 2017, 202, 18–27. [Google Scholar] [CrossRef]
  53. Zhang, Y.; Ma, J.; Liang, S.; Li, X.; Liu, J. A stacking ensemble algorithm for improving the biases of forest aboveground biomass estimations from multiple remotely sensed datasets. GISci. Remote Sens. 2022, 59, 234–249. [Google Scholar] [CrossRef]
  54. Kennedy, R.E.; Yang, Z.; Braaten, J.; Copass, C.; Antonova, N.; Jordan, C.; Nelson, P. Attribution of disturbance change agent from Landsat time-series in support of habitat monitoring in the Puget Sound region, USA. Remote Sens. Environ. 2015, 166, 271–285. [Google Scholar] [CrossRef]
  55. Roberts-Pierel, B.M.; Kirchner, P.B.; Kilbride, J.B.; Kennedy, R.E. Changes over the Last 35 Years in Alaska’s Glaciated Landscape: A Novel Deep Learning Approach to Mapping Glaciers at Fine Temporal Granularity. Remote Sens. 2022, 14, 4582. [Google Scholar] [CrossRef]
  56. Hopkins, L.M.; Hallman, T.A.; Kilbride, J.; Robinson, W.D.; Hutchinson, R.A. A comparison of remotely sensed environmental predictors for avian distributions. Landsc. Ecol. 2022, 37, 997–1016. [Google Scholar] [CrossRef]
  57. Gesch, D.; Oimoen, M.; Greenlee, S.; Nelson, C.; Steuck, M.; Tyler, D. The national elevation dataset. Photogramm. Eng. Remote Sens. 2002, 68, 5–32. [Google Scholar]
  58. Key, C.H.; Benson, N.C. Landscape assessment (LA). In FIREMON: Fire Effects Monitoring and Inventory System. Gen. Tech. Rep. RMRS-GTR-164-CD; Lutes, D.C., Keane, R.E., Caratti, J.F., Key, C.H., Benson, N.C., Sutherland, S., Gangi, L.J., Eds.; US Department of Agriculture, Forest Service, Rocky Mountain Research Station: Fort Collins, CO, USA, 2006; Volume 164, p. LA-1-55. [Google Scholar]
  59. Crist, E.P.; Cicone, R.C. A Physically-Based Transformation of Thematic Mapper Data—The TM Tasseled Cap. IEEE Trans. Geosci. Remote Sens. 1984, GE-22, 256–263. [Google Scholar] [CrossRef]
  60. Rouse, J.; Haas, R.; Schell, J.; Deering, D. Monitoring vegetation systems in the Great Plains with ERTS. In Third Earth Resources Technology Satellite-1 Symposium. Volume 1: Technical Presentations, Section A; NASA Special Publication: Washington, DC, USA, 1974; Volume 19740022614, pp. 309–317. [Google Scholar]
  61. Wilson, E.H.; Sader, S.A. Detection of forest harvest type using multiple dates of Landsat TM imagery. Remote Sens. Environ. 2002, 80, 385–396. [Google Scholar] [CrossRef]
  62. Huete, A.; Didan, K.; Miura, T.; Rodriguez, E.P.; Gao, X.; Ferreira, L.G. Overview of the radiometric and biophysical performance of the MODIS vegetation indices. Remote Sens. Environ. 2002, 83, 195–213. [Google Scholar] [CrossRef]
  63. Farr, T.G.; Rosen, P.A.; Caro, E.; Crippen, R.; Duren, R.; Hensley, S.; Kobrick, M.; Paller, M.; Rodriguez, E.; Roth, L.; et al. The Shuttle Radar Topography Mission. Rev. Geophys. 2007, 45. [Google Scholar] [CrossRef]
  64. Hallman, T.A.; Robinson, W.D. Comparing multi-and single-scale species distribution and abundance models built with the boosted regression tree algorithm. Landsc. Ecol. 2020, 35, 1161–1174. [Google Scholar] [CrossRef]
  65. Cutler, M.; Boyd, D.; Foody, G.; Vetrivel, A. Estimating tropical forest biomass with a combination of SAR image texture and Landsat TM data: An assessment of predictions between regions. ISPRS J. Photogramm. Remote Sens. 2012, 70, 66–77. [Google Scholar] [CrossRef]
  66. Roberts, L.G. Machine Perception of Three-Dimensional Solids. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, 1963. [Google Scholar]
  67. Sobel, I.; Feldman, G. A 3 × 3 isotropic gradient operator for image processing. Presented at the Stanford Artificial Intelligence Project, Stanford, CA, USA, 1968; pp. 271–272. [Google Scholar]
  68. Prewitt, J.M. Object enhancement and extraction. Pict. Process. Psychopictorics 1970, 10, 15–19. [Google Scholar]
  69. Kirsch, R.A. Computer determination of the constituent structure of biological images. Comput. Biomed. Res. 1971, 4, 315–328. [Google Scholar] [CrossRef] [PubMed]
  70. Canny, J. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 1986, 8, 679–698. [Google Scholar] [CrossRef] [PubMed]
  71. Haralick, R.M.; Sternberg, S.R.; Zhuang, X. Image analysis using mathematical morphology. IEEE Trans. Pattern Anal. Mach. Intell. 1987, 9, 532–550. [Google Scholar] [CrossRef] [PubMed]
  72. Plaza, A.; Martinez, P.; Pérez, R.; Plaza, J. Spatial/spectral endmember extraction by multidimensional morphological operations. IEEE Trans. Geosci. Remote Sens. 2002, 40, 2025–2041. [Google Scholar] [CrossRef]
  73. Du, Y.; Chang, C.I.; Ren, H.; Chang, C.C.; Jensen, J.O.; D’Amico, F.M. New hyperspectral discrimination measure for spectral characterization. Opt. Eng. 2004, 43, 1777–1786. [Google Scholar]
  74. Ayrey, E.; Hayes, D.J.; Kilbride, J.B.; Fraver, S.; Kershaw, J.A.; Cook, B.D.; Weiskittel, A.R. Synthesizing Disparate LiDAR and Satellite Datasets through Deep Learning to Generate Wall-to-Wall Regional Inventories for the Complex, Mixed-Species Forests of the Eastern United States. Remote Sens. 2021, 13, 5113. [Google Scholar] [CrossRef]
  75. Pasquarella, V.J.; Kilbride, J.B. Not-so-random forests: Comparing voting and decision tree ensembles for characterizing partial harvest events in complex forested landscapes. Int. J. Appl. Earth Obs. Geoinf. 2023, 125, 103561. [Google Scholar]
  76. Arévalo, P.; Olofsson, P.; Woodcock, C.E. Continuous monitoring of land change activities and post-disturbance dynamics from Landsat time series: A test methodology for REDD+ reporting. Remote Sens. Environ. 2020, 238, 111051. [Google Scholar] [CrossRef]
  77. Arévalo, P.; Bullock, E.L.; Woodcock, C.E.; Olofsson, P. A suite of tools for continuous land change monitoring in google earth engine. Front. Clim. 2020, 2, 576740. [Google Scholar] [CrossRef]
  78. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
  79. Wu, J.; Chen, X.Y.; Zhang, H.; Xiong, L.D.; Lei, H.; Deng, S.H. Hyperparameter optimization for machine learning models based on Bayesian optimization. J. Electron. Sci. Technol. 2019, 17, 26–40. [Google Scholar]
  80. Frazier, P.I. A tutorial on Bayesian optimization. arXiv 2018, arXiv:1807.02811. [Google Scholar]
  81. Head, T.; MechCoder, G.L.; Shcherbatyi, I. scikit-optimize/scikit-optimize: v0.5.2, 2018. Available online: https://scikit-optimize.github.io/stable/whats_new/v0.5.html (accessed on 15 August 2024).
  82. Dunn, O.J. Multiple comparisons among means. J. Am. Stat. Assoc. 1961, 56, 52–64. [Google Scholar] [CrossRef]
  83. Liaw, A.; Wiener, M. Classification and regression by randomForest. R News 2002, 2, 18–22. [Google Scholar]
  84. Liu, Z.; Long, J.; Lin, H.; Sun, H.; Ye, Z.; Zhang, T.; Yang, P.; Ma, Y. Mapping and analyzing the spatiotemporal dynamics of forest aboveground biomass in the ChangZhuTan urban agglomeration using a time series of Landsat images and meteorological data from 2010 to 2020. Sci. Total Environ. 2024, 944, 173940. [Google Scholar] [CrossRef]
  85. Tarasiou, M.; Chavez, E.; Zafeiriou, S. Vits for sits: Vision transformers for satellite image time series. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 10418–10428. [Google Scholar]
  86. Yuan, Q.; Shen, H.; Li, T.; Li, Z.; Li, S.; Jiang, Y.; Xu, H.; Tan, W.; Yang, Q.; Wang, J.; et al. Deep learning in environmental remote sensing: Achievements and challenges. Remote Sens. Environ. 2020, 241, 111716. [Google Scholar] [CrossRef]
  87. Lang, N.; Jetz, W.; Schindler, K.; Wegner, J.D. A high-resolution canopy height model of the Earth. Nat. Ecol. Evol. 2023, 7, 1778–1789. [Google Scholar] [CrossRef]
  88. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Commun. Acm 2017, 60, 84–90. [Google Scholar] [CrossRef]
  89. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  90. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
  91. Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 2623–2631. [Google Scholar]
  92. Liaw, R.; Liang, E.; Nishihara, R.; Moritz, P.; Gonzalez, J.E.; Stoica, I. Tune: A Research Platform for Distributed Model Selection and Training. arXiv 2018, arXiv:1807.05118. [Google Scholar]
  93. Saarela, S.; Holm, S.; Healey, S.P.; Andersen, H.E.; Petersson, H.; Prentius, W.; Patterson, P.L.; Næsset, E.; Gregoire, T.G.; Ståhl, G. Generalized hierarchical model-based estimation for aboveground biomass assessment using GEDI and landsat data. Remote Sens. 2018, 10, 1832. [Google Scholar] [CrossRef]
  94. Chen, Q.; Han, R.; Ye, F.; Li, W. Spatio-temporal ecological models. Ecol. Inform. 2011, 6, 37–43. [Google Scholar] [CrossRef]
  95. Hooper, S.; Kennedy, R.E. A spatial ensemble approach for broad-area mapping of land surface properties. Remote Sens. Environ. 2018, 210, 473–489. [Google Scholar] [CrossRef]
  96. Finley, A.O.; Banerjee, S. Bayesian spatially varying coefficient models in the spBayes R package. Environ. Model. Softw. 2020, 125, 104608. [Google Scholar] [CrossRef]
  97. Wheeler, D.C.; Calder, C.A. An assessment of coefficient accuracy in linear regression models with spatially varying coefficients. J. Geogr. Syst. 2007, 9, 145–166. [Google Scholar] [CrossRef]
  98. Conners, R.W.; Trivedi, M.M.; Harlow, C.A. Segmentation of a high-resolution urban scene using texture operators. Comput. Vision, Graph. Image Process. 1984, 25, 273–310. [Google Scholar] [CrossRef]
Figure 1. The perimeters of the lidar AGB maps that were used as reference data in this analysis. The average forest AGB (Mg ha⁻¹) in each perimeter is depicted.
Figure 2. An illustration of the sampling and data partitioning scheme used to generate the modeling dataset. A 500 m buffer was placed around test set locations to exclude samples from the training and development sets. This mitigates the impact of spatial autocorrelation on our numerical experiments. Plots are superimposed over Landsat imagery (shortwave infrared-2, near-infrared, red reflectance; left panel) and true-color National Agricultural Imagery Program 1 m imagery (right panel).
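
The buffered exclusion described in Figure 2 can be implemented with standard geospatial tooling. The sketch below is illustrative only and is not the code used to build the dataset; it assumes hypothetical GeoDataFrames `train_pts` and `test_pts` of plot locations in a projected, meter-based coordinate reference system.

```python
# Illustrative sketch: drop training/development samples within 500 m of any test location.
import geopandas as gpd

def exclude_near_test(train_pts: gpd.GeoDataFrame,
                      test_pts: gpd.GeoDataFrame,
                      buffer_m: float = 500.0) -> gpd.GeoDataFrame:
    # Union of buffers around every test-set location (units follow the projected CRS).
    exclusion_zone = test_pts.buffer(buffer_m).unary_union
    # Keep only samples that fall outside the exclusion zone.
    keep = ~train_pts.geometry.intersects(exclusion_zone)
    return train_pts[keep]
```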
Figure 3. An overview of the image processing and feature engineering workflow used in this analysis.
Figure 4. RMSE distributions for the RF models developed in experiment 1.
Figure 5. Predicted vs. observed AGB values from the second experiment comparing the AGB predictions generated by Random Forest models over the testing set. Models were produced using (A) the baseline features, (B) the baseline and spatial features, (C) the baseline and temporal features, and (D) the baseline, spatial, and temporal features. The relationships are summarized using an ordinary least squares regression fit (red line). The black dashed line is the one-to-one line.
Figure 6. The location of the four 15 km² subsets (red squares) that were selected to visualize the outputs from the AGB models developed in experiment 2. The subsets are located in (A) the Coast Range in Oregon, (B) Eastern Oregon, (C) North Central Washington, and (D) Central Idaho.
Figure 7. The reference lidar AGB map and the spatial residuals from each of the four models applied to a 15 km² area in the Oregon Coast Range. Red colors indicate that the model overestimated the lidar AGB density. Blue indicates that the model underestimated the lidar AGB density.
Figure 8. The reference lidar AGB map and the spatial residuals from each of the four models applied to a 15 km² area in Eastern Oregon. Red colors indicate that the model overestimated the lidar AGB density. Blue indicates that the model underestimated the lidar AGB density.
Figure 9. The reference lidar AGB map and the spatial residuals from each of the four models applied to a 15 km² area in Central Idaho. Red colors indicate that the model overestimated the lidar AGB density. Blue indicates that the model underestimated the lidar AGB density.
Figure 10. The reference lidar AGB map and the spatial residuals from each of the four models applied to a 15 km² area in North Central Washington. Red colors indicate that the model overestimated the lidar AGB density. Blue indicates that the model underestimated the lidar AGB density.
Table 1. The Random Forest hyperparameter space used for all instances of Bayesian optimization. We note that the max_features range corresponds to the percentage of features randomly subselected when identifying a split at a node.

Parameter | Parameter Type | Search Range: Minimum | Search Range: Maximum
max_depth | Integer | 1 | 100
min_samples_split | Integer | 2 | 150
min_samples_leaf | Integer | 1 | 100
max_features | Real | 0.25 | 1
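
As an illustration of how the search space in Table 1 could be expressed, the sketch below uses scikit-optimize [81] with scikit-learn [78]. The optimization budget, cross-validation strategy, scoring metric, and number of trees shown here are illustrative assumptions rather than values taken from this study.

```python
# Hedged sketch of a Bayesian hyperparameter search over the ranges in Table 1.
from sklearn.ensemble import RandomForestRegressor
from skopt import BayesSearchCV
from skopt.space import Integer, Real

search_space = {
    "max_depth": Integer(1, 100),
    "min_samples_split": Integer(2, 150),
    "min_samples_leaf": Integer(1, 100),
    "max_features": Real(0.25, 1.0),  # fraction of features considered at each split
}

opt = BayesSearchCV(
    estimator=RandomForestRegressor(n_estimators=500, n_jobs=-1),  # n_estimators assumed
    search_spaces=search_space,
    n_iter=50,                 # illustrative optimization budget
    cv=3,                      # illustrative cross-validation setup
    scoring="neg_root_mean_squared_error",
    random_state=42,
)
# opt.fit(X_train, y_train)   # X_train, y_train: predictor matrix and lidar AGB targets
```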
Table 2. Summary statistics characterizing the performance of the RF AGB models developed in experiment 1. The mean and standard deviation of the error statistics were derived from 250 RF models fitted to stratified subsets of the combined training and development sets and were then evaluated with an independent testing set.

Feature Group | Num. Features | R² | RMSE | MAE | Bias
Baseline | 19 | 0.7 (0.0) | 122.46 (0.16) | 89.89 (0.13) | 1.51 (0.2)
Buffer | 29 | 0.74 (0.0) | 113.94 (0.14) | 83.41 (0.12) | 3.81 (0.19)
GLCM | 77 | 0.74 (0.0) | 113.29 (0.16) | 82.92 (0.12) | 3.71 (0.18)
Edge detector | 44 | 0.73 (0.0) | 116.75 (0.15) | 85.8 (0.12) | 2.88 (0.18)
Morphological | 89 | 0.73 (0.0) | 116.01 (0.12) | 85.31 (0.11) | 3.13 (0.17)
NS | 77 | 0.73 (0.0) | 117.04 (0.13) | 86.42 (0.11) | 4.08 (0.18)
NV_3×3 | 32 | 0.73 (0.0) | 115.85 (0.14) | 85.3 (0.13) | 4.36 (0.19)
NV_5×5 | 80 | 0.73 (0.0) | 115.67 (0.13) | 85.15 (0.11) | 3.99 (0.16)
NV_7×7 | 152 | 0.73 (0.0) | 115.77 (0.13) | 85.33 (0.11) | 4.45 (0.17)
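
For clarity, the summary statistics reported in Tables 2 and 4 can be computed per model run as sketched below. The sign convention for bias (predicted minus observed) is assumed here and is not restated from the paper.

```python
# Minimal sketch of per-model error statistics for predicted vs. observed AGB (Mg ha-1).
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def agb_error_summary(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return {
        "R2": r2_score(y_true, y_pred),
        "RMSE": float(np.sqrt(mean_squared_error(y_true, y_pred))),
        "MAE": mean_absolute_error(y_true, y_pred),
        "Bias": float(np.mean(y_pred - y_true)),  # assumed convention: predicted - observed
    }
```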
Table 3. The 25 most important features, ranked by permuted feature importance, for the baseline model as well as the GLCM and buffer (BUFF) models. The number at the end of the GLCM and BUFF feature name indicates the size of the moving window used to derive the feature.

Ranking | Baseline | GLCM | BUFF
1 | TOPO_ASPECT_COS | TOPO_ASPECT_COS | TOPO_ASPECT_COS
2 | TOPO_ASPECT_SIN | TOPO_ASPECT_SIN | TOPO_ASPECT_SIN
3 | TOPO_HILLSHADE | TOPO_HILLSHADE | TOPO_HILLSHADE
4 | TOPO_SLOPE | TOPO_SLOPE | TOPO_SLOPE
5 | TOPO_ELEVATION | TOPO_ELEVATION | TOPO_ELEVATION
6 | SPEC_TCW | GLCM_TCW_var_7 | BUFF_TCW_stdDev_7
7 | SPEC_TCG | GLCM_TCW_var_3 | BUFF_TCW_stdDev_3
8 | SPEC_TCB | GLCM_TCW_var_11 | BUFF_TCW_stdDev_15
9 | SPEC_TCA | GLCM_TCW_savg_7 | BUFF_TCW_stdDev_11
10 | SPEC_NDVI | GLCM_TCW_savg_3 | BUFF_TCW_mean_7
11 | SPEC_NDMI | GLCM_TCW_savg_11 | BUFF_TCW_mean_3
12 | SPEC_NBR | GLCM_TCW_inertia_7 | BUFF_TCW_mean_15
13 | SPEC_EVI | GLCM_TCW_inertia_3 | BUFF_TCW_mean_11
14 | SPEC_B7 | GLCM_TCW_inertia_11 | BUFF_TCG_stdDev_7
15 | SPEC_B5 | GLCM_TCW_idm_7 | BUFF_TCG_stdDev_3
16 | SPEC_B4 | GLCM_TCW_idm_3 | BUFF_TCG_stdDev_15
17 | SPEC_B3 | GLCM_TCW_idm_11 | BUFF_TCG_stdDev_11
18 | SPEC_B2 | GLCM_TCW_ent_7 | BUFF_TCG_mean_7
19 | SPEC_B1 | GLCM_TCW_ent_3 | BUFF_TCG_mean_3
20 | - | GLCM_TCW_ent_11 | BUFF_TCG_mean_15
21 | - | GLCM_TCW_corr_7 | BUFF_TCG_mean_11
22 | - | GLCM_TCW_corr_3 | BUFF_TCB_stdDev_7
23 | - | GLCM_TCW_corr_11 | BUFF_TCB_stdDev_3
24 | - | GLCM_TCW_contrast_7 | BUFF_TCB_stdDev_15
25 | - | GLCM_TCW_contrast_3 | BUFF_TCB_stdDev_11
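
The permutation-based ranking in Table 3 can be approximated with scikit-learn's permutation_importance utility, as sketched below; the number of repeats and the scoring metric are assumptions, and the authors' exact procedure may differ.

```python
# Illustrative sketch: rank predictors by permutation importance on a held-out set.
import numpy as np
from sklearn.inspection import permutation_importance

def rank_features(fitted_model, X_test, y_test, feature_names, n_repeats=10):
    result = permutation_importance(
        fitted_model, X_test, y_test,
        n_repeats=n_repeats,                       # assumed number of shuffles per feature
        scoring="neg_root_mean_squared_error",     # assumed scoring metric
        random_state=42,
    )
    order = np.argsort(result.importances_mean)[::-1]  # largest mean importance first
    return [(feature_names[i], float(result.importances_mean[i])) for i in order]
```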
Table 4. Summary statistics characterizing the performance of the RF AGB models developed in experiment 2. The mean and standard deviation of the error statistics were derived from 250 RF models fitted to stratified subsets of the combined training and development sets and were then evaluated with the independent test sets. Here, the baseline model is the same as in experiment 1.

Feature Group | Num. Features | R² | RMSE | MAE | Bias
Baseline | 19 | 0.7 (0.0) | 122.46 (0.16) | 89.89 (0.13) | 1.51 (0.2)
Baseline + Spatial | 292 | 0.75 (0.0) | 112.52 (0.13) | 82.18 (0.1) | 2.51 (0.16)
Baseline + Temporal | 171 | 0.79 (0.0) | 104.07 (0.13) | 74.66 (0.1) | 3.73 (0.15)
Baseline + All | 444 | 0.8 (0.0) | 100.78 (0.11) | 72.15 (0.09) | 3.36 (0.15)
Table 5. A summary of the AGB model improvement, as assessed by RMSE and R², of the tuned models from experiments 1 and 2 with respect to the RF models developed using default parameters. An asterisk indicates that the comparison was significant (α = 0.001).

Feature Group | RMSE Change | R² Change
Baseline | −1.75 to −1.7 * | 0.01 *
Buffer | −2.89 to −2.85 * | 0.01 *
GLCM | −1.66 to −1.61 * | 0.01 *
Edge detectors | −3.91 to −3.87 * | 0.02 *
Morphological | −0.69 to −0.64 * | 0.0
NS | −4.98 to −4.94 * | 0.02 *
NV_3×3 | −2.03 to −1.98 * | 0.01 *
NV_5×5 | −5.27 to −5.23 * | 0.02 *
NV_7×7 | −2.8 to −2.76 * | 0.01 *
Baseline + Spatial | −2.81 to −2.76 * | 0.01 *
Baseline + Temporal | −2.04 to −2.0 * | 0.01 *
Baseline + Spatial + Temporal | −3.41 to −3.37 * | 0.02 *
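
One plausible way to produce interval estimates of the RMSE change, in the spirit of Table 5, is a paired comparison of tuned versus default model runs with a conservative multiple-comparison adjustment [82]. The sketch below is an assumption-laden illustration, not the authors' documented test; the pairing of runs, the interval construction, and the number of comparisons are all assumptions.

```python
# Hedged sketch: confidence interval on the paired mean RMSE difference (tuned - default)
# across repeated fits, with a Bonferroni-style adjustment of alpha across comparisons.
import numpy as np
from scipy import stats

def paired_rmse_change(rmse_tuned, rmse_default, alpha=0.001, n_comparisons=12):
    diffs = np.asarray(rmse_tuned, dtype=float) - np.asarray(rmse_default, dtype=float)
    mean, sem = diffs.mean(), stats.sem(diffs)
    adj_alpha = alpha / n_comparisons                     # conservative adjustment (assumed)
    lo, hi = stats.t.interval(1 - adj_alpha, df=len(diffs) - 1, loc=mean, scale=sem)
    significant = not (lo <= 0.0 <= hi)                   # interval excludes zero change
    return (lo, hi), significant
```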