Combining ALS and Satellite Data to Develop High-Resolution Forest Growth Potential Maps for Plantation Stands in Western Canada

Khalifeh Soltanian, Faezeh; Henrique Terezan, Luiz; Chisholm, Colin E.; Dykstra, Pamela; MacKenzie, William H.; Elkin, Che

doi:10.3390/rs18030406

by Faezeh Khalifeh Soltanian ¹,*, Luiz Henrique Terezan ¹, Colin E. Chisholm ², Pamela Dykstra ³,†, William H. MacKenzie ⁴ and Che Elkin ¹

¹ Department of Forest Ecology and Management, University of Northern British Columbia, 3333 University Way, Prince George, BC V2N 4Z9, Canada
² Aleza Lake Research Forest, 3333 University Way, Prince George, BC V2N 4M8, Canada
³ Ministry of Forests, P.O. Box 9513, Victoria, BC V8W 9C2, Canada
⁴ Ministry of Forests, BAG 6000, Smithers, BC V0J 2N0, Canada
* Author to whom correspondence should be addressed.
† Current address: Rockland Canada Management Corporation, 2–1320 Rockland Avenue, Victoria, BC V8S 1V6, Canada.

Remote Sens. 2026, 18(3), 406; https://doi.org/10.3390/rs18030406

Submission received: 6 November 2025 / Revised: 29 December 2025 / Accepted: 16 January 2026 / Published: 26 January 2026

(This article belongs to the Topic Forest Productivity, Carbon Dynamics and Eco-Environmental Response: Potential, Development and Challenges)

Highlights

What are the main findings?

Integrating ALS-terrain metrics with Sentinel-2 spectral indices enables accurate high-resolution, age-independent mapping of forest growth potential in plantation stands in Western Canada.
Developing a single generalized model provided Site Index predictions comparable to site-specific models, demonstrating strong transferability and enabling reliable productivity mapping across diverse regions without building separate models for each site.

What are the implications of the main findings?

This approach enables more precise and efficient forest management, allowing managers to identify high- and low-productivity zones and plan silviculture and harvesting more effectively.
The model can be applied across different regions with minimal loss in accuracy, supporting large-scale forest productivity assessment and climate-adaptive planning without requiring separate models for each area.

Abstract

Mapping forest growth potential across varying environments is challenging, especially when field measurements are limited. In this study, we integrated Airborne Laser Scanning (ALS) terrain derivatives and Sentinel-2 spectral indices to model Site Index (SI) in forest plantations at 10 m spatial resolution across three ecologically distinct regions in British Columbia (Aleza Lake, Deception, and Eagle Hills). Random Forest regression models were calibrated using field-measured SI and a multistep variable-selection procedure that included Variance Inflation Factor (VIF) screening followed by model-based variable importance assessment. Model performance was evaluated using repeated 10-fold cross-validation. The combined ALS–Sentinel-2 models substantially outperformed single-source models, yielding cross-validated R² values of 0.63, 0.44, and 0.56 for Aleza Lake, Deception, and Eagle Hills, respectively, compared with R² values of 0.40, 0.40, and 0.46 for ALS-only models. Key predictors consistently included terrain metrics, such as the Topographic Position Index (TPI) and the Topographic Wetness Index (TWI), along with satellite-derived chlorophyll-sensitive indices including S2REP (Sentinel-2 Red-Edge Position), MTCI (MERIS Terrestrial Chlorophyll Index), and GNDVI (Green Normalized Difference Vegetation Index). A general model using predictors common to all regions performed comparably (R² = 0.63, 0.41, and 0.52, respectively), demonstrating the transferability and operational potential of the approach. These findings show that integrating ALS-derived terrain metrics with Sentinel-2 spectral indices provides a robust, age-independent framework for capturing spatial variability in forest productivity across landscapes. This multi-sensor fusion approach improves on traditional SI methods and single-sensor models, providing a scalable and operational tool for forest management and long-term planning under changing environmental conditions.

Keywords: site index; remote sensing; random forest; forest productivity; ALS terrain; Sentinel-2

1. Introduction

The long-term sustainability of forest ecosystems and the services they provide are being impacted by changing socio-economic demands [1], shifting ecological conditions, and challenges associated with climate change [2,3]. Improving our ability to manage forest ecosystems effectively will be contingent on having good forest data and a good understanding of the main processes that influence forest development. Recent advancements in remote sensing have the potential to significantly contribute to this goal, including enhancing our ability to measure and understand the biological restrictions limiting tree and forest growth [4,5,6], refining the application of silvicultural techniques [7], and improving our ability to better match our forestry goals and management operations with our natural landscapes [8].

Accurate information on forest growth and yield is essential for developing optimal forest management strategies, including determining when, where, and how operations should be implemented [9,10]. Such information is also critical for forecasting forest development under various management regimes and future climate scenarios [11]. Site index (SI), defined as tree height at a breast-height age of 50 years [12], plays a pivotal role in forest estate planning models, serving as a fundamental input variable for predicting stand development, determining rotation age, and optimizing long-term management and harvest schedules [13,14]. It is a key indicator of potential forest productivity and is commonly used to project future wood volume and biomass accumulation over time [15,16,17]. SI is influenced by multiple interacting factors, including tree species and provenance, climatic conditions, soil and topographic characteristics, and stand age. It therefore represents an integrated indicator of site quality that reflects the combined effects of biotic and abiotic drivers. Beyond informing yield predictions, SI can also serve as an indicator of underlying ecosystem processes such as soil moisture dynamics, nutrient cycling, and successional change, which are vital for maintaining ecosystem services [18,19]. Accurate estimates of SI are thus essential for evaluating species- and site-specific productivity and form the foundation of effective forest management and productivity assessment [20,21,22].

Due to variation in local site conditions, the accuracy of site index models and projections tends to be low at large landscape scales [23,24]. Significant differences may exist between the estimated site indices of the same tree species under similar climates but different site conditions [25,26,27]. To account for local conditions, site index models often incorporate topographic features such as elevation [28,29,30], slope and aspect [31,32,33], and geographic position [34,35], or alternatively represent local conditions through climate- and site-based ecological units (i.e., PEM).

Traditionally, site index (SI) models have relied on field measurements of tree height and age [20]. While foundational for assessing forest productivity, these methods present notable challenges. Mapping forest growth potential solely through field measurements can be cost-prohibitive, operationally impractical, and time-intensive [22,36,37]. Additionally, traditional approaches struggle to capture fine-scale spatial variability, particularly in mixed-species or heterogeneous landscapes [38,39]. Errors in tree age estimation further exacerbate inaccuracies in SI predictions [10]. Finally, SI is dependent on precise stand age information, which is difficult to acquire in stands that are remote, uneven-aged, or of mixed species. Where stand age information is of low quality, from remote or complex stands, traditional methods of SI estimation may not be practical or valid.

Remote sensing technologies have emerged as tools that can potentially address these limitations. Satellite imagery and Airborne Laser Scanning (ALS) provide high-resolution, spatially comprehensive forest data, significantly reducing reliance on labor-intensive field surveys [40]. Satellite remote sensing has been proposed as a cost-effective means of assessing productivity, age, and spatial patterns of large-scale forest ecosystems [41,42]. Meanwhile, ALS captures three-dimensional representations of forest structure, increasing accuracy in measuring attributes such as canopy height and biomass [10,43]. Advancements in machine learning have also enhanced our ability to use remote sensing data, allowing for age-independent SI estimation and improved understanding of growth potential under varying environmental conditions [44,45].

Remote sensing methods have been used to evaluate vegetation attributes, including Leaf Area Index (LAI), biomass, and Photosynthetically Active Radiation (PAR) [46]. One study demonstrated the use of MODIS-derived Enhanced Vegetation Index (EVI) to map the SI of Douglas-fir over 630,000 km² in the U.S. Pacific Northwest, highlighting the scalability of remote sensing techniques [47]. Similarly, terrain attributes derived from Digital Elevation Models (DEMs), such as the Topographic Wetness Index (TWI) and Topographic Position Index (TPI) offer robust predictors for site productivity [48,49]. It is also understood that various predictive metrics (e.g., TWI or red-edge indices) correlate well with ecological factors such as water availability or vegetation health and can provide insight into key ecological processes related to productivity. For example, the TWI derived from the DEM is directly linked to soil moisture retention, an essential factor governing nutrient uptake and growth dynamics in forest stands [50,51].

High-resolution, region-wide forest productivity maps are essential for forest planning. However, in many areas, these maps are still unavailable. Here, we evaluate whether combining ALS-derived terrain layers with satellite data can generate accurate estimates of Site Index (SI) using an age-independent approach. Our project addresses the following questions:

Does the joint inclusion of ALS terrain-derived and satellite-based attributes (the local optimized model) improve the accuracy of SI models compared to using ALS terrain-derived data alone in plantation stands?
Are the developed SI models transferable and applicable across three ecologically distinct case study areas with varying growth-limiting factors?
Can a generalized model provide reliable SI estimates at broad landscape scales?
What environmental and spectral variables are most influential in estimating SI, and do they differ across ecological contexts?

2. Materials and Methods

2.1. Study Sites

Three ecologically distinct sites across the interior of British Columbia, Canada, were selected to represent the province’s major climatic and productivity gradients: Aleza Lake Research Forest (ALRF), Deception Lake, and Eagle Hills (Figure 1). Forest ecosystems in British Columbia are classified using the Biogeoclimatic Ecosystem Classification (BEC) system [52], which delineates zones and subzones according to climate, physiography, and climax vegetation.

Aleza Lake Research Forest (ALRF) is situated ≈60 km northeast of Prince George, within the Sub-Boreal Spruce zone (SBSwk1, Wet Cool Willow variant). This region is characterized by a cool, moist continental climate with a mean annual precipitation of roughly 719 mm and frequent snowfall. Forest composition is dominated by hybrid white spruce (Picea glauca × engelmannii) and subalpine fir (Abies lasiocarpa), with lodgepole pine (Pinus contorta var. latifolia) and Douglas-fir (Pseudotsuga menziesii var. glauca) occurring on drier microsites and black spruce (Picea mariana) common in wet depressions.

Deception Lake lies approximately 45 km southeast of Smithers in northwestern British Columbia and encompasses both the Engelmann Spruce–Subalpine Fir (ESSFmc, ESSFmcw) and Sub-Boreal Spruce (SBSmc2 subzone) zones, with minor representation of SBSdk subzone. Elevations range from 890 to 1160 m. The climate is moist and cold, characterized by steep topographic relief and annual precipitation ranging from 574 to 1000 mm. Forests are dominated by Engelmann spruce (Picea engelmannii) and subalpine fir, with seral stands of lodgepole pine. Younger stands often contain trembling aspen (Populus tremuloides) and paper birch (Betula papyrifera).

Eagle Hills, located around 70 km northwest of Kamloops in the southern interior, covers approximately 9500 ha within the Interior Douglas-fir (IDFdk1—Dry Cool Thompson variant) zone at higher elevations (1200 m) and the IDFxh2 (Very Dry Hot Thompson) variant at lower elevations. The area experiences a warm, dry climate with a mean annual precipitation of 433 mm. Forests are dominated by uneven-aged Douglas-fir, with lodgepole pine occupying upper slopes; ponderosa pine (Pinus ponderosa) is present in adjacent PPxh2 stands but was excluded from analysis.

2.2. Empirical Field Data

Within each of the three study areas, approximately 100 individual site index plots were established in forest plantations ranging in age from 10 to 30 years. Sampling followed a stratified random design based on three factors: forest age, distance from main roads, and slope. Age polygons and road network layers were rasterized at a 10 m spatial resolution to identify accessible plantation areas (<500 m from main roads) within the target age range. Within each region, 20 nested sampling locations were randomly distributed, each containing five 100 m² circular plots (one central and four peripheral plots positioned 25 m apart) (Figure 2). This configuration resulted in approximately 100 individual plots per study area. All plots were located at least 25 m apart to minimize spatial autocorrelation.

Field data were collected in the summer of 2019. Within each 5.64 m fixed-radius SI plot, we selected the dominant tree based on height. The specific method for identifying the tallest tree with good growth and without growth-form defects is described in the Growth Intercept Method for Silviculture Surveys handbook [53]. For each selected tree, we recorded species, diameter at breast height (DBH), and tree position with a high-accuracy Global Navigation Satellite System (GNSS) receiver (model iSXblue II+ GNSS). The receiver, operating in standard mode, provided a positional accuracy of ±1 to 2 m, which was sufficient for aligning tree locations with the 5.64 m plot boundaries and integrating with Sentinel-2 data at a 10 m pixel resolution. Data collection was conducted under clear sky conditions to optimize signal quality and minimize errors. The site index for each tree was determined using the software Site Tools [54], which is specific to British Columbia forests. Site Tools calculates site index at 50 years of breast-height age from species, breast-height tree age, and tree height above breast height. Because most of the plantations used in our study were planted primarily to spruce, SI values from lodgepole pine plantations were converted to the equivalent white spruce site index using the conversion equations included in Site Tools, so that site index estimates were consistent and comparable.
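The 5.64 m plot dimension corresponds to the radius of a circle with the 100 m² plot area given in the sampling design. A minimal arithmetic check is sketched below; the function name `plot_radius_m` is an illustrative choice and not part of the study's workflow.

```python
import math

def plot_radius_m(area_m2: float) -> float:
    """Radius of a circular plot with the given area (r = sqrt(A / pi))."""
    return math.sqrt(area_m2 / math.pi)

radius = plot_radius_m(100.0)   # 100 m² circular SI plot
print(f"{radius:.2f} m")        # 5.64 m
```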

2.3. Remote Sensing Data and Processing

Airborne Laser Scanning (ALS) data were collected in 2016 at a point density of 10–15 points per square meter. Sixteen terrain variables were calculated from the ALS-derived high-resolution digital elevation model (DEM) using R (version 4.2.1) [55] and SAGA packages [56]: Aspect, Convergence Index, Diurnal Anisotropic Heating, Filled DEM, General Curvature, Multiresolution Index of the Ridge Top Flatness (MRRTF), Multiresolution Index of Valley Bottom Flatness (MRVBF), Topographic Openness Dominance, Topographic Openness Enclosure, Overland Flow Horizontal Distance, Overland Flow Vertical Distance, Slope, Total Curvature, Terrain Ruggedness Index, Topographic Position Index, and Topographic Wetness Index (Table A1). Because these terrain derivatives are all computed from the same DEM and many are mathematically related, multicollinearity among the predictors was expected. Therefore, for each study area, the full terrain-derivative stack was evaluated using the Variance Inflation Factor (VIF). Variables with VIF ≥ 10 were iteratively removed, and only those with acceptable collinearity levels (VIF < 10) were retained. Although the ALS data were acquired in 2016, they were used solely to derive static terrain attributes (e.g., DEM, slope, curvature, TPI), which exhibit minimal change over multi-year timescales. Therefore, the 2016 ALS dataset provides an accurate representation of the terrain conditions relevant to our 2019 field and Sentinel-2 observations.
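The iterative VIF screening can be sketched as follows. This is an illustrative standard-library Python version, not the R code used in the study. It assumes the common definition in which the VIF of predictor j equals the j-th diagonal element of the inverse correlation matrix (equivalently 1/(1 − R²) from regressing predictor j on the others), and it drops the predictor with the largest VIF at each pass. The layer names and values are made up for the example.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    n = len(columns[0])
    centered = [[v - sum(c) / n for v in c] for c in columns]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
             for j in range(k)] for i in range(k)]

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (intended for small matrices)."""
    k = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def vif(columns):
    """VIF per column, taken from the diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[i][i] for i in range(len(columns))]

def vif_filter(data, threshold=10.0):
    """Drop the highest-VIF predictor until every remaining VIF is below threshold."""
    kept = dict(data)
    while len(kept) > 1:
        names = list(kept)
        scores = vif([kept[name] for name in names])
        worst = max(range(len(names)), key=lambda i: scores[i])
        if scores[worst] < threshold:
            break
        del kept[names[worst]]
    return list(kept)

# Hypothetical plot-level values: "tri" is almost a copy of "slope", "twi" is not.
terrain = {
    "slope": [2, 5, 9, 14, 20, 27, 33, 40],
    "tri":   [2.1, 4.9, 9.2, 13.8, 20.1, 27.2, 32.9, 40.1],
    "twi":   [8, 3, 9, 4, 7, 2, 6, 5],
}
kept = vif_filter(terrain)
print(kept)  # one of the two near-duplicate layers is dropped; "twi" is kept
```

The sketch does not handle constant or perfectly collinear layers, which would make the correlation matrix undefined or singular.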

We used Sentinel-2A and Sentinel-2B Level-1C Top-Of-Atmosphere (L1C-TOA) reflectance products acquired between 1 June and 30 August 2019 with less than 3 percent cloud cover for our three study sites. The images are freely available at the Copernicus Open Access Hub (https://dataspace.copernicus.eu/ accessed on 1 June 2021). The Sen2Cor processor and the Sentinel-2 SNAP toolbox were employed to correct for atmospheric, terrain, and cirrus cloud effects in the L1C-TOA products, converting them to L2A surface reflectance products (Figure A1). We stacked six images and computed a per-pixel maximum value composite. Following this processing, Spectral Vegetation Indices (SVIs) were computed from the Sentinel-2 spectral bands of the L2A products at 10 m resolution: CLre, EVI7, EVI8, EVI, GNDVI, IRECI, NDVI, NDVI45, NDVI65, MSR, MTCI, S2REP, and WDRVI (Figure A1). These SVIs estimate vegetation biochemical and biophysical properties such as chlorophyll content, leaf area index (LAI), biomass, and volume [57,58,59,60]. We grouped these indices into three categories based on their application, formula structure, spectrum range, and the number and type of bands used (Table A2). Because these layers are known to be strongly correlated [61], we used a two-step variable selection procedure. First, we calculated the Variance Inflation Factor (VIF) and retained variables with VIF values below 10 to minimize multicollinearity. Second, the retained indices were ranked by Random Forest permutation importance, as described in Section 2.4.1.
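To make the index step concrete, the sketch below shows a per-pixel maximum value composite and four of the listed indices in their commonly published forms for Sentinel-2 bands (B3 green, B4 red, B5 to B7 red edge, B8 near-infrared). It is an assumption that these forms match the definitions in Table A2; the function names and reflectance values are illustrative only.

```python
def normalized_difference(a: float, b: float) -> float:
    """Generic (a - b) / (a + b) index; returns 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total else 0.0

def ndvi(b8: float, b4: float) -> float:
    """NDVI from near-infrared (B8) and red (B4) reflectance."""
    return normalized_difference(b8, b4)

def gndvi(b8: float, b3: float) -> float:
    """GNDVI from near-infrared (B8) and green (B3) reflectance."""
    return normalized_difference(b8, b3)

def mtci(b4: float, b5: float, b6: float) -> float:
    """MTCI in its usual Sentinel-2 form: (B6 - B5) / (B5 - B4)."""
    return (b6 - b5) / (b5 - b4)

def s2rep(b4: float, b5: float, b6: float, b7: float) -> float:
    """Red-edge position in nm: 705 + 35 * ((B4 + B7) / 2 - B5) / (B6 - B5)."""
    return 705.0 + 35.0 * (((b4 + b7) / 2.0) - b5) / (b6 - b5)

def max_value_composite(stack):
    """Per-pixel maximum across a list of equally sized 2-D grids (one per date)."""
    return [[max(values) for values in zip(*rows)] for rows in zip(*stack)]

# Two hypothetical acquisition dates of a 2 x 2 layer.
date_1 = [[0.2, 0.5], [0.7, 0.1]]
date_2 = [[0.4, 0.3], [0.6, 0.9]]
composite = max_value_composite([date_1, date_2])
print(composite)  # [[0.4, 0.5], [0.7, 0.9]]
print(round(s2rep(0.04, 0.10, 0.30, 0.40), 1))  # 726.0
```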

2.4. Development of the Local Optimal SI Models

Field-derived Site Index (SI) values served as the response variable in all modelling procedures (Figure 3). To ensure spatial alignment between empirical SI measurements and environmental predictors, all satellite-derived and LiDAR terrain-derived rasters that remained after the VIF screening were harmonized in QGIS. This preprocessing ensured that every raster layer within each region shared a common spatial reference system (UTM), spatial resolution (10 m), and spatial extent, thereby preventing misalignment during multi-predictor modelling. The harmonized rasters were then stacked and exported to R for plot-level predictor extraction.

Predictor values were retrieved for each empirical SI plot using a fixed buffer with a radius of 5.64 m, which corresponded to the field plot area. The extract() function in R was used to compute the mean pixel value for each predictor within the buffer, generating a spatially representative predictor matrix. Averaging across the buffer reduced the influence of GNSS positional uncertainty and subpixel reflectance variability. The extracted predictor values were subsequently paired with field-measured SI to form the modelling dataset for each study area. Predictor ranking was based on permutation importance (%IncMSE) calculated from the Random Forest out-of-bag error structure. Variables with the highest %IncMSE values were considered the most influential and were sequentially introduced into the model-building procedure.

The randomForest package [62,63] was used to create RF regression models in R (version 4.2.1) using caret [64]. Regression forests typically consider a set number of candidate variables at each split, $mtry = \sqrt{p}$, where $p$ represents the number of predictors in the model. All other RF hyperparameters (ntree = 500, nodesize = 5) were left at default to balance model stability and computational efficiency. A fixed random seed (set.seed(7)) was applied to ensure reproducibility of results across all modelling steps. Model fitting, hyperparameter control, and repeated cross-validation were conducted using the train(method = "rf") function in caret, which ensured consistent resampling and performance evaluation. Out-of-bag (OOB) error estimates were also examined to validate model stability and prevent overfitting.
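The resampling scheme can be illustrated with a small index generator for repeated k-fold cross-validation. This is a standard-library Python sketch rather than the caret implementation. The number of repeats is not stated in the text, so `repeats=3` is purely an assumption, and the seed value only mirrors the one quoted above (Python's generator will not reproduce R's random stream).

```python
import random

def repeated_kfold(n_samples, k=10, repeats=3, seed=7):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for rep in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)                         # new random partition per repeat
        folds = [order[i::k] for i in range(k)]    # k roughly equal folds
        for f, test_idx in enumerate(folds):
            train_idx = [i for j, fold in enumerate(folds) if j != f for i in fold]
            yield rep, f, train_idx, test_idx

# Roughly 100 plots per study area, as described in Section 2.2.
splits = list(repeated_kfold(100))
print(len(splits))                                # 30 train/test splits
print(len(splits[0][2]), len(splits[0][3]))       # 90 10
```

Each plot is held out exactly once per repeat, so performance metrics can be averaged over all folds and repeats.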

A structured two-stage modelling strategy was then applied independently to each study area to identify the most parsimonious and ecologically meaningful predictor set.

2.4.1. Satellite-Only Baseline Modelling and Selection of the Most Important Key Spectral Predictors

After VIF reduction, the remaining satellite-derived vegetation indices (typically 6–8 per region) were used to fit a satellite-only Random Forest model. Variable importance was quantified using permutation importance (%IncMSE) derived from the out-of-bag (OOB) error structure. Because vegetation indices frequently exhibit strong multicollinearity—given that many capture overlapping information on canopy greenness, pigment concentration, leaf area, or red-edge biochemistry—retaining a large number of spectral predictors can destabilize RF models, inflate variance, and reduce interpretability. To maintain model parsimony and avoid overfitting, only the four highest-ranked indices based on %IncMSE were retained.

To further avoid ecological redundancy, indices were grouped according to spectral domain (visible, NIR, red edge), formula structure, and expected biophysical sensitivity (Table A2). From the ranked and grouped indices, four predictors were selected for each study area such that:

All major spectral regions were represented,
Predictors contributed non-redundant ecological information, and
Each index demonstrated consistently high importance across RF resampling iterations.

These selected indices formed the fixed baseline predictor set for all combined model formulations.

2.4.2. Terrain-Only Modelling and Importance Ranking of LiDAR-Derived Predictors

To quantify the contribution of topographic variables, a terrain-only RF model was fitted using the VIF-filtered LiDAR-derived metrics. Variable importance values from this model produced a region-specific ranking of terrain predictors, highlighting the relative influence of slope, curvature, wetness indices, overland flow distances, and other terrain attributes on SI variability in each region. This ranked list was used as the basis for the structured integration of terrain predictors into the combined SI models.

2.4.3. Stepwise Integration of Terrain Predictors and Refinement of the Combined Model

To construct the local optimal SI model for each region, the four selected satellite predictors were held constant, while terrain predictors were sequentially added in accordance with their ranking of importance. The stepwise model-building structure was Model 1: 4 satellite predictors + terrain variable #1, Model 2: 4 satellite predictors + terrain variables #1 and #2, Model 3: 4 satellite predictors + terrain variables #1, #2, and #3, and continued iteratively until all terrain variables had been evaluated.
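The nested sequence of candidate models described above can be generated mechanically. In the sketch below the function name and the predictor names are illustrative placeholders, not the region-specific selections reported in the study.

```python
def stepwise_predictor_sets(satellite, ranked_terrain):
    """Fixed satellite predictors plus the top-k terrain predictors, for k = 1..n."""
    return [list(satellite) + list(ranked_terrain[:k])
            for k in range(1, len(ranked_terrain) + 1)]

satellite = ["MTCI", "S2REP", "GNDVI", "EVI"]   # four fixed spectral indices (example)
ranked_terrain = ["TPI", "TWI", "Slope"]        # terrain ranking, most important first (example)

candidate_sets = stepwise_predictor_sets(satellite, ranked_terrain)
for number, predictors in enumerate(candidate_sets, start=1):
    print(f"Model {number}: {predictors}")
```

Each candidate set would then be fitted and cross-validated in turn.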

At each step, a new RF model was fitted and evaluated using repeated 10-fold cross-validation. Explained variance (R²), RMSE, and MAE were used to measure model performance.

Terrain predictors were retained only when their inclusion resulted in consistent and statistically meaningful improvements in predictive performance across resamples. Predictors that failed to increase accuracy or added redundancy were not included. To guarantee the robustness of the model, consistency between cross-validated and out-of-bag performance indicators was observed.
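The three performance measures can be written out as below. This is a sketch under one stated assumption: R² is taken as the squared Pearson correlation between observed and predicted values, which is one common convention; the text does not specify which definition was applied. The observed and predicted SI values are invented for the example.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o, mean_p = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (ss_o * ss_p)

observed_si = [18.0, 20.0, 22.0, 24.0]    # hypothetical field SI (m)
predicted_si = [19.0, 20.0, 23.0, 23.0]   # hypothetical model output (m)

print(round(rmse(observed_si, predicted_si), 3))       # 0.866
print(round(mae(observed_si, predicted_si), 3))        # 0.75
print(round(r_squared(observed_si, predicted_si), 3))  # 0.882
```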

The final retained subset, consisting of four spectral indices and a region-specific set of high-value terrain metrics, constituted the local optimal SI model for each study area. These models were used to compare the effectiveness of localized versus generalized modelling approaches and to generate SI predictions at the landscape level (Figure 3).

2.4.4. Variable Importance, Model Comparison, and Spatial Performance Mapping

Variable importance (VI) for each local optimal model was quantified using the varImp function in the caret package, based on permutation-based importance estimates. These VI scores were used to summarize the relative contribution of each predictor and to highlight differences in dominant environmental controls among study areas (Figure A2).

To assess the generalizability of the modelling framework, each region’s locally optimized model was compared with a general model fitted using the subset of predictors shared across all study areas. Both models were evaluated under the same 10-fold cross-validation procedure, enabling a consistent comparison of predictive performance across ecological contexts. Both models were also used to generate spatial predictions for each region, and difference maps were produced to highlight pixel-level disagreement between them. These maps show where, and by how much, the general model departs from the region-specific models. They also illustrate the potential trade-offs between a model optimized for one region and a more general model calibrated across multiple ecologically distinct areas.
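A difference map of this kind reduces to a per-pixel subtraction of two prediction grids. The sketch below assumes the sign convention "local optimized minus general", which the text does not state, and uses invented SI values; the helper `share_within` simply reports how much of the map agrees to within a chosen tolerance.

```python
def difference_map(local_pred, general_pred):
    """Per-pixel difference (local optimized minus general) of two equally sized grids."""
    return [[a - b for a, b in zip(row_local, row_general)]
            for row_local, row_general in zip(local_pred, general_pred)]

def share_within(diff, limit=1.0):
    """Fraction of pixels whose absolute difference is at most `limit` SI units."""
    cells = [value for row in diff for value in row]
    return sum(abs(value) <= limit for value in cells) / len(cells)

local_si = [[20.0, 22.5], [18.0, 25.0]]     # hypothetical local-model predictions (m)
general_si = [[19.5, 22.0], [18.5, 23.5]]   # hypothetical general-model predictions (m)

diff = difference_map(local_si, general_si)
print(diff)                 # [[0.5, 0.5], [-0.5, 1.5]]
print(share_within(diff))   # 0.75
```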

In addition to the difference maps, we also generated uncertainty maps and uncertainty distributions (standard deviation histograms) for both the optimal and general models in each study area. Prediction uncertainty was quantified by extracting tree-level predictions from the Random Forest models using predict() with predict.all = TRUE and calculating the standard deviation across all trees for each pixel. To ensure that uncertainty and SI prediction maps were visually comparable across the three study areas, a standardized colour scale was applied consistently to all spatial outputs. These uncertainty layers provide complementary insight into the stability of model predictions and allow us to evaluate whether the general model maintains comparable confidence levels to the locally optimized models across different environmental conditions. Together, the uncertainty maps and difference maps offer a more comprehensive understanding of model performance and transferability.
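The uncertainty calculation amounts to a column-wise standard deviation over the matrix of tree-level predictions. The sketch below uses the sample standard deviation (n − 1 denominator), which is an assumption about the convention used, and three made-up trees in place of the 500 in the fitted forests.

```python
import statistics

def per_pixel_uncertainty(tree_predictions):
    """Return (mean, sd) per pixel, where tree_predictions[t][i] is tree t's value for pixel i."""
    per_pixel = zip(*tree_predictions)   # regroup values by pixel
    return [(statistics.mean(values), statistics.stdev(values)) for values in per_pixel]

# Hypothetical output of three trees for two pixels (SI in m).
trees = [
    [20.0, 18.0],
    [22.0, 18.0],
    [24.0, 21.0],
]
summary = per_pixel_uncertainty(trees)
for mean_si, sd_si in summary:
    print(round(mean_si, 2), round(sd_si, 2))
# 22.0 2.0
# 19.0 1.73
```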

3. Results

3.1. Empirical Data

Descriptive statistics of the empirical data are presented in Table 1. Aleza Lake exhibited the highest mean DBH, height, and site index, followed by Deception and Eagle Hills.

3.2. Predictive Map Generation

Variable Selection

Twenty-one ALS terrain and satellite variables (out of the original 30) were retained after VIF screening to reduce collinearity (Table 2). Similar variables were retained in each case study region; the exceptions were the EVI satellite variable selected in Deception, ALRF, and Eagle Hills, and four of the ALS terrain variables (P-openness, TRI, and vertical distance) (Table 2).

3.3. Random Forest Models

3.3.1. ALS-Derived Terrain Data

Site index models developed using just ALS terrain-derived predictor variables were reasonably accurate across all three case study regions (Figure 4).

Model accuracy was the same for both the ALRF and Deception sites, with slightly higher accuracy achieved in Eagle Hills. Empirically observed SI spanned 13.28–34.52 m at 50 years in ALRF, 13.26–26.60 m in Deception, and 12.01–25.16 m in Eagle Hills (Table 1). By contrast, the ALS-only models predicted SI ranges of 17.22–26.04 m (ALRF), 17.49–24.12 m (Deception), and 15.22–22.67 m (Eagle Hills), indicating that the models did not capture the extreme low and high values and tended to overpredict at low SI and underpredict at high SI (Figure 4).

3.3.2. ALS-Derived Terrain and Satellite Data (The Local Optimized Model)

Using a combination of ALS terrain-derived predictors and satellite predictors increased the accuracy of the SI models across all three case study regions (Table 3, Figure 5). The inclusion of satellite data increased model accuracy the most at the Aleza Lake site (0.23 increase in cross-validated R²), with a slightly lower increase in accuracy at Eagle Hills (a 0.10 R² increase). The smallest amount of increase in model accuracy was found at Deception Lake (0.04 R² increase). Additionally, residual analysis showed no significant relationship between residuals and stand age in any of the study areas (R² < 0.05).
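The reported gains follow directly from the cross-validated R² values quoted in the Abstract for the ALS-only and combined models. The short check below reproduces that arithmetic; the dictionary names are illustrative.

```python
# Cross-validated R² values as quoted in the Abstract.
als_only = {"Aleza Lake": 0.40, "Deception": 0.40, "Eagle Hills": 0.46}
combined = {"Aleza Lake": 0.63, "Deception": 0.44, "Eagle Hills": 0.56}

# Gain from adding Sentinel-2 indices to the ALS terrain predictors.
gain = {site: round(combined[site] - als_only[site], 2) for site in combined}
print(gain)  # {'Aleza Lake': 0.23, 'Deception': 0.04, 'Eagle Hills': 0.1}
```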

3.4. SI Estimate Comparisons of the Local Optimal Model and General Models for Three Case Study Areas

The changes in the coefficient of determination ($\Delta R^{2}$) between the ALS + satellite (local optimized) and general models for Site Index (SI) predictions are reported across the study areas (Table 4). $\Delta R^{2}$ was 0.00 for Aleza Lake, −0.03 for Deception, and −0.04 for Eagle Hills. Additional performance metrics, including Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE), are provided in Table 4.

General model predictions of SI, based on ALS and satellite-derived variables, are illustrated in Figure 6, with coefficients of determination ($R^{2}$) of 0.63 for Aleza Lake, 0.41 for Deception, and 0.52 for Eagle Hills. The figure shows the fitted regression lines for the local optimized (solid line) and general (dashed line) models against observed SI values.

3.5. Variable Importance

3.5.1. ALS + Satellite (The Local Optimized) Model

The relative contribution of ALS terrain- and satellite-derived variables differed across the three study areas. At Aleza Lake, three of the top five predictors were satellite-derived (MTCI, S2rep, and GNDVI), with MTCI emerging as the most influential; the remaining predictors showed relatively uniform importance. At Deception, S2rep was the leading predictor, but four ALS terrain-derived variables (P-Openness, Slope, DTM, and TPI) dominated the remainder, with a sharper decline in importance after the first variable. At Eagle Hills, ALS terrain-derived variables were predominant, particularly VerticalDistance, which clearly dominated, followed by overland flow; the remaining three predictors contributed minimally. Overall, the composition of the top five predictors was Aleza Lake (3 satellite, 2 ALS), Deception (1 satellite, 4 ALS), and Eagle Hills (1 satellite, 4 ALS), indicating site-specific differences in structural and spectral drivers and highlighting cases where a single variable exerted a strong influence (Figure A2).

3.5.2. General Model

When examining the general model, the relative contribution of ALS terrain- and satellite-derived predictors showed clear site-level differences. At Aleza Lake, three of the top five predictors were satellite-derived (MTCI, S2rep, and GNDVI), with MTCI remaining the dominant predictor. The next four predictors were relatively similar in importance, indicating a balanced contribution. At Deception, S2rep was again the top-ranked predictor, but four of the top five variables were ALS-derived (DTM, P-Openness, TPI, and VerticalDistance). In this case, importance declined more noticeably after the first predictor, showing a stronger reliance on a few dominant variables. At Eagle Hills, ALS variables clearly dominated, particularly Vertical Distance, which was markedly more influential than all others, followed by GNDVI. The remaining predictors (EVI, DTM, P-Openness) contributed only minimally, indicating a sharp drop in importance after the leading variable.

When comparing across sites, the composition of the top five predictors was Aleza Lake (3 satellite, 2 ALS), Deception (1 satellite, 4 ALS), and Eagle Hills (2 satellite, 3 ALS) (Figure A3).

3.6. Spatial Comparison of General vs. Site-Specific SI Predictions

Across plantation polygons, SI predictions ranged from 16 to over 30, with most areas falling into intermediate classes (18–22) and only small patches exhibiting higher SI values (Figure 7a). The difference map showed minimal differences between the optimal and general models, with nearly all values between −1 and +1 SI units (Figure 7b). At the full landscape extent, SI predictions displayed continuous spatial gradients, with lower values (16–20) dominating the southwestern region and higher values (>24) concentrated in the northern and topographically complex areas (Figure 7c). The difference map across the entire landscape similarly indicated strong agreement between models, with only limited positive differences occurring along the outer boundaries (Figure 7d).

The uncertainty maps derived from both the optimal and general models (Figure 8b,e) showed similar spatial patterns of prediction variability across plantation areas. Uncertainty values (Standard Deviation (SD) of RF predictions) ranged from 0.75 to 4.03 for the optimal model and from 0.75 to 4.26 for the general model. In both cases, uncertainty was predominantly low to moderate (approximately 1–4), with higher values occurring in scattered patches throughout the study area. The general model exhibited a slightly broader spread of uncertainty values compared with the optimal model, although the overall spatial distribution remained consistent.

Corresponding uncertainty histograms (Figure 8c,f) supported these observations. The optimal model produced a right-skewed distribution centered around SD values of approximately 2–3, whereas the general model showed a marginally wider distribution that extended into higher SD values. In both models, the majority of predictions fell within the lower to mid-range uncertainty classes.
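
As an illustration of the uncertainty measure, the sketch below computes a per-pixel mean and standard deviation from the predictions of individual trees. It assumes that the reported uncertainty is the population SD across trees, which is not stated explicitly here; the function name and the four-tree, three-pixel values are hypothetical and are not taken from the study.

```python
from statistics import mean, pstdev


def per_pixel_uncertainty(tree_predictions):
    """Return a (mean, SD) pair per pixel from per-tree SI predictions.

    tree_predictions: list of lists, one inner list per tree and one
    value per pixel. The SD is the population standard deviation across
    trees (an assumption of this sketch).
    """
    n_pixels = len(tree_predictions[0])
    if any(len(tree) != n_pixels for tree in tree_predictions):
        raise ValueError("every tree must predict the same number of pixels")
    summary = []
    # zip(*...) regroups the values so that each tuple holds one pixel.
    for pixel_values in zip(*tree_predictions):
        summary.append((mean(pixel_values), pstdev(pixel_values)))
    return summary


# Hypothetical SI predictions from four trees for three pixels.
trees = [
    [20.0, 18.0, 24.0],
    [22.0, 18.0, 20.0],
    [20.0, 18.0, 28.0],
    [22.0, 18.0, 24.0],
]
uncertainty = per_pixel_uncertainty(trees)
```

Pixels on which the trees agree receive an SD of zero, while pixels with divergent tree predictions receive larger values, which is the pattern summarized by the uncertainty maps and histograms.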

Across plantation areas in the Deception study region, SI predictions ranged from approximately 16.2 to 24.8, with most polygons falling within lower to intermediate SI classes. Only small, localized patches exhibited higher SI values. To allow direct visual comparison across study areas, SI maps were displayed using a standardized colour scale (18–24). The difference map (general–optimal) showed minimal discrepancies between the two modelling approaches, with nearly all plantation pixels falling between −1 and +1 SI units. At the full landscape extent, SI predictions exhibited a continuous gradient within the same overall range, with intermediate values dominating most of the area and localized higher values appearing in the central and southwestern sections of the landscape. Landscape-scale differences were similarly close to zero in most locations, with only a few isolated patches showing deviations approaching ±2 SI units, indicating strong spatial agreement between the general and site-specific models (Figure A4).

Uncertainty values (SD of RF predictions) for the Deception study area ranged from 0.75 to 4.03 for the optimal model and from 0.75 to 4.26 for the general model. Both models exhibited similar spatial patterns of prediction variability, with most plantation pixels falling within low to moderate uncertainty classes (SD around 1–3) and only small isolated patches showing higher uncertainty values. Corresponding histograms revealed a right-skewed distribution centered around SD values of approximately 2.4–3.0, with the general model displaying a slightly wider spread extending into higher uncertainty values (Figure A5).

Across plantation areas in the Eagle Hills region, SI predictions ranged from approximately 14 to 24, with most polygons falling within lower to intermediate SI classes and only small localized patches exhibiting higher values (Figure A6a). Difference map results (general–optimal) showed minimal discrepancies between models, with nearly all plantation pixels falling between −1 and +1 SI units (Figure A6b). At the full landscape extent, SI predictions displayed a continuous gradient within the same overall range, with intermediate values dominating much of the region and lower values concentrated in the eastern and southeastern sections (Figure A6c). Landscape-scale differences were similarly small, with most values near zero and only isolated patches reaching deviations of up to ±2 SI units (Figure A6d).

The uncertainty maps derived from the optimal and general models (Figure A7b,e) showed comparable spatial patterns across plantation areas in the Eagle Hills region. Uncertainty values (SD of RF predictions) ranged from 0.66 to 4.74 for the optimal model and from 0.68 to 4.98 for the general model. In both cases, uncertainty was predominantly low to moderate (approximately 1–3), with higher values occurring only in small, scattered patches. The general model exhibited a slightly broader spread of uncertainty values relative to the optimal model, although the spatial distribution of uncertainty remained consistent between the two approaches. Corresponding histograms (Figure A7c,f) showed right-skewed distributions centered around SD values of approximately 2–3, with the general model extending marginally further into higher uncertainty classes.

Difference maps comparing the ALS + satellite (local optimal) and general models showed that differences between the two approaches were relatively minor, although localized mismatches were present. In ALRF and Deception, red areas indicated where the general model predicted higher SI than the local optimal model, often along the edges of topographically complex areas, whereas blue areas indicated underprediction by the general model. At Eagle Hills, the differences were mostly subtle, with small, scattered patches showing divergence, but the overall spatial patterns were consistent. These results demonstrate that the local optimal models captured fine-scale site variability, while the general model provided broadly similar predictions.
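
The difference maps can be summarized numerically. The following minimal sketch computes pixel-wise differences (general minus optimal) and the share of pixels falling within ±1 SI unit. The function name, the tolerance argument, and the four example pixel values are hypothetical and serve only to illustrate the calculation.

```python
def difference_summary(general, optimal, tolerance=1.0):
    """Summarize pixel-wise SI differences (general minus optimal)."""
    if len(general) != len(optimal):
        raise ValueError("both maps must contain the same number of pixels")
    diffs = [g - o for g, o in zip(general, optimal)]
    within = sum(1 for d in diffs if abs(d) <= tolerance)
    return {
        "differences": diffs,
        # Fraction of pixels whose absolute difference is within the tolerance.
        "share_within_tolerance": within / len(diffs),
        # Positive values: the general model predicts a higher SI.
        "n_general_higher": sum(1 for d in diffs if d > 0),
        # Negative values: the general model predicts a lower SI.
        "n_general_lower": sum(1 for d in diffs if d < 0),
    }


# Hypothetical SI predictions for four pixels.
general_si = [20.5, 18.0, 23.0, 19.5]
optimal_si = [20.0, 18.5, 21.0, 19.5]
summary = difference_summary(general_si, optimal_si)
```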

4. Discussion

4.1. Integration of Multi-Source Data

Combining ALS terrain metrics with Sentinel-2 spectral data increased the accuracy of site index (SI) models in this study, with cross-validated R² values of 0.63, 0.44, and 0.56 for Aleza Lake, Deception, and Eagle Hills, respectively, compared to 0.40, 0.40, and 0.46 for ALS-only models (Table 3). Sentinel-2 spectral vegetation indices (SVIs), such as the Sentinel-2 Red Edge Position (S2REP) and the MERIS Terrestrial Chlorophyll Index (MTCI), are sensitive to canopy biochemistry and provide insight into vegetation vigour and health [65]. This information is complemented by ALS-derived variables, which offer high-resolution (10 m × 10 m) topographic details, such as the Topographic Position Index (TPI) and Topographic Wetness Index (TWI), to capture microtopography and soil moisture dynamics [66]. In contrast, ALS-only approaches, while effective for deriving fine-scale structural attributes such as canopy height and vertical canopy architecture [43,67], inherently lack sensitivity to biochemical properties such as chlorophyll concentration and photosynthetic activity, which limits their predictive power in heterogeneous forest ecosystems [65,68].

Previous studies have found that height-based site index models are sensitive to stand structure variation, especially in high-productivity or diverse forests where height–diameter relationships diverge [69]. In forests with very complex canopies, models that rely solely on structural or topographic features may become less reliable [69]. Our findings also suggest that forest productivity gradients are underrepresented when height measures are used without complementary spectral information. Furthermore, Tompalski et al. [43] demonstrated that the average difference between ALS-derived dominant heights and SI-derived heights was 3.5 m, with the largest differences seen in structurally complex stands with high canopy cover and canopy roughness (RUMPLE index). We also observed a bias in high-productivity areas, suggesting that ALS-only approaches can overestimate site productivity in complex, multi-layered forests. Our method overcomes these structural constraints by combining ALS with satellite spectral indices, enhancing model sensitivity across ecological gradients, from the drier, structurally varied landscapes of Eagle Hills to the wet, high-productivity conditions of Aleza Lake.

Satellite-only studies, relying primarily on spectral data from platforms like Landsat or Sentinel-2, demonstrate comparable limitations in capturing three-dimensional forest structure and microtopographic variability, often resulting in saturation effects at high biomass levels and reduced accuracy in topographically complex areas [65,70,71]. For example, one study reported that Landsat OLI–only classifications achieved an overall accuracy of 76%, yet the user’s accuracy for shrubland was only 51% due to strong spectral confusion with mature deciduous forests, as the spectral signatures of regenerating stands and closed-canopy forests substantially overlapped [68]. This limitation highlights the inability of passive multispectral sensors to discriminate structurally similar vegetation types when height information is absent.

Fan et al. [70] reported strong performance in estimating aboveground biomass (AGB) in subtropical plantations using Sentinel-2 multispectral features, with R² values reaching as high as 0.85. Nonetheless, despite the high overall accuracy, the models demonstrated higher uncertainty in topographically complex regions characterized by steep inclines and heterogeneous canopy structures, with RMSE values ranging from 40 to 48 Mg/ha. Similarly, Zhang et al. [71] found that Landsat 8-based AGB models could produce accurate results (R² values of approximately 0.70–0.80); however, the models exhibited strong saturation in high-biomass stands, leading to systematic underestimation in dense canopies.

In our study, the fusion of ALS terrain metrics with Sentinel-2 spectral indices consistently reduced prediction error relative to ALS-only models across all regions. The greatest improvement occurred in the Aleza Lake Research Forest (ALRF), where RMSE decreased by approximately 19%, while smaller yet meaningful reductions (1–7%) were observed in Deception and Eagle Hills. Growth potential models that incorporate data from complementary sensors can improve the spatial sensitivity to productivity gradients that single-sensor approaches fail to capture, particularly in ecologically heterogeneous landscapes.
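
For clarity, the relative reduction in RMSE is taken here to mean the difference between the ALS-only and fused RMSE, expressed as a percentage of the ALS-only RMSE. The sketch below encodes that definition; the function name and the two RMSE values are hypothetical and were chosen only to show the arithmetic, not to reproduce the values behind the reported percentages.

```python
def rmse_reduction_percent(rmse_baseline, rmse_fused):
    """Percentage reduction in RMSE relative to the baseline model."""
    if rmse_baseline <= 0:
        raise ValueError("baseline RMSE must be positive")
    return 100.0 * (rmse_baseline - rmse_fused) / rmse_baseline


# Hypothetical RMSE values (SI units) for an ALS-only and a fused model.
reduction = rmse_reduction_percent(2.0, 1.62)
```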

Previous studies that have combined ALS and satellite sensors have also found that the data can be complementary and increase model accuracy. For example, one study combined ALS-derived canopy height metrics with Landsat spectral indices to estimate SI in Eucalyptus dunnii plantations, achieving R² values between 0.65 and 0.84 [72]. However, their models relied exclusively on height-based ALS metrics without incorporating terrain-derived attributes, resulting in reduced predictive accuracy in water-limited or moisture-restricted environments where topographic controls strongly influence growth potential. In contrast, the inclusion of terrain metrics such as TWI and OFD in our models captures key edaphic drivers of productivity that cannot be resolved using canopy height or spectral information alone, highlighting the broader applicability and ecological sensitivity of our fusion-based SI framework. Other studies have found substantial accuracy improvements in forest stock volume modeling when LiDAR-derived structural features were combined with multi-sensor imagery [73]. However, these studies have focused on modelling biomass or stock volume rather than age-independent SI, and none evaluated model transferability across broad ecological gradients.

Watt et al. [74] integrated LiDAR-derived structural metrics with RapidEye spectral imagery and environmental variables to model a volume-based forest productivity measure in Pinus radiata plantations, achieving 10–20% higher accuracy than single-sensor models. However, their framework remained fundamentally age-dependent, with stand age serving as a key predictor, thereby limiting its applicability to even-aged plantation systems. Moreover, their models did not incorporate terrain-mediated metrics such as Topographic Wetness Index (TWI) or overland flow, which play essential roles in capturing hydrological and edaphic controls on forest growth.

4.2. Site-Level Drivers of Forest Growth Potential and Model Performance

Aleza Lake, Deception, and Eagle Hills have distinct environmental conditions that affect forest growth. Aleza Lake’s wet, generally flat terrain provides high water availability but faces nutrient leaching challenges. Deception’s cold, high-elevation environment experiences short growing seasons and snowpack limitations, and Eagle Hills’ dry, topographically rugged landscape is influenced by water scarcity and erosion. These site-specific growth-limiting factors require customized methodologies for site index (SI) modelling.

Forest productivity is shaped by complex interactions between topography, hydrology, and climate, leading to distinct site-specific variations in SI estimation accuracy. The strongest predictive performance was observed in Aleza Lake Research Forest (ALRF) (R² = 0.63), where stable environmental conditions and a wet climate facilitated strong relationships between SI and ALS terrain-derived variables. These findings are consistent with the results of previous studies [67,75], which emphasize the role of terrain-driven hydrological processes in shaping forest productivity.

Model accuracy was intermediate at Eagle Hills (R² = 0.56), reflecting the dry, warm climate where water availability is a primary limiting factor. Here, ALS-derived variables like Vertical Distance (overland flow vertical distance) dominated the predictors, followed by satellite indices such as GNDVI and EVI (Figure A2). The importance of these variables emphasizes the critical role of hydrological flow paths and vegetation greenness when evaluating growth in arid landscapes with low precipitation and variable elevations.

Conversely, model performance was lower in Deception (R² = 0.44), highlighting the challenges of estimating SI in high-elevation, topographically complex environments. The dominant influence of positive openness and overland flow distance suggests that hydrological redistribution and snowmelt dynamics play a critical role in growth potential, a pattern consistent with high-elevation studies [76]. Similar trends have been reported in boreal and subalpine forests, where seasonal snowpack fluctuations influence soil moisture availability and growing conditions [44].

The lower R² observed for Deception Lake likely also reflects partial extrapolation error due to limited sampling of the full environmental gradients—particularly elevation and temperature—within the training dataset. Because Random Forest models do not accurately extrapolate beyond the range of predictor values used during training, predictions in high-elevation ESSFmcw and ESSFmcp subzones may underestimate or smooth out expected SI variation. This limitation reflects the model’s reduced ability to capture productivity gradients in underrepresented portions of the predictor space. This was less of a concern for Aleza Lake and Eagle Hills, where the topographic and climatic ranges were fully covered by the training data.
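
One simple way to locate such extrapolation is to compare each pixel's predictor values with the range observed in the training plots. The sketch below flags predictors that fall outside that range. It is an illustrative per-variable envelope check rather than the procedure used in this study, it ignores novel combinations of in-range values, and the helper names and numeric values are hypothetical (only the predictor names DTM and Slope come from Table A1).

```python
def training_ranges(training_rows):
    """Return {predictor: (min, max)} over the training plots."""
    names = training_rows[0].keys()
    return {
        name: (
            min(row[name] for row in training_rows),
            max(row[name] for row in training_rows),
        )
        for name in names
    }


def outside_training_range(pixel, ranges):
    """List the predictors whose pixel value lies outside the training range."""
    return [
        name
        for name, (low, high) in ranges.items()
        if not low <= pixel[name] <= high
    ]


# Hypothetical training plots (elevation in m, slope in degrees).
plots = [
    {"DTM": 900.0, "Slope": 5.0},
    {"DTM": 1100.0, "Slope": 20.0},
    {"DTM": 1250.0, "Slope": 12.0},
]
ranges = training_ranges(plots)

# A pixel inside the sampled envelope and one above the sampled elevations.
flags_inside = outside_training_range({"DTM": 1000.0, "Slope": 10.0}, ranges)
flags_outside = outside_training_range({"DTM": 1450.0, "Slope": 10.0}, ranges)
```

Pixels with a non-empty flag list are those where a Random Forest prediction relies on conditions not represented in training and should be interpreted with caution.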

These findings indicate that SI estimation accuracy is strongly influenced by site-specific environmental constraints. Terrain-driven moisture availability and redistribution shape productivity in rugged landscapes, while in more stable environments, vegetation spectral properties become more reliable indicators. Recognizing these distinctions is essential for refining SI models to improve their applicability across diverse forest ecosystems.

4.3. Limitations and Complementarity of Sensor Types

ALS excels in capturing fine-scale topographic complexity (e.g., TPI, positive openness), critical for modeling hydrological processes in complex landscapes [48,67]. However, its high acquisition costs and limited spatial coverage restrict large-scale applications [75]. Conversely, Sentinel-2 provides broad coverage and biochemical insights via S2REP and MTCI, which detect vegetation stress and chlorophyll content [16,77,78]. These indices are less effective for below-canopy soil moisture or subsurface hydrology, limiting their standalone utility.

By integrating ALS and satellite data, our models capture both structural and biochemical drivers, as evidenced by the negative correlation between SI50 and terrain variables (e.g., slope, overland flow distance) and positive correlations with S2REP and GNDVI. Multi-sensor approaches in forestry may therefore provide a more comprehensive and robust assessment of site productivity [78].

Although our SI prediction maps successfully captured large-scale productivity gradients, some localized artefacts were observed, particularly in heterogeneous or species-diverse stands. These artefacts were not spatially consistent across the forest and may be linked to variations in spectral signatures among plantation species (e.g., pine versus spruce). Sentinel-2 indices such as S2REP and MTCI are highly sensitive to canopy biochemistry, and species-specific spectral differences can introduce local prediction noise, especially when mosaicked from multiple acquisition dates. In contrast, ALS-derived terrain variables remained spatially stable and did not produce similar inconsistencies, suggesting that the observed artefacts originate primarily from the spectral component of the model.

Residual analysis confirmed that the SI models are largely age-independent across all study areas. No significant correlations were found between residuals and stand age (R² < 0.05), indicating that prediction errors were random and not biased by age. This supports the potential of the approach for use in uneven-aged and natural forests, where accurate stand age data are often unavailable.
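
The age-independence check can be sketched as the squared Pearson correlation between model residuals and stand age. The code below assumes this simple linear measure; the function name and the six example plots (ages, observed SI, and predicted SI) are hypothetical and do not come from the field data.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("both sequences must vary")
    return sxy ** 2 / (sxx * syy)


# Hypothetical plots: stand age (years), observed SI, and predicted SI.
stand_age = [15, 20, 25, 30, 35, 40]
observed_si = [20.5, 19.5, 21.5, 22.5, 23.5, 22.5]
predicted_si = [20.0, 20.0, 22.0, 22.0, 23.0, 23.0]

residuals = [obs - pred for obs, pred in zip(observed_si, predicted_si)]
age_r2 = r_squared(stand_age, residuals)
```

A value of `age_r2` close to zero, as in this constructed example, indicates that the residuals carry no linear trend with stand age.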

4.4. Model Transferability and Management Implications

The general Random Forest SI model, using common predictors (e.g., TPI, S2REP), achieved R² values of 0.63, 0.41, and 0.52 for Aleza Lake, Deception, and Eagle Hills, respectively, closely matching site-specific models (R² = 0.63, 0.44, 0.56) (Table 4). This transferability, with minimal accuracy loss (0.00–0.04 R² difference), supports its use for landscape-scale SI mapping across some of British Columbia’s interior biogeoclimatic zones (Figure 8).
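
The accuracy loss per site follows directly from the R² values quoted above. The short sketch below performs that subtraction; the dictionary names are illustrative, while the R² values are those reported in the text.

```python
# R² values reported for the site-specific and general models.
site_specific_r2 = {"Aleza Lake": 0.63, "Deception": 0.44, "Eagle Hills": 0.56}
general_r2 = {"Aleza Lake": 0.63, "Deception": 0.41, "Eagle Hills": 0.52}

# Loss in R² when the general model replaces the site-specific model,
# rounded to two decimals to suppress floating-point noise.
r2_loss = {
    site: round(site_specific_r2[site] - general_r2[site], 2)
    for site in site_specific_r2
}
```

For these values, the loss is 0.00 at Aleza Lake, 0.03 at Deception, and 0.04 at Eagle Hills.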

However, model transferability does not imply unrestricted extrapolation. Random Forest models are reliable within the environmental and spectral predictor ranges represented in the training data but may produce unreliable predictions when applied to unobserved conditions. For example, in Deception Lake, where high-elevation and cold subzones (e.g., ESSFmcw, ESSFmcp) were underrepresented, extrapolation beyond the trained domain may have reduced predictive accuracy (Figure A4). Therefore, while the general model demonstrates relatively strong transferability across similar ecological contexts, its reliability may decrease in regions with environmental gradients extending beyond those represented in the training dataset.

A key strength of the general model lies in its age-independent nature, which allows for consistent and accurate SI estimation without the need for stand age data. By leveraging ALS-derived terrain metrics and satellite spectral indices, the model effectively captures growth dynamics and productivity potential across varying ecological contexts, reducing the need for costly and labor-intensive age data collection.

Nonetheless, the observed differences in performance underscore the necessity of tailoring model inputs and calibration to the specific ecological characteristics of each site. For instance, variables such as positive openness and overland flow distance are more influential in regions like Deception and Eagle Hills, where topography and hydrological dynamics dominate growth-limiting factors. Site-specific models may therefore better capture localized ecological drivers, offering improved predictive accuracy at a scale that is relevant for management decisions. However, the age-independent general model remains valuable for broad-scale applications, providing a robust baseline for comparative assessments across heterogeneous landscapes. These high-resolution SI maps enable targeted management, such as planting drought-resistant Douglas-fir in Eagle Hills’ low-SI areas (Figure A6) or prioritizing conservation in Aleza Lake’s high-SI river valleys for carbon sequestration (Figure 7). By identifying climate-sensitive zones (e.g., Deception’s snowpack-driven areas), the models inform adaptive strategies to mitigate climate change impacts, aligning with provincial sustainability goals [79].

Our spatial uncertainty analysis provided further insight into the transferability of the general model across environmental gradients. Both the optimal and general models showed consistently low prediction variance across most of the mapped area, indicating stable behaviour within the environmental domain represented in the training data. Localized regions with higher uncertainty—typically associated with steep terrain, moisture-limited microsites, or underrepresented high-elevation subzones—highlight areas where model extrapolation is less reliable and where site-specific calibration may provide additional benefits.

4.5. Future Research

Although the model achieved strong predictive performance using ALS-derived terrain metrics and Sentinel-2 spectral indices, further gains in accuracy could be obtained by incorporating additional ecological and biophysical predictors. Fine-resolution climate surfaces, soil and moisture regime information, and high-density LiDAR metrics of forest structure have the potential to capture environmental gradients that were not fully represented in the current predictor set. These datasets were not included in the present study due to inconsistencies in availability across study regions and our aim to develop a modelling framework that remains operational and transferable. Nonetheless, integrating such complementary predictors represents an important opportunity to enhance SI model performance, particularly in ecologically heterogeneous landscapes or climatically marginal areas where growth dynamics are more complex [80].

5. Conclusions

Accurate estimates of forest productivity are critical for effective and sustainable forest management. In this study, we focused on mapping Site Index (SI)—a key indicator of forest growth potential—at high spatial resolution to better capture differences in productivity across the landscape. By combining ALS-derived terrain layers with Sentinel-2 satellite imagery, we developed an age-independent modeling framework capable of representing both the structural and biochemical drivers of forest growth. Despite natural differences in climate, topography, and stand composition among regions, the models performed consistently well (R² = 0.41–0.63) and showed minimal bias, even in complex terrain. Integrating ALS terrain and satellite information allowed us to capture how topography and canopy vigor together shape forest productivity, with variables such as the Topographic Position Index (TPI) and MERIS Terrestrial Chlorophyll Index (MTCI) proving particularly informative. Importantly, the general Random Forest model, trained using predictors shared across study sites, maintained a high level of accuracy, showing strong potential for province-wide SI mapping and broader assessments of forest productivity. Future work that focuses on testing this approach in natural, mature, and old-growth forests will help determine how broadly this framework can be applied beyond managed plantations.

Author Contributions

Conceptualization, F.K.S., C.E., P.D., C.E.C., W.H.M. and L.H.T.; Methodology, F.K.S. and C.E.; Formal analysis, F.K.S.; Investigation, F.K.S. and C.E.; Data curation, F.K.S. and C.E.; Project administration, F.K.S. and C.E.; Writing—original draft, F.K.S.; reviewing and editing, F.K.S., C.E., P.D., C.E.C., W.H.M. and L.H.T.; Visualization, F.K.S. and C.E.; Supervision, C.E.; Funding acquisition, P.D. and W.H.M. All authors have read and agreed to the published version of the manuscript.

Funding

Funding for this project was provided by the B.C. Ministry of Forests, with additional support from NSERC Discovery (CE).

Data Availability Statement

The data presented in this study are available on request from the corresponding author. Field plot data and dendrochronological measurements are subject to institutional and data-sharing agreements and are therefore not publicly available.

Acknowledgments

We thank the handling editor, anonymous reviewers, and the editorial team of Remote Sensing for insightful suggestions to improve the presentation of our research.

Conflicts of Interest

One of the co-authors is currently affiliated with a private company following retirement from academia. This affiliation did not influence the study design, data collection, analysis, interpretation of results, or the decision to publish the manuscript. The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

SI	Site Index
BEC	Biogeoclimatic Ecosystem Classification
ALS	Airborne Laser Scanning
ALRF	Aleza Lake Research Forest

Appendix A

Table A1. ALS-derived layers for three sites.

No.	Variable	Derived Layer Name
1	Digital Terrain Model	DTM
2	Aspect	Aspect
3	Convergence Index	Convergence
4	Diurnal Anisotropic Heating	Diurnal_a_Heating
5	Filled DEM	Filled_Sinks
6	General Curvature	gCurvature
7	Multiresolution Index of the Ridge Top Flatness	MRRTF
8	Multiresolution Index of Valley Bottom Flatness	MRVBF
9	Topographic-Openness-Dominance	P_Openness
10	Topographic Openness Enclosure	N_Openness
11	Overland Flow Horizontal Distance	Overland_Flow
12	Overland Flow Vertical Distance	Vertical Distance
13	Slope	Slope
14	Total Curvature	T_Curve
15	Terrain Ruggedness Index	TRI
16	Topographic Position Index	TPI
17	Topographic Wetness Index	TWI
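
For readers unfamiliar with two of the terrain layers listed above, the sketch below gives commonly used textbook definitions: TPI as the elevation of a cell minus the mean elevation of its neighbours, and TWI as ln(a / tan β), where a is the specific catchment area and β the local slope. The software, neighbourhood size, and flow-routing method used to derive the layers in this study are not specified in this section, so the 3 × 3 window, the function names, and all numeric values are assumptions made for illustration.

```python
import math


def tpi(grid, row, col):
    """Cell elevation minus the mean of its eight neighbours (3 x 3 window)."""
    if not (0 < row < len(grid) - 1 and 0 < col < len(grid[0]) - 1):
        raise ValueError("tpi() in this sketch handles interior cells only")
    neighbours = [
        grid[r][c]
        for r in (row - 1, row, row + 1)
        for c in (col - 1, col, col + 1)
        if (r, c) != (row, col)
    ]
    return grid[row][col] - sum(neighbours) / len(neighbours)


def twi(specific_catchment_area, slope_degrees):
    """Topographic Wetness Index, ln(a / tan(beta)), for a positive slope."""
    if specific_catchment_area <= 0 or slope_degrees <= 0:
        raise ValueError("catchment area and slope must be positive")
    return math.log(specific_catchment_area / math.tan(math.radians(slope_degrees)))


# Hypothetical 3 x 3 elevation window (m) with a local high point in the centre.
dtm = [
    [100.0, 100.0, 100.0],
    [100.0, 108.0, 100.0],
    [100.0, 100.0, 100.0],
]
ridge_tpi = tpi(dtm, 1, 1)

# Hypothetical cells: a gentle, water-collecting site and a steep, shedding site.
gentle_wet = twi(500.0, 2.0)
steep_dry = twi(50.0, 30.0)
```

Positive TPI values indicate locally elevated positions such as ridges, and higher TWI values indicate gentle, water-accumulating positions.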

Table A2. Satellite-derived spectral vegetation indices used for each of the three study regions.

No.	Band/Index	Band Info/Formulation	Group Index Category
1	CLre: (Red-edge Chlorophyll Index)	(b7/b5) − 1	3
2	EVI7: (Enhanced Vegetation Index using b7)	2.5 × (b7 − b4)/(b7 + 6 × b4 − 7.5 × b2 + 1)	1
3	EVI8a: (Enhanced Vegetation Index using b8a)	2.5 × (b8a − b4)/(b8a + 6 × b4 − 7.5 × b2 + 1)	1
4	EVI: (Enhanced Vegetation Index)	2.5 × (b8 − b4)/(b8 + 2.4 × (b4 + 1))	1
5	GNDVI: (Green Normalized Difference Vegetation Index)	(b7 − b3)/(b7 + b3)	2
6	NDVI: (Normalized Difference Vegetation Index using b4 and b8a)	(b8a − b4)/(b8a + b4)	2
7	NDVI45: (NDVI using b4 and b5)	(b5 − b4)/(b5 + b4)	2
8	NDVI65: (NDVI using b6 and b5)	(b6 − b5)/(b6 + b5)	2
9	WDRVI: (Wide Dynamic Range Vegetation Index)	(0.01 × b7 − b5)/(0.01 × b7 + b5) + (1 − 0.01)/(1 + 0.01)	2
10	S2REP: (Sentinel-2 red-edge position)	705 + 35 × ((b4 + b7)/2 − b5)/(b6 − b5)	3
11	MTCI: (MERIS terrestrial chlorophyll index)	(b6 − b5)/(b5 − b4)	3
12	MSR: (Modified Simple Ratio)	((b7/b4) − 1)/((b7/b4) + 1)^0.5	3
13	IRECI: (Inverted red-edge chlorophyll index)	(b7 − b4)/(b5/b6)	3
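
As a worked illustration of the formulations in Table A2, the sketch below evaluates four of the indices for a single pixel. The formulas follow the table as written; the function name and the reflectance values are hypothetical, and the sketch does not guard against zero denominators.

```python
def spectral_indices(b3, b4, b5, b6, b7):
    """Compute selected Table A2 indices from Sentinel-2 band reflectances."""
    return {
        # Red-edge Chlorophyll Index
        "CLre": b7 / b5 - 1,
        # Green Normalized Difference Vegetation Index
        "GNDVI": (b7 - b3) / (b7 + b3),
        # MERIS Terrestrial Chlorophyll Index
        "MTCI": (b6 - b5) / (b5 - b4),
        # Sentinel-2 red-edge position (nm)
        "S2REP": 705 + 35 * ((b4 + b7) / 2 - b5) / (b6 - b5),
    }


# Hypothetical surface reflectances for a vegetated pixel.
indices = spectral_indices(b3=0.05, b4=0.04, b5=0.10, b6=0.25, b7=0.30)
```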

Figure A1. Flowchart demonstrating image processing and data extraction at the plot level.

Figure A2. Variable importance table for the local optimal Random Forest SI model, derived from the trained model (optimal model).

Figure A3. Variable importance table for the general Random Forest SI model, derived from the trained model (general model).

Figure A4. Local optimal Site Index (SI) predictions and model differences for Deception Lake. (a) SI predictions for plantation areas. (b) Difference map showing pixel-wise differences between the general and optimal models (general − optimal) for plantation polygons. (c) SI predictions across the full Deception Lake extent. (d) Landscape-scale difference map illustrating spatial variation between the two model outputs (general − optimal).

Figure A5. SI predictions and uncertainty patterns for optimal and general Random Forest models in Deception. Left panels show SI predictions, middle panels show uncertainty (SD of RF predictions), and right panels show uncertainty distributions.

Figure A6. Local optimal Site Index (SI) predictions and model differences for Eagle Hills. (a) SI predictions for plantation areas. (b) Difference map showing pixel-wise differences between the general and optimal models (general − optimal) for plantation polygons. (c) SI predictions across the full Eagle Hills extent. (d) Landscape-scale difference map illustrating spatial variation between the two model outputs (general − optimal).

Figure A7. SI predictions and uncertainty patterns for optimal and general Random Forest models in Eagle Hills. Left panels show SI predictions, middle panels show uncertainty (SD of RF predictions), and right panels show uncertainty distributions.

References

Powers, R.F. Assessing Potential Sustainable Wood Yield. In The Forests Handbook: Applying Forest Science for Sustainable Management; Blackwell Science, Ltd.: Oxford, UK, 2001; Volume 2, pp. 105–128.
Hartmann, H.; Bastos, A.; Das, A.J.; Esquivel-Muelbert, A.; Hammond, W.M.; Martínez-Vilalta, J.; McDowell, N.G.; Powers, J.S.; Pugh, T.A.M.; Ruthrof, K.X.; et al. Climate Change Risks to Global Forest Health: Emergence of Unexpected Events of Elevated Tree Mortality Worldwide. Annu. Rev. Plant Biol. 2022, 73, 673–702.
Hernández-Blanco, M.; Costanza, R.; Chen, H.; DeGroot, D.; Jarvis, D.; Kubiszewski, I.; Montoya, J.; Sangha, K.; Stoeckl, N.; Turner, K. Ecosystem Health, Ecosystem Services, and the Well-being of Humans and the Rest of Nature. Glob. Change Biol. 2022, 28, 5027–5040.
Bunn, A.G.; Goetz, S.J.; Fiske, G.J. Observed and Predicted Responses of Plant Growth to Climate across Canada. Geophys. Res. Lett. 2005, 32, L16710.
Coops, N.C.; Gaulton, R.; Waring, R.H. Mapping Site Indices for Five Pacific Northwest Conifers Using a Physiologically Based Model. Appl. Veg. Sci. 2011, 14, 268–276.
Coops, N.C.; Hember, R.A. Physiologically Derived Predictions of Douglas-Fir Site Index in British Columbia. For. Chron. 2009, 85, 733–744.
Mitchell, A.K.; Burgess, D.; Maynard, D.; Groot, A.; Lussier, J.M.; Ottens, H.; Titus, B. Developing Silvicultural Systems for Sustainable Forestry in Canada. In Proceedings of the XII World Forestry Congress, Quebec City, QC, Canada, 21–28 September 2003; Available online: https://www.fao.org/4/XII/0596-B1.htm (accessed on 5 November 2025).
Bettinger, P.; Boston, K. Forest Planning Heuristics—Current Recommendations and Research Opportunities for s-Metaheuristics. Forests 2017, 8, 476.
Pretzsch, H. The Course of Tree Growth. Theory and Reality. For. Ecol. Manag. 2020, 478, 118508.
Tompalski, P.; Coops, N.C.; White, J.C.; Wulder, M.A. Enhancing Forest Growth and Yield Predictions with Airborne Laser Scanning Data: Increasing Spatial Detail and Optimizing Yield Curve Selection through Template Matching. Forests 2016, 7, 255.
Pretzsch, H.; Grote, R.; Reineking, B.; Rötzer, T.H.; Seifert, S.T. Models for Forest Ecosystem Management: A European Perspective. Ann. Bot. 2008, 101, 1065–1087.
Shirley, M. SIBEC Site Index Estimates in Support of Forest Management in British Columbia; Technical Report No. 004; Nigh, G.D., Ed.; British Columbia, Forest Science Program: Victoria, BC, Canada, 2003.
Nakajima, T.; Shiraishi, N.; Kanomata, H.; Matsumoto, M. A Method to Maximise Forest Profitability through Optimal Rotation Period Selection under Various Economic, Site and Silvicultural Conditions. N. Z. J. For. Sci. 2017, 47, 4.
Watt, M.S.; Kimberley, M.O.; Dash, J.P.; Harrison, D. Spatial Prediction of Optimal Final Stand Density for Even-Aged Plantation Forests Using Productivity Indices. Can. J. For. Res. 2017, 47, 527–535.
Dash, J.; Mathur, A.; Foody, G.M.; Curran, P.J.; Chipman, J.W.; Lillesand, T.M. Land Cover Classification Using Multi-temporal MERIS Vegetation Indices. Int. J. Remote Sens. 2007, 28, 1137–1159.
Foody, G.M.; Dash, J. Discriminating and Mapping the C3 and C4 Composition of Grasslands in the Northern Great Plains, USA. Ecol. Inform. 2007, 2, 89–93.
Gitelson, A.A.; Viña, A.; Verma, S.B.; Rundquist, D.C.; Arkebauer, T.J.; Keydan, G.; Leavitt, B.; Ciganda, V.; Burba, G.G.; Suyker, A.E. Relationship between Gross Primary Production and Chlorophyll Content in Crops: Implications for the Synoptic Monitoring of Vegetation Productivity. J. Geophys. Res. Atmos. 2006, 111.
Gonzalez-Benecke, C.A.; Teskey, R.O.; Martin, T.A.; Jokela, E.J.; Fox, T.R.; Kane, M.B.; Noormets, A. Regional Validation and Improved Parameterization of the 3-PG Model for Pinus Taeda Stands. For. Ecol. Manag. 2016, 361, 237–256.
Subedi, S.; Fox, T.R.; Wynne, R.H. Determination of Fertility Rating (FR) in the 3-PG Model for Loblolly Pine Plantations in the Southeastern United States Based on Site Index. Forests 2015, 6, 3002–3027.
Burkhart, H.E.; Tomé, M. Modeling Forest Trees and Stands; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012; ISBN 90-481-3170-7. [Google Scholar]
Le Moguédec, G.; Dhôte, J.-F. Fagacées: A Tree-Centered Growth and Yield Model for Sessile Oak (Quercus petraea L.) and Common Beech (Fagus sylvatica L.). Ann. For. Sci. 2012, 69, 257–269. [Google Scholar] [CrossRef]
Skovsgaard, J.P.; Vanclay, J.K. Forest Site Productivity: A Review of the Evolution of Dendrometric Concepts for Even-Aged Stands. Forestry 2008, 81, 13–31. [Google Scholar] [CrossRef]
Aertsen, W.; Kint, V.; Muys, B.; Van Orshoven, J. Effects of Scale and Scaling in Predictive Modelling of Forest Site Productivity. Environ. Model. Softw. 2012, 31, 19–27. [Google Scholar] [CrossRef]
Nothdurft, A.; Wolf, T.; Ringeler, A.; Böhner, J.; Saborowski, J. Spatio-Temporal Prediction of Site Index Based on Forest Inventories and Climate Change Scenarios. For. Ecol. Manag. 2012, 279, 97–111. [Google Scholar] [CrossRef]
Curt, T.; Bouchaud, M.; Agrech, G. Predicting Site Index of Douglas-Fir Plantations from Ecological Variables in the Massif Central Area of France. For. Ecol. Manag. 2001, 149, 61–74. [Google Scholar] [CrossRef]
Monserud, R.A.; Huang, S.M. Mapping Lodgepole Pine Site Index in Alberta; CABI Publishing: Wallingford, UK, 2003; pp. 11–25. [Google Scholar]
Swenson, J.J.; Waring, R.H.; Fan, W.; Coops, N. Predicting Site Index with a Physiologically Based Growth Model across Oregon, USA. Can. J. For. Res. 2005, 35, 1697–1707. [Google Scholar] [CrossRef]
Pinno, B.D.; Paré, D.; Guindon, L.; Bélanger, N. Predicting Productivity of Trembling Aspen in the Boreal Shield Ecozone of Quebec Using Different Sources of Soil and Site Information. For. Ecol. Manag. 2009, 257, 782–789. [Google Scholar] [CrossRef]
Seynave, I.; Gégout, J.-C.; Hervé, J.-C.; Dhôte, J.-F.; Drapier, J.; Bruno, É.; Dumé, G. Picea Abies Site Index Prediction by Environmental Factors and Understorey Vegetation: A Two-Scale Approach Based on Survey Databases. Can. J. For. Res. 2005, 35, 1669–1678. [Google Scholar] [CrossRef]
Socha, J. Effect of Topography and Geology on the Site Index of Picea Abies in the West Carpathian, Poland. Scand. J. For. Res. 2008, 23, 203–213. [Google Scholar] [CrossRef]
Bravo-Oviedo, A.; Tome, M.; Bravo, F.; Montero, G.; Del Rio, M. Dominant Height Growth Equations Including Site Attributes in the Generalized Algebraic Difference Approach. Can. J. For. Res. 2008, 38, 2348–2358. [Google Scholar] [CrossRef]
Chen, H.Y.H.; Krestov, P.V.; Klinka, K. Trembling Aspen Site Index in Relation to Environmental Measures of Site Quality at Two Spatial Scales. Can. J. For. Res. 2002, 32, 112–119. [Google Scholar] [CrossRef]
Holmgren, P. Topographic and Geochemical Influence on the Forest Site Quality, with Respect to Pinus Sylvestris and Picea Abies in Sweden. Scand. J. For. Res. 1994, 9, 75–82. [Google Scholar] [CrossRef]
Corona, P.; Scotti, R.; Tarchiani, N. Relationship between Environmental Factors and Site Index in Douglas-Fir Plantations in Central Italy. For. Ecol. Manag. 1998, 110, 195–207. [Google Scholar] [CrossRef]
Marques, C.P. Evaluating Site Quality of Even-Aged Maritime Pine Stands in Northern Portugal Using Direct and Indirect Methods. For. Ecol. Manag. 1991, 41, 193–204. [Google Scholar] [CrossRef]
Carmean, W.H. Forest Site Quality Evaluation in the United States. Adv. Agron. 1975, 27, 209–269. [Google Scholar]
Skovsgaard, J.P.; Vanclay, J.K. Forest Site Productivity: A Review of Spatial and Temporal Variability in Natural Site Conditions. Forestry 2013, 86, 305–315. [Google Scholar] [CrossRef]
Næsset, E. Predicting Forest Stand Characteristics with Airborne Scanning Laser Using a Practical Two-Stage Procedure and Field Data. Remote Sens. Environ. 2002, 80, 88–99. [Google Scholar] [CrossRef]
Stearns-Smith, S. Making Sense of Site Index Estimates in British Columbia: A Quick Look at the Big Picture. J. Ecosyst. Manag. 2002, 1, 2. [Google Scholar] [CrossRef]
Tahiru, A.-W.; Cobbina, S.; Asare, W.; Takal, S.U. Advancing Environmental Sustainability Through Remote Sensing: A Review of Applications, Limitations, and Emerging Solutions. 2025. Available online: https://www.preprints.org/manuscript/202503.1896 (accessed on 5 November 2025).
Hall, R.J.; Skakun, R.S.; Arsenault, E.J.; Case, B.S. Modeling Forest Stand Structure Attributes Using Landsat ETM+ Data: Application to Mapping of Aboveground Biomass and Stand Volume. For. Ecol. Manag. 2006, 225, 378–390. [Google Scholar] [CrossRef]
Zhao, M.; Running, S.W. Remote Sensing of Terrestrial Primary Production and Carbon Cycle. In Advances in Land Remote Sensing: System, Modeling, Inversion and Application; Springer: Dordrecht, The Netherlands, 2008; pp. 423–444. [Google Scholar]
Tompalski, P.; Coops, N.C.; White, J.C.; Wulder, M.A. Augmenting Site Index Estimation with Airborne Laser Scanning Data. For. Sci. 2015, 61, 861–873. [Google Scholar] [CrossRef]
Guerra-Hernández, J.; Arellano-Pérez, S.; González-Ferreiro, E.; Pascual, A.; Altelarrea, V.S.; Ruiz-González, A.D.; Álvarez-González, J.G. Developing a Site Index Model for P. Pinaster Stands in NW Spain by Combining Bi-Temporal ALS Data and Environmental Data. For. Ecol. Manag. 2021, 481, 118690. [Google Scholar] [CrossRef]
Socha, J.; Pierzchalski, M.; Bałazy, R.; Ciesielski, M. Modelling Top Height Growth and Site Index Using Repeated Laser Scanning Data. For. Ecol. Manag. 2017, 406, 307–317. [Google Scholar] [CrossRef]
Pettorelli, N.; Vik, J.O.; Mysterud, A.; Gaillard, J.-M.; Tucker, C.J.; Stenseth, N.C. Using the Satellite-Derived NDVI to Assess Ecological Responses to Environmental Change. Trends Ecol. Evol. 2005, 20, 503–510. [Google Scholar] [CrossRef]
Waring, R.H.; Milner, K.S.; Jolly, W.M.; Phillips, L.; McWethy, D. Assessment of Site Index and Forest Growth Capacity across the Pacific and Inland Northwest USA with a MODIS Satellite-Derived Vegetation Index. For. Ecol. Manag. 2006, 228, 285–291. [Google Scholar] [CrossRef]
Beven, K.J.; Kirkby, M.J. A Physically Based, Variable Contributing Area Model of Basin Hydrology/Un Modèle à Base Physique de Zone d’appel Variable de l’hydrologie Du Bassin Versant. Hydrol. Sci. J. 1979, 24, 43–69. [Google Scholar] [CrossRef]
Guisan, A.; Weiss, S.B.; Weiss, A.D. GLM versus CCA Spatial Modeling of Plant Species Distribution. Plant Ecol. 1999, 143, 107–122. [Google Scholar] [CrossRef]
Yu, J.; Wang, J.; Leblon, B. Evaluation of Soil Properties, Topographic Metrics, Plant Height, and Unmanned Aerial Vehicle Multispectral Imagery Using Machine Learning Methods to Estimate Canopy Nitrogen Weight in Corn. Remote Sens. 2021, 13, 3105. [Google Scholar] [CrossRef]
Yu, J.; Wang, J.; Leblon, B.; Song, Y. Nitrogen Estimation for Wheat Using UAV-Based and Satellite Multispectral Imagery, Topographic Metrics, Leaf Area Index, Plant Height, Soil Moisture, and Machine Learning Methods. Nitrogen 2021, 3, 1–25. [Google Scholar] [CrossRef]
Meidinger, D.; Pojar, J. Ecosystems of British Columbia; Special Report Series; Ministry of Forests, British Columbia: Victoria, BC, Canada, 1991. [Google Scholar]
British Columbia Ministry of Forests, Silviculture Branch. Growth Intercept Method for Silviculture Surveys; British Columbia Ministry of Forests: Victoria, BC, Canada, 1995. [Google Scholar]
The Ministry of Forests, Lands, Natural Resource Operations and Rural Development. Site Index Tools (SiteTools)—Province of British Columbia. Available online: https://www2.gov.bc.ca/gov/content/industry/forestry/managing-our-forest-resources/forest-inventory/growth-and-yield-modelling/site-index-tools-sitetools (accessed on 5 November 2025).
Dietrich, J.P.; Leoncio, W. Citation: Software Citation Tools; R package; The R Foundation for Statistical Computing: Vienna, Austria, 2020; Available online: https://CRAN.R-project.org/package=citation (accessed on 5 November 2025).
Brenning, A.; Bangs, D.; Becker, B.; Schratz, P.; Polakowski, F. RSAGA: SAGA Geoprocessing and Terrain Analysis; R Package Version 1.4.2; The R Foundation for Statistical Computing: Vienna, Austria, 2008; Available online: https://cran.r-project.org/package=RSAGA (accessed on 5 November 2025).
Korhonen, L.; Packalen, P.; Rautiainen, M. Comparison of Sentinel-2 and Landsat 8 in the Estimation of Boreal Forest Canopy Cover and Leaf Area Index. Remote Sens. Environ. 2017, 195, 259–274. [Google Scholar] [CrossRef]
Majasalmi, T.; Rautiainen, M. The Potential of Sentinel-2 Data for Estimating Biophysical Variables in a Boreal Forest: A Simulation Study. Remote Sens. Lett. 2016, 7, 427–436. [Google Scholar] [CrossRef]
Rahimzadeh-Bajgiran, P.; Munehiro, M.; Omasa, K. Relationships between the Photochemical Reflectance Index (PRI) and Chlorophyll Fluorescence Parameters and Plant Pigment Indices at Different Leaf Growth Stages. Photosynth. Res. 2012, 113, 261–271. [Google Scholar] [CrossRef] [PubMed]
Verrelst, J.; Camps-Valls, G.; Muñoz-Marí, J.; Rivera, J.P.; Veroustraete, F.; Clevers, J.G.P.W.; Moreno, J. Optical Remote Sensing and the Retrieval of Terrestrial Vegetation Bio-Geophysical Properties—A Review. ISPRS J. Photogramm. Remote Sens. 2015, 108, 273–290. [Google Scholar]
Amin, M.E.S.; Nabil, M.; Abdelfattah, M.A.; Mohamed, E.S.; Mahmoud, A.G. Exploitation of Sentinel-2 Spectral Bands and Vegetation Indices in Potato Yield Estimation. J. Indian Soc. Remote Sens. 2025. [Google Scholar] [CrossRef]
Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
Liaw, A.; Wiener, M. Classification and Regression by randomForest. R News 2002, 2, 18–22. [Google Scholar]
Kuhn, M. Building Predictive Models in R Using the Caret Package. J. Stat. Softw. 2008, 28, 1–26. [Google Scholar] [CrossRef]
Adams, B.T.; Matthews, S.N. Enhancing Forest and Shrubland Mapping in a Managed Forest Landscape with Landsat–LiDAR Data Fusion. Nat. Areas J. 2018, 38, 402–418. [Google Scholar] [CrossRef]
Terezan, L.H. Evaluating Site Index Models Derived from Topography and Predictive Ecosystem Models. Doctoral Dissertation, University of Northern British Columbia, Prince George, BC, Canada, 2022. [Google Scholar]
Lefsky, M.A.; Cohen, W.B.; Parker, G.G.; Harding, D.J. Lidar Remote Sensing for Ecosystem Studies. BioScience 2002, 52, 19–30. [Google Scholar] [CrossRef]
Jiang, F.; Deng, M.; Tang, J.; Fu, L.; Sun, H. Integrating Spaceborne LiDAR and Sentinel-2 Images to Estimate Forest Aboveground Biomass in Northern China. Carbon Balance Manag. 2022, 17, 12. [Google Scholar] [CrossRef]
Sharma, M.; Amateis, R.L.; Burkhart, H.E. Top Height Definition and Its Effect on Site Index Determination in Thinned and Unthinned Loblolly Pine Plantations. For. Ecol. Manag. 2002, 168, 163–175. [Google Scholar] [CrossRef]
Fan, W.; Tian, J.; Knoke, T.; Yang, B.; Liang, F.; Dong, Z. Investigating Dual-Source Satellite Image Data and ALS Data for Estimating Aboveground Biomass. Remote Sens. 2024, 16, 1804. [Google Scholar] [CrossRef]
Zhang, L.; Shao, Z.; Liu, J.; Cheng, Q. Deep Learning Based Retrieval of Forest Aboveground Biomass from Combined LiDAR and Landsat 8 Data. Remote Sens. 2019, 11, 1459. [Google Scholar] [CrossRef]
Rizzo-Martín, I.; Hirigoyen-Domínguez, A.; Arthus-Bacovich, R.; Varo-Martínez, M.Á.; Navarro-Cerrillo, R. Site Index Estimation Using Airborne Laser Scanner Data in Eucalyptus Dunnii Maide Stands in Uruguay. Forests 2023, 14, 933. [Google Scholar] [CrossRef]
Liu, J.; Quan, Y.; Wang, B.; Shi, J.; Ming, L.; Li, M. Estimation of Forest Stock Volume Combining Airborne LiDAR Sampling Approaches with Multi-Sensor Imagery. Forests 2023, 14, 2453. [Google Scholar] [CrossRef]
Watt, M.S.; Dash, J.P.; Watt, P.; Bhandari, S. Multi-Sensor Modelling of a Forest Productivity Index for Radiata Pine Plantations. N. Z. J. For. Sci. 2016, 46, 9. [Google Scholar] [CrossRef]
Dubayah, R.O.; Drake, J.B. Lidar Remote Sensing for Forestry. J. For. 2000, 98, 44–46. [Google Scholar] [CrossRef]
Leclère, L.; Latte, N.; Candaele, R.; Ligot, G.; Lejeune, P. Estimation of Within-Gap Regeneration Height Growth in Managed Temperate Deciduous Forests Using Bi-Temporal Airborne Laser Scanning Data. Ann. For. Sci. 2024, 81, 36. [Google Scholar] [CrossRef]
Peng, Y.; Gitelson, A.A. Remote Estimation of Gross Primary Productivity in Soybean and Maize Based on Total Crop Chlorophyll Content. Remote Sens. Environ. 2012, 117, 440–448. [Google Scholar] [CrossRef]
Rahimzadeh-Bajgiran, P.; Hennigar, C.; Weiskittel, A.; Lamb, S. Forest Potential Productivity Mapping by Linking Remote-Sensing-Derived Metrics to Site Variables. Remote Sens. 2020, 12, 2056. [Google Scholar] [CrossRef]
Ministry of Forests-Forest Practices Branch. How to Determine Site Index in Silviculture. In Participant’s Workbook; British Columbia: Victoria, BC, Canada, 1999. [Google Scholar]
Anderegg, L.D.L.; HilleRisLambers, J. Local Range Boundaries vs. Large-scale Trade-offs: Climatic and Competitive Constraints on Tree Growth. Ecol. Lett. 2019, 22, 787–796. [Google Scholar] [CrossRef]

Figure 1. Location of the three study sites within British Columbia, Canada. The location of sampled plantation stands within each study site is indicated by green circles.

Figure 2. Schematic of plot layout for field data collection. The location of the central tree plot was randomly selected, and the four additional plots were situated based on the central tree location.

Figure 3. Workflow summarizing the data processing and modelling steps used to generate Site Index predictions.

Figure 4. Model predictions from Site Index (SI) models fit using only ALS terrain-derived variables vs. the empirically measured SI values for each of the three case study regions. Coefficient of determination (R²) from 10-fold cross-validation is shown for each region. The solid red line represents the fitted regression line, while the blue dashed line represents the 1:1 reference line.

Figure 5. Model predictions from Site Index (SI) models fit using ALS terrain and satellite-derived variables vs. the empirically measured SI values for each of the three case study regions. Coefficient of determination (R²) from 10-fold cross-validation is shown for each region. The solid red line represents the fitted regression line, while the blue dashed line represents the 1:1 reference line.

Figure 6. Model predictions from the general Site Index (SI) model, fit using ALS terrain and satellite-derived variables, vs. the empirically measured SI values for each of the three case study regions. Coefficient of determination (R²) from 10-fold cross-validation is shown for each region. The solid red line represents the fitted regression line, while the blue dashed line represents the 1:1 reference line.

Figure 7. Local optimal Site Index (SI) predictions and model differences for the Aleza Lake Research Forest (ALRF). (a) SI predictions for plantation areas. (b) Difference map showing pixel-wise differences between the general and optimal models (general minus optimal) for plantation polygons. (c) SI predictions across the full landscape extent. (d) Landscape-scale difference map illustrating spatial variation between the two model outputs (general minus optimal).
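
The difference maps in panels (b) and (d) amount to a pixel-wise subtraction of two co-registered SI rasters. A minimal Python sketch of that operation follows, assuming both rasters share the same grid and that missing pixels are stored as None. The function name `difference_map` and the toy values are hypothetical and are not taken from the study's workflow.

```python
from typing import List, Optional

# A raster is a list of rows; None marks a no-data pixel (assumption).
Raster = List[List[Optional[float]]]


def difference_map(general: Raster, optimal: Raster) -> Raster:
    """Return general minus optimal for every pixel; no-data propagates."""
    if len(general) != len(optimal):
        raise ValueError("rasters must have the same number of rows")
    out: Raster = []
    for g_row, o_row in zip(general, optimal):
        if len(g_row) != len(o_row):
            raise ValueError("rasters must have the same number of columns")
        out.append([
            None if g is None or o is None else g - o
            for g, o in zip(g_row, o_row)
        ])
    return out


# Hypothetical 2 x 3 SI rasters; the values are illustrative only.
general_si = [[22.0, 20.5, None], [19.0, 24.0, 21.5]]
optimal_si = [[21.0, 21.0, 20.0], [19.0, None, 20.0]]
diff = difference_map(general_si, optimal_si)
```

Positive values mark pixels where the general model predicts a higher SI than the local optimal model, and negative values mark the opposite.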

Figure 8. SI predictions and uncertainty patterns for the optimal and general Random Forest (RF) models in the ALRF. Left panels (a,d) show SI predictions, middle panels (b,e) show uncertainty (standard deviation, SD, of RF predictions), and right panels (c,f) show uncertainty distributions.
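
One common way to obtain a per-pixel uncertainty like the one in panels (b,e) is to take the spread of the individual tree predictions within the forest. The sketch below assumes that the per-tree predictions for each pixel are already available as plain lists and that the sample standard deviation is wanted. The function name `rf_mean_and_sd` and the numbers are illustrative, and the study's exact procedure may differ.

```python
import statistics
from typing import Dict, List, Tuple


def rf_mean_and_sd(tree_predictions: List[float]) -> Tuple[float, float]:
    """Summarise one pixel: mean of the per-tree predictions (the forest
    prediction for regression) and their sample standard deviation."""
    if len(tree_predictions) < 2:
        raise ValueError("need predictions from at least two trees")
    return statistics.mean(tree_predictions), statistics.stdev(tree_predictions)


# Hypothetical per-tree SI predictions for two pixels (three trees each).
pixel_trees: Dict[str, List[float]] = {
    "pixel_a": [20.0, 22.0, 24.0],  # trees disagree: higher uncertainty
    "pixel_b": [21.0, 21.0, 21.0],  # trees agree: zero spread
}
summary = {name: rf_mean_and_sd(preds) for name, preds in pixel_trees.items()}
```

Mapping the second element of each tuple gives an uncertainty surface, and a histogram of those values gives a distribution plot of the kind shown in panels (c,f).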

Table 1. Summary of the characteristics of trees in each of the three research areas.

Site	Variables	Median	Mean	Std. Dev	Maximum	Minimum
Aleza Lake	DBH	15.10	15.50	5.21	27.50	2.20
	Height	10.35	10.75	3.74	17.85	2.33
	Age	28.00	27.17	3.92	30.00	9.00
	Site Index	23.09	22.84	3.69	34.52	13.28
Eagle Hills	DBH	12.60	13.22	3.49	23.00	7.10
	Height	8.50	8.69	1.95	12.80	4.45
	Age	21.00	20.88	3.09	25.00	18.00
	Site Index	20.40	20.39	2.37	25.16	12.01
Deception	DBH	13.30	12.87	4.40	21.60	2.00
	Height	8.60	8.22	2.34	13.65	2.10
	Age	26.00	22.85	5.66	31.00	9.00
	Site Index	20.87	20.50	3.16	26.60	13.26

Table 2. Satellite- and ALS terrain-derived predictor variables retained in each region following variance inflation factor (VIF) analysis (retained variables had a VIF < 10). “Yes” indicates a variable retained in the final model for that region.

No	Variable	Aleza Lake	Deception	Eagle Hills
	Sentinel-2
1	EVI8a (Enhanced vegetation index using b8a)		Yes
2	EVI (Enhanced vegetation index)	Yes		Yes
3	GNDVI (Green normalized difference vegetation index)	Yes	Yes	Yes
4	MTCI (MERIS terrestrial chlorophyll index)	Yes	Yes	Yes
5	S2REP (Sentinel-2 red edge position)	Yes	Yes	Yes
	ALS-terrain
6	DTM (Digital terrain model)	Yes	Yes	Yes
7	Aspect	Yes	Yes	Yes
8	Convergence Index	Yes	Yes	Yes
9	DAH (Diurnal Anisotropic Heating)	Yes	Yes	Yes
10	G-Curvature (General Curvature)	Yes	Yes	Yes
11	MRRTF (Multiresolution index of ridge top flatness)	Yes	Yes	Yes
12	MRVBF (Multiresolution index of valley bottom flatness)	Yes	Yes	Yes
13	P-Openness (Topographic Openness dominance)		Yes	Yes
14	N-Openness (Topographic Openness enclosure)	Yes	Yes	Yes
15	Overland Flow horizontal distance	Yes	Yes	Yes
16	Slope	Yes	Yes	Yes
17	T-Curvature (Total Curvature)	Yes	Yes	Yes
18	TRI (Terrain Ruggedness Index)	Yes		Yes
19	TPI (Topographic Position Index)	Yes	Yes	Yes
20	TWI (Topographic Wetness Index)	Yes	Yes	Yes
21	Vertical Distance		Yes
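
The VIF < 10 screening behind Table 2 can be illustrated with a short sketch. It relies on the standard identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictor correlation matrix. The iterative rule used here (drop the predictor with the highest VIF until all remaining values are below the threshold) is an assumption, since the exact procedure is not described in this section, and the predictor values are invented for illustration.

```python
import math
from typing import Dict, List


def correlation_matrix(cols: List[List[float]]) -> List[List[float]]:
    """Pearson correlation matrix of equally long predictor columns."""
    n = len(cols[0])
    centred = [[v - sum(c) / n for v in c] for c in cols]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centred]
    if any(norm == 0.0 for norm in norms):
        raise ValueError("a predictor has zero variance")
    p = len(cols)
    return [[sum(a * b for a, b in zip(centred[i], centred[j]))
             / (norms[i] * norms[j]) for j in range(p)] for i in range(p)]


def invert(m: List[List[float]]) -> List[List[float]]:
    """Gauss-Jordan matrix inversion with partial pivoting."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(data: Dict[str, List[float]]) -> Dict[str, float]:
    """VIF per predictor: diagonal of the inverse correlation matrix."""
    names = list(data)
    inv = invert(correlation_matrix([data[k] for k in names]))
    return {name: inv[i][i] for i, name in enumerate(names)}


def filter_by_vif(data: Dict[str, List[float]],
                  threshold: float = 10.0) -> List[str]:
    """Drop the highest-VIF predictor until all VIFs are below threshold."""
    kept = dict(data)
    while len(kept) > 1:
        scores = vif(kept)
        worst = max(scores, key=lambda name: scores[name])
        if scores[worst] < threshold:
            break
        del kept[worst]
    return list(kept)


# Invented values: "slope" and "tri" are nearly collinear, "twi" is not.
predictors = {
    "slope": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "tri": [1.1, 2.0, 2.9, 4.2, 5.0, 6.1],
    "twi": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],
}
kept = filter_by_vif(predictors)
```

In this toy case one of the two nearly collinear terrain variables is removed and the other is kept alongside `twi`, which mirrors how a region can end up retaining a different subset of the 21 candidate variables.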

Table 3. Coefficient of determination (R²) from 10-fold cross-validation, Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE) for the Random Forest models tested at each site.

Random Forest Model	10-Fold Cross-Validation (R²)		Mean Squared Error (MSE)		Root Mean Squared Error (RMSE)		Mean Absolute Error (MAE)
	ALS	ALS + Satellite (The Local Optimal Model)	ALS	ALS + Satellite (The Local Optimal Model)	ALS	ALS + Satellite (The Local Optimal Model)	ALS	ALS + Satellite (The Local Optimal Model)
ALRF	0.40	0.63	8.88	5.76	2.98	2.40	2.28	1.85
Deception	0.40	0.44	6.25	6.10	2.50	2.47	2.06	2.02
Eagle Hills	0.46	0.56	2.62	2.25	1.62	1.50	1.30	1.23
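
The error measures in Table 3 follow standard definitions, sketched below in plain Python. The R² here is computed as 1 − SS_res/SS_tot, which is an assumption: the definition used in the study is not stated in this section, and some modelling packages report the squared correlation between observed and predicted values instead. The helper names and toy observations are hypothetical, and the final loop only checks that the RMSE values reported in Table 3 equal the square roots of the reported MSE values to two decimals.

```python
import math
from typing import Dict, List, Sequence


def regression_metrics(observed: Sequence[float],
                       predicted: Sequence[float]) -> Dict[str, float]:
    """MSE, RMSE, MAE and R2 (1 - SS_res / SS_tot) for paired values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae,
            "R2": 1.0 - (mse * n) / ss_tot}


def kfold_indices(n: int, k: int = 10) -> List[List[int]]:
    """Assign sample indices to k folds in round-robin order (no shuffle)."""
    return [list(range(fold, n, k)) for fold in range(k)]


# Toy SI observations and held-out predictions (illustrative values only).
metrics = regression_metrics([20.0, 22.0, 24.0, 26.0],
                             [21.0, 21.0, 25.0, 25.0])

# (MSE, RMSE) pairs as reported in Table 3: ALS first, ALS + Satellite second.
table3 = {
    "ALRF": [(8.88, 2.98), (5.76, 2.40)],
    "Deception": [(6.25, 2.50), (6.10, 2.47)],
    "Eagle Hills": [(2.62, 1.62), (2.25, 1.50)],
}
rmse_consistent = all(abs(math.sqrt(mse) - rmse) < 0.005
                      for pairs in table3.values() for mse, rmse in pairs)
```

In a cross-validated setting, each fold returned by `kfold_indices` would be held out once while the model is fit on the remaining samples, and the metrics would be computed on the held-out predictions.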

Table 4. Coefficient of determination (R²) from repeated 10-fold cross-validation, Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE) for the local optimal and general models at each site.

Random Forest Model	10-Fold Cross-Validation (Repeated CV) (R²)		Mean Squared Error (MSE)		Root Mean Squared Error (RMSE)		Mean Absolute Error (MAE)
	Local Optimal	General	Local Optimal	General	Local Optimal	General	Local Optimal	General
ALRF	0.63	0.63	5.76	5.76	2.40	2.40	1.85	1.87
Deception	0.44	0.41	6.10	6.40	2.47	2.53	2.02	2.06
Eagle Hills	0.56	0.52	2.25	2.49	1.50	1.58	1.23	1.28

