Modeling Primary Production in Temperate Forests Using Three-Dimensional Canopy Structural Complexity Metrics Derived from Airborne LiDAR Data

Siddiqui, Tahrir; Alveshere, Brandon; Gough, Christopher; van Aardt, Jan; Krause, Keith

doi:10.3390/rs17162817

by Tahrir Siddiqui ¹,*, Brandon Alveshere ², Christopher Gough ², Jan van Aardt ¹ and Keith Krause ³

¹ Chester F. Carlson Center for Imaging Science, Rochester Institute of Technology, Rochester, NY 14623, USA
² Department of Biology, Virginia Commonwealth University, Richmond, VA 23284, USA
³ Battelle, National Ecological Observatory Network, Boulder, CO 80301, USA
* Author to whom correspondence should be addressed.

Remote Sens. 2025, 17(16), 2817; https://doi.org/10.3390/rs17162817

Submission received: 25 June 2025 / Revised: 26 July 2025 / Accepted: 5 August 2025 / Published: 14 August 2025

(This article belongs to the Special Issue Digital Modeling for Sustainable Forest Management)

Abstract

Accurate and scalable estimation of forest production is essential for quantifying carbon sequestration, forecasting timber yields, and guiding climate change mitigation strategies. While prior studies established a positive linkage between net primary production (NPP) and canopy structural complexity (CSC) metrics derived from terrestrial LiDAR, the spatial coverage of ground-based surveys is limited. Airborne laser scanning (ALS) could offer a rapid and spatially extensive alternative to terrestrial scanning, but the predictive capacity of ALS-derived CSC metrics for estimating forest production remains insufficiently explored. To address this gap, we derived a suite of three-dimensional (3D) CSC metrics from small-footprint, high-density ALS data collected by the National Ecological Observatory Network’s Airborne Observation Platform. We evaluated relationships between CSC metrics and the NPP of plots nested within seven deciduous and evergreen temperate forests. Optimal metric combinations for predicting NPP within and across forest types were identified using partial least squares regression coupled with recursive feature elimination. ALS-derived CSC metrics explained 77% (RMSE = 11%) and 76% (RMSE = 13%) of the variance in deciduous and evergreen forest plot NPP, respectively. Our findings demonstrate that 3D CSC metrics derived from high-density ALS are robust predictors of plot-level NPP, offering performance comparable to terrestrial scanners while enabling greater scalability and more efficient data acquisition.

Keywords: airborne LiDAR; canopy structural complexity; net primary production; regression analysis; statistical modeling

1. Introduction

Accurate estimation of forest production, or the amount of biomass produced over a given period, is essential to measuring carbon sequestration, forecasting timber yields, and informing forest management strategies to combat climate change [1,2,3]. Although remote sensing approaches have made significant advances in estimating standing biomass and carbon stocks, accurately quantifying forest production across large regions remains a persistent challenge [4].

Light detection and ranging (LiDAR) is a powerful tool for mapping three-dimensional (3D) vegetation structure. Studies using terrestrial LiDAR have revealed a strong correlation between canopy structural complexity (CSC), which describes 3D variability in vegetation density and distribution, and net primary production (NPP), measured as the accumulation of aboveground biomass over time, in temperate forests [5,6,7]. In a multi-site study across ten National Ecological Observatory Network (NEON) temperate forest sites, Gough et al. [5] found that three CSC metrics—canopy rugosity, top rugosity, and rumple—derived from portable canopy LiDAR (PCL) exhibited strong correlations with site-level NPP (R² = 0.83, 0.58, 0.77, respectively), outperforming other commonly used metrics. Nonetheless, PCL data acquisition is labor-intensive and limited to narrow canopy transects, making it unfeasible for spatially comprehensive sampling of CSC over extensive forested regions.

Airborne laser scanning (ALS) presents a promising approach for leveraging CSC measurements to estimate forest production across large, contiguous landscapes. Recent advances in airborne remote sensing and computational processing have enabled ALS systems to generate dense 3D point clouds, opening new avenues for fine-scale forest analysis over broad regions. While CSC–NPP relationships are well established in ecological literature [8,9,10,11], the capability of modern ALS systems to quantify CSC traits that drive forest production remains underexplored. In particular, the link between stand-scale production and CSC metrics derived from high point density (>10 points/m²; hereafter “high-density”) ALS data has yet to be systematically investigated.

There is strong theoretical support for the use of ALS-derived CSC metrics in predicting NPP. For example, canopy light use efficiency (LUE)—a key driver of NPP—is directly influenced by the spatial arrangement of foliage within the canopy [12]. Structurally complex canopies facilitate greater light interception and use light more efficiently to drive production [13,14]. Empirical studies using PCL have validated these concepts, demonstrating positive associations between rugosity, a metric that captures internal canopy complexity, and both LUE and NPP in temperate forest stands [5,6]. CSC is an emergent property of forests that is dependent on numerous factors that also affect NPP, including stand age, soils, climate, and species composition. Consequently, CSC metrics are integrative measures that to some extent account for differences in these factors simultaneously [15]. We hypothesize that high-density ALS data can resolve sufficient internal structure to enable the derivation of meaningful CSC metrics tied to stand-scale NPP, given that the detection of sub-canopy structure increases with ALS point density [16].

While several studies use ALS-derived CSC to infer biomass pools or carbon fluxes, none focus on NPP. LaRue et al. [17] showed that two ALS-derived measures of canopy heterogeneity, even from low-density acquisitions, were strong predictors of plot-level basal area, a proxy for wood productivity, across 19 US forest sites. Liu et al. [18] reported a moderately strong site-level correlation (r = 0.62, p < 0.01) between an ALS-derived CSC metric and gross primary production (GPP) measured from eddy covariance (EC) flux towers. However, EC-based GPP estimates are subject to considerable uncertainty [19] and lack the spatial resolution necessary for sub-hectare production assessments. Other ALS-based approaches have predicted forest production indirectly, either through site index (SI), defined as the average height of dominant trees at a given index age [20], or via growth simulation models [21,22,23]. These methods, however, rely on ancillary data, such as stand age and species composition, which are difficult to obtain accurately over large, heterogeneous landscapes. Consequently, such studies have largely been limited to individual sites with relatively homogeneous species assemblages and uniform environmental conditions.

Systematically linking CSC traits to NPP remains challenging due to the multifaceted nature of forest structure and its complex interactions with biotic and abiotic factors that influence production. The increasing availability of high-resolution ALS data across diverse forest types, in conjunction with spatially and temporally aligned field inventory datasets, presents a promising opportunity to address this challenge. Yet, empirical studies have not leveraged such data to directly estimate field-derived forest production at sub-hectare scales across multiple sites, representing a critical gap in remote sensing approaches for scalable, biome-wide production assessment. The primary objective of this study is to address this gap by evaluating the capacity of small-footprint, high-density ALS data to capture CSC metrics predictive of plot-scale forest production across North America’s temperate biome. Specifically, we

(1): Derived a suite of 3D CSC metrics from high-density ALS data acquired by the NEON Airborne Observation Platform (AOP) [24] at seven forested NEON sites, spanning six ecoclimatic domains and three forest types: deciduous, evergreen, and mixed (deciduous and evergreen).
(2): Evaluated how NPP, estimated from field inventories, relates to ALS-derived 3D CSC metrics using a novel modeling framework that combines partial least squares regression with recursive feature elimination.

We modeled NPP separately for deciduous and evergreen forest plots to determine whether CSC–NPP relationships vary with forest type, followed by combined modeling across all forest types. Additionally, we assessed how the grid resolution used to derive CSC metrics influences NPP prediction accuracy, offering new insights into the spatial scales at which CSC traits regulate plot-scale forest production in temperate ecosystems.

2. Materials and Methods

2.1. Study Sites and Field Data

We conducted our analysis using 40 × 40 m “tower base” plots (see Figure S1) from seven temperate forest sites in the NEON, representing diverse ecological and climatic conditions (Figure S2). These plots are arrayed following a spatially balanced, randomized design at each site, and data collection procedures are standardized across sites. Forested NEON sites were selected based on the following criteria:

(1): ALS data were collected using a high point density LiDAR scanner, specifically AOP Payload 3 [24], which was the only payload employing a high-density scanner [25] within our analysis windows.
(2): The site included at least five plots with repeat diameter-at-breast-height (DBH) measurements over a 2–4-year period, overlapping with the year of AOP data collection, for calculating NPP.
(3): No major disturbances occurred during the NPP measurement window.

Criterion 1 ensures greater penetration beneath the canopy surface and denser sampling of both the canopy and the ground compared to AOP payloads that use low-density scanners. This increases the likelihood of producing ecologically relevant CSC metrics, since interior canopy features influence NPP [26]. Criterion 2 is essential for the calculation of NPP [5]. Criterion 3 ensures that tree mortality, caused by natural perturbations or anthropogenic activities, does not disrupt the CSC–production relationships previously observed in intact, undisturbed forests [27]. We screened sites meeting criteria 1 and 2 for disturbances using NEON’s site management and event reporting document [28], retaining only plots unaffected by harvesting or recent canopy disturbances. This screening process yielded 49 forest plots across seven sites, consisting of 26 deciduous plots from five sites, 14 evergreen plots from three sites, and 9 mixed plots from three sites (Table 1). The forest type for each plot was obtained from the 30 m resolution National Land Cover Database (NLCD) and subsequently verified through ground-truthing by NEON field teams [29].

2.2. Field-Derived NPP

Wood NPP (Mg ha⁻¹ yr⁻¹) was calculated over a 2–4-year period, depending on data availability, from repeated stem diameter measurements of live trees ≥ 3 cm DBH in each plot. Plot center and corner coordinates were recorded with sub-meter accuracy (≤30 cm) using Trimble GPS [30]. Diameter measurements from the two 400 m² subplots used for vegetation sampling in each 1600 m² tower plot were extracted from NEON vegetation structure data [31]. NEON samples two fixed 400 m² subplots in each bout, totaling 800 m² per plot (Figure S1). A 2–4-year sliding temporal window was employed to accommodate differences in sampling among sites, consistent with prior studies [5]. The generalized allometries of Chojnacky et al. [32] were used to infer aboveground dry weight biomass (AGB) from stem diameters. Sampling period 1 and 2 AGB estimates were set equal when the estimate for a tree was greater in period 1 than in period 2, to account for field measurement errors. The total AGB increment over the 2–4-year period was then calculated by summing tree AGB by respective plot and year and subtracting total AGB in period 1 from period 2. Wood NPP was derived by scaling the total AGB increment for the sampled area (800 m²) to the hectare scale, then dividing by the respective temporal window (2–4 years). Summary statistics of the field-derived NPP values, along with the dominant canopy species for each site, are provided in Table S1.
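The scaling steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the assumption that per-tree AGB is supplied in kilograms, and the assumption that both periods list the same trees in the same order are all hypothetical. The allometric conversion from DBH to AGB is deliberately left out.

```python
def wood_npp(agb_start_kg, agb_end_kg, years, sampled_area_m2=800.0):
    """Wood NPP in Mg ha^-1 yr^-1 from per-tree AGB in two sampling periods.

    Assumptions (not from the paper): AGB is in kg per tree, and both
    sequences describe the same trees in the same order.
    """
    if len(agb_start_kg) != len(agb_end_kg):
        raise ValueError("both periods must list the same trees")
    if years <= 0:
        raise ValueError("years must be positive")

    increment_kg = 0.0
    for start, end in zip(agb_start_kg, agb_end_kg):
        # When period 1 exceeds period 2, the two estimates are set equal,
        # so the tree contributes a zero increment.
        increment_kg += max(end, start) - start

    # kg -> Mg, then scale the sampled area (800 m^2) to one hectare.
    increment_mg_per_ha = (increment_kg / 1000.0) * (10_000.0 / sampled_area_m2)
    return increment_mg_per_ha / years
```

For example, three trees with AGB of 100, 200, and 50 kg in period 1 and 120, 190, and 80 kg in period 2 give a 50 kg increment over 800 m², which is 0.625 Mg ha⁻¹, or about 0.31 Mg ha⁻¹ yr⁻¹ over a 2-year window.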

2.3. Airborne LiDAR Data Collection and Processing

Airborne LiDAR data were acquired during AOP flight campaigns [24] using a Riegl LMS-Q780 [25] small-footprint LiDAR scanner during the “peak foliar greenness” period. This period is determined from time-series Enhanced Vegetation Index (EVI) data, calculated from Moderate Resolution Imaging Spectroradiometer (MODIS) imagery [33], and typically runs from May through August in US temperate forest sites. The year of data collection for each site is provided in Table 1. Flights were conducted at a nominal altitude of 1000 m above ground level (AGL), with a flying speed of about 50 m/s and 37% flight-line overlap [34]. The beam divergence of transmitted laser pulses was 0.25 mrad (1/e²), producing a nadir footprint size of 0.25 m at 1000 m AGL, and the scan angle ranged from −18 to +18 degrees across all analyzed sites. The average pulse density was 8.9 pulses/m², and up to seven discrete returns were recorded per pulse. The resulting point clouds had an average density of 22 points/m² across all plots, with each point geolocated using onboard GPS and inertial measurement unit (IMU) data [34]. The discrete-return LiDAR data were stored in LAS 1.3 format, with horizontal coordinates referenced to a UTM map projection and ITRF00 datum and elevations referenced to Geoid12A. Additionally, the data vendor processed and classified the point clouds using LAStools v. 230901 [35], providing the classified data in 1 × 1 km tiles.
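As a quick check on the acquisition parameters, the footprint diameter follows from the small-angle approximation (neglecting the exit aperture, which is an assumption here), and the ratio of the two reported densities gives a rough number of returns per pulse. The symbols d, H, and γ are introduced only for this illustration, and the two densities may have been averaged over different areas.

```latex
d \approx H \gamma = 1000\ \text{m} \times 0.25 \times 10^{-3}\ \text{rad} = 0.25\ \text{m}

\frac{22\ \text{points/m}^2}{8.9\ \text{pulses/m}^2} \approx 2.5\ \text{returns per pulse}
```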

Classified point cloud data tiles [34] overlapping the NEON tower plots at each site were clipped to individual 40 × 40 m plots (hereafter referred to as “plot point clouds”) using plot boundary polygons provided in NEON’s document library [36]. Clipping was performed using the clip_roi function from the lidR package [37,38]. To ensure data quality, duplicate points were removed from each plot point cloud using the lasduplicate command within LAStools. Finally, only points classified as “ground” (LAS 1.3 class 2), “medium vegetation” (class 4), and “high vegetation” (class 5) were retained for analysis.
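The duplicate removal and class filtering can be illustrated with NumPy. This is only a sketch of the logic, not the lidR/LAStools workflow used in the study; the function name is hypothetical, and treating duplicates as points with identical x, y, and z is an assumption that may differ from the lasduplicate mode the authors ran.

```python
import numpy as np

# LAS classes retained for analysis: ground, medium and high vegetation.
KEEP_CLASSES = (2, 4, 5)


def clean_plot_points(xyz, classification):
    """Drop exact duplicate points and keep only the retained LAS classes."""
    xyz = np.asarray(xyz, dtype=float)
    cls = np.asarray(classification)

    # Index of the first occurrence of every unique (x, y, z) row.
    _, first_idx = np.unique(xyz, axis=0, return_index=True)
    keep = np.zeros(len(xyz), dtype=bool)
    keep[first_idx] = True

    # Combine with the class filter.
    keep &= np.isin(cls, KEEP_CLASSES)
    return xyz[keep], cls[keep]
```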

2.4. 3D CSC Metrics for NPP Estimation

We derived a suite of nine 3D CSC metrics designed to characterize the structural complexity in different portions of the canopy, guided by past studies that have linked stand-scale CSC metrics to aboveground production [5,6,17]. For eight of these metrics, plot point clouds were first voxelized into vertical columns of laser returns at a specified horizontal resolution (Figure 1). Metric-specific derivations were then applied on the columns to produce a raster for each plot, from which CSC metrics were subsequently computed. We computed two measures of horizontal variability (v1 and v2) and one measure of horizontal contiguity (Moran’s I) for each raster-based metric in order to capture different aspects of spatial variability. Detailed descriptions and mathematical formulations of all nine metrics, along with their variants, are provided in Table 2.
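The column-then-raster idea can be sketched with a single illustrative metric. The code below bins returns into square columns, stores the maximum height per column, and summarizes horizontal variability as the standard deviation across cells. The function names are hypothetical, plot-local coordinates in [0, extent] and ground-normalized heights are assumed, and the standard deviation is only a stand-in: the exact v1 and v2 formulations are those given in Table 2.

```python
import numpy as np


def max_height_grid(x, y, z, res, extent=40.0):
    """Raster of maximum return height per column; empty cells are NaN.

    Assumes x and y are plot-local coordinates in [0, extent] (metres)
    and z is height above ground.
    """
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    n = int(np.ceil(extent / res))
    grid = np.full((n, n), np.nan)

    # Column indices; points on the far edge fall into the last cell.
    col = np.minimum((x // res).astype(int), n - 1)
    row = np.minimum((y // res).astype(int), n - 1)

    for r, c, h in zip(row, col, z):
        if np.isnan(grid[r, c]) or h > grid[r, c]:
            grid[r, c] = h
    return grid


def horizontal_variability(grid):
    """Sample standard deviation across non-empty cells (illustrative only)."""
    return float(np.nanstd(grid, ddof=1))
```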

All 3D CSC metrics derived in this study are conceptually grounded in prior ecological research. While Rumple [41], TopRug_v1, and CanHet_v1 [17] have previously been applied, the remaining metrics are novel formulations that were designed to capture ecologically meaningful aspects of canopy structural complexity. Notably, canopy rugosity was developed as a top-down, 3D adaptation of the PCL-derived version [7]. We introduced several new complexity measures to quantify structural variability across distinct vertical strata of the canopy. Upper, mean, and lower rugosity quantify vegetation height variability in the upper, middle, and lower canopy, respectively, following the same principle as top rugosity. Entropy variability was introduced to assess whether the horizontal variability in foliage height diversity (FHD), an entropy-based CSC index typically measured at the plot level [42], influences NPP. Similarly, Pz_abovemean was developed to evaluate whether horizontal variability in the percentage of laser hits above mean canopy height—a proxy for upper canopy vegetation density—is strongly related to NPP.

We developed a workflow in R v.4.3.1 [43] using functions in the lidR package to compute Rumple and generate the grids for all other 3D CSC metrics. This was achieved by passing the plot LAS file, the relevant grid function (provided under “computational description” in Table 2), and the grid resolution as arguments into the pixel_metrics function. Subsequent computation of the three variants of each raster-based metric was performed using Python 3.10.11. All code for metric derivation, including procedures for handling empty raster cells (no hits), is available in the repository provided in the data availability statement.
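For the contiguity variant, a minimal sketch of global Moran's I on a metric raster is given below. It uses the standard formula with binary rook (edge-sharing) weights and skips empty cells stored as NaN. Both choices are assumptions for illustration; the weighting scheme and empty-cell handling actually used are documented in Table 2 and the authors' repository.

```python
import numpy as np


def morans_i(grid):
    """Global Moran's I with binary rook adjacency; NaN cells are ignored."""
    grid = np.asarray(grid, dtype=float)
    z = grid - np.nanmean(grid)  # deviations from the mean of valid cells

    cross_sum = 0.0  # sum over i, j of w_ij * z_i * z_j
    weight_sum = 0   # sum over i, j of w_ij
    # Horizontally and vertically adjacent cell pairs.
    for a, b in ((z[:, :-1], z[:, 1:]), (z[:-1, :], z[1:, :])):
        valid = ~np.isnan(a) & ~np.isnan(b)
        # Each neighbouring pair appears twice in the symmetric weight matrix.
        cross_sum += 2.0 * np.sum(a[valid] * b[valid])
        weight_sum += 2 * int(np.count_nonzero(valid))

    n_cells = int(np.count_nonzero(~np.isnan(z)))
    return (n_cells / weight_sum) * cross_sum / np.nansum(z ** 2)
```

A perfect checkerboard raster yields I = −1 under this definition, while spatially clustered values push I toward +1. The function does not guard against rasters with no valid neighbour pairs or zero variance.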

2.5. Sensitivity of 3D CSC Metrics to Grid Resolution

Previous studies employing grid-based CSC metrics derived from LiDAR point clouds used a binning interval of 1 m [5,7,44]. However, a fixed 1 m resolution may not adequately capture structural variability relevant to stand-scale processes influencing NPP across all CSC metrics developed in this or earlier work. Recent work has demonstrated significant relationships between complexity metrics derived at coarser grid resolutions and forest productivity (as estimated by NDVI) at stand-to-patch scales [45]. Yet, to our knowledge, no studies have evaluated how the grid resolution used to calculate CSC metrics affects their relationship with NPP at the stand scale. Given the absence of a priori knowledge on the optimal grid resolution for predicting plot-scale production, we systematically evaluated the sensitivity of each CSC metric to grid resolution. Metrics were derived at resolutions ranging from 1 to 10 m in 1 m increments, yielding 10 values per metric. To assess sensitivity, we analyzed Pearson correlations (R) between each metric at coarser resolutions and the baseline 1 m resolution. Following the convention that R ≥ 0.7 indicates a strong correlation [46], metrics were considered resolution-stable if they maintained R > 0.7, indicating that their relationship with NPP remained consistent relative to the 1 m derivation.

As grid resolution increased, the following metrics fell below this threshold: rumple, canopy heterogeneity, entropy variability, percent hits above mean height variability, and all measures of Moran’s I. The grid resolutions at which these scale-sensitive (ss) metrics first dropped below the correlation threshold were identified as “breakpoints.” For each breakpoint, a new set of CSC metrics, termed a “breakpoint set,” was created by substituting the original 1 m resolution of the corresponding ss metric with its breakpoint resolution. For instance, if a metric’s first breakpoint occurred at 4 m, that resolution replaced 1 m in breakpoint set 1. This procedure was repeated for the first breakpoint of each ss metric. At the second breakpoint, the previously substituted resolution (e.g., 4 m) was updated to the next breakpoint value, forming breakpoint set 2, and so forth (Figure 2). Breakpoints for the ss metrics began to emerge at 3 m and continued through 10 m, resulting in eight breakpoint sets. Non-ss metrics retained a fixed grid resolution of 1 m across all sets. We modeled NPP separately with each set to ensure that all breakpoints of ss metrics were evaluated. This approach provides an efficient alternative to exhaustively testing all possible metric–resolution combinations, while ensuring a comprehensive evaluation of scale sensitivity.
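The correlation screen behind the breakpoints can be sketched as follows. The input layout (a dictionary mapping grid resolution in metres to the per-plot values of one metric) and the function names are hypothetical, and only the first breakpoint is returned. How later breakpoints are assigned to sets follows Figure 2 and is not reproduced here.

```python
import numpy as np


def resolution_correlations(values_by_res, baseline=1):
    """Pearson R between a metric at each resolution and the baseline one."""
    base = np.asarray(values_by_res[baseline], dtype=float)
    return {
        res: float(np.corrcoef(base, np.asarray(vals, dtype=float))[0, 1])
        for res, vals in sorted(values_by_res.items())
        if res != baseline
    }


def first_breakpoint(values_by_res, threshold=0.7, baseline=1):
    """Finest resolution whose correlation with the baseline drops below
    the threshold, or None if the metric is resolution-stable."""
    for res, r in resolution_correlations(values_by_res, baseline).items():
        if r < threshold:
            return res
    return None
```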

2.6. Modeling and Statistical Analyses

2.6.1. Combining PLS-CV and RFE to Identify Candidate Models

We employed partial least squares (PLS) regression, in conjunction with recursive feature elimination (RFE), to identify the optimal combination of 3D CSC metrics for maximizing NPP estimation accuracy and to address high positive collinearity among several predictors (Figure S3). PLS regression is highly effective in addressing multicollinearity issues that commonly occur in multiple linear regression (MLR), especially when working with small sample sizes [47,48].

The PLS algorithm can extract as many components (latent variables) as there are predictors; however, including too many components risks overfitting [49]. We thus implemented cross-validation to determine the optimal number of components (hereafter, “PLS-CV”), selecting the number that maximized the cross-validated coefficient of determination (R²), referred to as the “CV score.” For model training and component selection, we used five-fold cross-validation (CV) for all plots (n = 49) and deciduous plots (n = 26). Given the site-wise distribution of samples (see Table 1), this approach ensured that samples from five to six out of seven sites for all forest types and three to four out of five sites for the deciduous class were used for model calibration, thereby enhancing the generalizability of the model across multiple sites and regional conditions. For the evergreen class, we used leave-one-out cross-validation (LOOCV) to determine the optimal number of components due to the small sample size (n = 14), such that two of the three sites were used for model calibration.

The absolute values of the standardized regression coefficients were used to rank the importance of individual metrics in the fitted PLS models. RFE was then applied to iteratively eliminate the least important metric. After each elimination, the data were re-fitted, and the CV score was recorded, producing 25 models in total, corresponding to the number of metrics. The model with the highest CV score was then selected as a candidate model. Cross-validation is an effective method for assessing model estimation accuracy when additional validation data are unavailable, as is the case in our study. This combined PLS-CV and RFE workflow (Figure 3) was applied to all nine sets of CSC metrics to identify candidate models (one per set) for (1) all forest types, (2) deciduous forests, and (3) evergreen forests.

2.6.2. Selecting the Top Candidate Models

Candidate models with CV scores below 0.5 were excluded due to weak out-of-sample prediction strength, after which the corrected Akaike information criterion (AIC_C) was calculated for each of the remaining candidate models. The AIC is an effective measure of a model’s overall quality, with a lower AIC value indicating a more parsimonious and better-fitting model [50]. AIC_C is a bias-corrected version of AIC, better suited for small sample sizes or when the ratio of data points to the number of parameters is less than 40 [51], both of which apply to our study. Therefore, the candidate models with the lowest AIC_C scores were selected as the top candidate models. AIC and AIC_C were calculated using Equations (1) and (2), respectively.

AIC = 2k + n ln (MSE)

(1)

{AIC}_{C} = AIC + \frac{2 k (k + 1)}{n - k - 1}

(2)

where k is the number of parameters (metrics), n is the sample size, and MSE is the mean squared error of calibration.
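Equations (1) and (2) translate directly into code. The sketch below is illustrative, with hypothetical function names; it takes the calibration MSE, the sample size n, and the number of metrics k as inputs.

```python
import math


def aic(mse, n, k):
    """Equation (1): AIC = 2k + n ln(MSE)."""
    return 2 * k + n * math.log(mse)


def aicc(mse, n, k):
    """Equation (2): small-sample correction added to AIC."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n - k - 1 <= 0")
    return aic(mse, n, k) + (2 * k * (k + 1)) / (n - k - 1)
```

With n = 26 and k = 4, the correction term alone is 2·4·5/21 ≈ 1.9, which shows why the corrected form matters at these sample sizes.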

2.6.3. Stepwise Elimination of Statistically Insignificant Metrics from the Top Candidate Models

When both variants (v1 and v2) of a given metric were included in a top candidate model, we removed the variant with the lower absolute standardized coefficient if the two shared the same sign and exhibited a strong correlation (R > 0.9), thereby eliminating redundancy. We then performed MLR using ordinary least squares (OLS) to evaluate the statistical significance of each remaining metric based on p-values (p), since PLS regression coefficients do not inherently carry p-values. Metric coefficients with p > 0.05 were considered statistically insignificant, following the conventional threshold for significance [52]. In cases where multiple metrics exceeded this threshold, the variable with the highest p was removed first. The removal of a variable can affect the variance of the remaining coefficients, potentially altering the p-values of other variables that previously exceeded the threshold. We therefore re-ran MLR after each elimination, removing the statistically insignificant metric with the highest p until all retained metrics had p < 0.05. After these removal steps, we re-applied PLS-CV and assessed the predictive performance of the resulting model, hereafter referred to as the “best model,” in terms of R², root mean squared error (RMSE), and relative RMSE (RMSE_r) in percentage, calculated using Equations (3) and (4), respectively.
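The p-value-driven elimination loop can be sketched with NumPy and SciPy. This is an illustrative reimplementation under stated assumptions, not the authors' code: the function names are hypothetical, an intercept is always included, and p-values come from the usual two-sided t-test on OLS coefficients.

```python
import numpy as np
from scipy import stats


def ols_pvalues(X, y):
    """Two-sided p-values of OLS slope coefficients (intercept excluded)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])  # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = len(y) - A.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    p = 2.0 * stats.t.sf(np.abs(beta / se), dof)
    return p[1:]


def backward_eliminate(X, y, names, alpha=0.05):
    """Refit after each removal, dropping the metric with the highest
    p-value until every retained metric has p <= alpha."""
    X = np.asarray(X, dtype=float)
    keep = list(range(X.shape[1]))
    while keep:
        p = ols_pvalues(X[:, keep], y)
        worst = int(np.argmax(p))
        if p[worst] <= alpha:
            break
        keep.pop(worst)
    return [names[i] for i in keep]
```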

RMSE = \sqrt{\frac{\sum_{i = 1}^{n} (ŷ_{i} - y_{i})^{2}}{n}}

(3)

{RMSE}_{r} = \frac{RMSE}{\max (y) - \min (y)}

(4)

where ŷ_i represents the estimated NPP of sample i, y_i represents the field-observed NPP of sample i, and n represents the total number of samples.
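The two error measures can be computed as below. The sketch uses squared residuals for RMSE and normalizes by the observed range for RMSE_r; the function names are hypothetical, and the factor of 100 is added only because the text reports RMSE_r in percent.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between estimated and field-observed NPP."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


def relative_rmse_percent(predicted, observed):
    """RMSE divided by the observed range, expressed in percent."""
    spread = max(observed) - min(observed)
    return 100.0 * rmse(predicted, observed) / spread
```

For instance, an RMSE of 1.18 Mg ha⁻¹ yr⁻¹ with an RMSE_r of 11% implies an observed NPP range of roughly 10.7 Mg ha⁻¹ yr⁻¹.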

2.6.4. Assessing the Performance of Scale-Sensitive Metrics Selected in the Best Model at Other Highly Correlated Grid Resolutions

We evaluated the robustness of scale-sensitive metrics selected in the best model by substituting them with alternative grid resolutions that exhibited strong within-group (all/deciduous/evergreen) correlations (R > 0.7) with the originally selected scale. This procedure allowed us to examine whether the selected ss metrics retained their predictive utility for NPP estimation across other highly correlated grid resolutions, thereby addressing limitations of the breakpoint set approach, which assessed sensitivity relative only to the baseline 1 m resolution across all plots (n = 49). Finally, we performed MLR for each substituted resolution to test the statistical significance of all metric coefficients. If all metrics remained statistically significant (p < 0.05), we re-fitted the PLS-CV model and recorded the resulting model performance.

3. Results

3.1. NPP Estimation in Deciduous Plots

The best candidate model for estimating NPP in deciduous plots yielded a CV R² = 0.71, calibration R² = 0.76, and AIC_C = 18.9. The second- and third-best candidate models had substantially higher AIC_C values of 29.1 and 31.5, respectively (see Table 3). The top candidate model included four metrics: CanRug_v1 (1 m resolution in all sets), Rumple derived at a 10 m scale, and both EntVar_v1 and EntVar_v2 derived at an 8 m scale. Rumple quantifies roughness of the outer canopy, while Canopy Rugosity and Entropy Variability measure the horizontal variability of vertical canopy heterogeneity. CanRug_v1 and CanRug_v2 consistently ranked among the top three predictors across all breakpoint sets, with one of the two emerging as the top predictor, and both exhibited positive coefficients in every candidate model. These results indicate that both variants are strong positive predictors of NPP in deciduous plots. Notably, CanRug_v1 exhibited a significant pairwise correlation with NPP (R = 0.67; p < 0.05), further reinforcing its utility as a key structural predictor of plot-level production in deciduous forests. Rumple_10m, identified as the first and only breakpoint for this metric, was consistently selected as a positive predictor in all sets where it was included (set 2 onwards). EntVar_v1 and EntVar_v2 derived at an 8 m scale were selected as negative predictors in the only breakpoint set in which they were included (set 5).

The two EntVar variants selected in the top candidate model were highly correlated (R = 0.98) and closely clustered in latent space (see Figure S4), strongly suggesting that both captured the same underlying information. EntVar_v1_8m was removed from the model, since it exhibited a lower absolute standardized coefficient. Substituting EntVar_v2_8m with EntVar_v1_8m and re-fitting the model resulted in a negligible drop in accuracy (R² = 0.75), confirming that either variant can be used interchangeably for NPP estimation in deciduous plots. The remaining metrics—CanRug_v1, Rumple_10m, and EntVar_v2_8m—were all statistically significant based on MLR p-values. We therefore employed PLS-CV with five-fold cross-validation using these metrics to obtain the best model, achieving an R² = 0.77, an RMSE = 1.18 Mg ha⁻¹ year⁻¹, and an RMSE_r = 11%. The model equation and additional evaluation metrics are reported in Table 4 and Figure 4a. Notably, the coefficients for CanRug_v1 and Rumple_10m match those in the top candidate model, while the coefficient for EntVar_v2 equaled the sum of the coefficients of both EntVar variants in that model, further supporting their functional equivalence.

Neither Rumple_10m nor EntVar_v2_8m exhibited a strong correlation (R > 0.7) with its counterparts at other derivation scales within the deciduous dataset. Consequently, we did not assess the impact of substituting alternative grid resolutions for these metrics on model performance. The strong agreement between field-derived and model-estimated NPP (Figure 5a) underscores the reliability of the final three-metric model for estimating plot-scale NPP in temperate deciduous forests.

3.2. NPP Estimation in Evergreen Plots

The top candidate model for estimating NPP in evergreen plots yielded a CV R² = 0.51, a calibration R² = 0.73, and an AIC_C = 15. The performance of all other candidate models was notably inferior, with the next best model having an AIC_C = 31.3. The equation of the top candidate model, which included five metrics, is as follows:

NPP = 3.21 + 0.83 TopRug_MoranI_4m − 0.62 EntVar_v2_6m − 0.55 Pz_abovemean_v1_7m − 0.53 TopRug_v2_1m − 0.48 TopRug_v1_1m

(5)

TopRug_v1 and TopRug_v2 (1 m resolution in all sets) ranked among the top five predictors across all breakpoint sets. One of the two emerged as the top predictor in all but one set, and both exhibited negative coefficients in every candidate model. These results indicate that both variants were strong negative predictors of NPP in evergreen plots. The three other metrics selected in the top model (TopRug_MoranI_4m, EntVar_v2_6m, and Pz_abovemean_v1_7m) are all scale-sensitive and were included only in breakpoint set 3. The two TopRug variants selected in the model were perfectly correlated (R = 1.00) and closely clustered in latent space (see Figure S5), indicating redundancy. As TopRug_v1 had a lower absolute standardized coefficient, it was removed from the model. Substituting TopRug_v2 with TopRug_v1 and re-fitting the model resulted in identical accuracy, confirming that the two variants are interchangeable for NPP estimation in evergreen plots. Subsequent stepwise elimination of statistically insignificant metrics led to the removal of Pz_abovemean_v1_7m (p = 0.22). Performing PLS-CV with LOOCV on the three remaining metrics yielded an R² = 0.68, an RMSE = 1 Mg ha⁻¹ year⁻¹, and an RMSE_r = 15%. The model equation is as follows:

NPP = 3.21 − 1.02 TopRug_v2_1m + 0.83 TopRug_MoranI_4m − 0.85 EntVar_v2_6m

(6)

The model’s goodness-of-fit decreased marginally (ΔR² = 0.05) following the removal steps. The coefficient for TopRug_MoranI remained consistent, while the absolute coefficient of EntVar_v2 increased. Notably, the coefficient for TopRug_v2 equaled the sum of the coefficients of both TopRug variants prior to removal, further confirming their functional redundancy.

For the evergreen samples, TopRug_MoranI_4m exhibited strong correlations (R > 0.7) with all other grid resolutions of the metric except at 1 m and 10 m, while EntVar_v2_6m exceeded the R = 0.7 threshold only at 5 m resolution. We substituted each metric with its respective strongly correlated resolution and re-ran MLR to evaluate statistical significance. Among all combinations tested, the pairing of TopRug_MoranI_7m and EntVar_v2_5m was the only combination where all three metrics met the significance threshold (p < 0.05). This combination achieved improved performance with R² = 0.76, RMSE = 0.86 Mg ha⁻¹ year⁻¹, and RMSE_r = 13% (Figure 5b) when used to re-fit the PLS-CV model and was considered the best evergreen model due to its superior accuracy relative to the top candidate model (Table 4, Figure 4b). EntVar_v2_5m had a larger absolute coefficient compared to its earlier counterpart in Equation (6), while TopRug_v2_1m and TopRug_MoranI_7m exhibited lower absolute coefficients. The inclusion of two top rugosity variants in the best model suggests that height variability of the canopy surface strongly influences plot-scale NPP in temperate evergreen forests.

3.3. NPP Estimation Across All Plots

None of the breakpoint sets yielded a strong candidate model (CV score > 0.5) for estimating NPP across all forest types, with the highest CV R² reaching only 0.36 and the next best model achieving 0.10. Notably, one or both variants (v1 and v2) of canopy rugosity ranked among the top five predictors across all sets, each time with positive coefficients. Similarly, one or both variants (v1 and v2) of top rugosity were among the top five predictors in six out of nine sets, always with negative coefficients. These patterns mirror findings from forest type-specific models, where canopy rugosity emerged as a strong positive predictor of NPP in deciduous plots and top rugosity as a strong negative predictor in evergreen plots (Table 4). This suggests that, while these metrics capture key structural drivers of NPP within each forest type, they do not generalize well across both deciduous and evergreen forests.

4. Discussion

We identified strong biome-wide predictors of stand-scale NPP for deciduous and evergreen forests using a novel set of 3D CSC metrics derived from high-density ALS data. The NPP variation explained by the best forest-type-specific models (Figure 5) affirms the capability of high-density, single-acquisition ALS data to reliably scale up fine-scale forest production estimates across large temperate forest landscapes. However, none of the 3D CSC metrics emerged as strong NPP predictors when modeling the combined datasets of deciduous, evergreen, and mixed forest plots (R² ≤ 0.40).

4.1. Ecological Significance of Strong CSC Predictors of Deciduous Forest NPP

The cross-validated accuracy of the best NPP prediction model for deciduous forests (R² = 0.71) was substantially higher than that for evergreen forests (R² = 0.54), indicating stronger generalizability of the metrics identified in the former compared to the latter. However, disparities in sample size may have contributed to this difference in model performance, as fewer evergreen plots and sites were used in cross-validation. This smaller validation set may have limited the model’s ability to capture the full range of variability present in evergreen forests, thereby reducing its predictive strength.

Canopy rugosity v1 emerged as the strongest positive NPP predictor in deciduous forests, followed by Rumple_10m. This finding aligns with research using terrestrial LiDAR by Gough et al. [5], which found strong positive CSC-NPP correlations with PCL-derived canopy rugosity (R² = 0.83) and rumple (R² = 0.77) across temperate forests in the eastern US, predominantly composed of deciduous broadleaf trees. The strong positive association between ALS-derived canopy rugosity and plot-scale NPP indicates that the positive relationship reported for PCL-derived, transect-based rugosity [5] extends to spatially continuous, three-dimensional estimates in temperate deciduous forests. Within the deciduous class, CanRug_v2_1m exhibited a near-perfect positive correlation with CanRug_v1_1m, indicating interchangeability, and both variants exhibited correlations > 0.88 across the 1–10 m grid resolution range relative to their respective 1 m derivation scale, suggesting stable predictive power up to 10 m resolution. The emergence of 3D rumple, which quantifies canopy surface roughness, as a strong positive predictor of plot-scale NPP at 10 m grid resolution likely relates to the size of individual tree crowns in the upper canopy, which are often ≥ 10 m in diameter. Similarly, the selection of EntVar_v2 as a strong negative predictor of plot-scale NPP at the 8 m scale may also be connected to crown size. The inverse relationship between EntVar_v2_8m and plot-scale NPP suggests that higher horizontal variability in tree-level foliage height diversity is associated with lower production. This trend likely stems from the metric’s negative association with canopy closure, as illustrated in Figure S6. Among the plots with below-mean NPP, those exhibiting the highest EntVar_v2_8m values show canopy gaps in their CHMs, whereas plots with high NPP and low EntVar_v2_8m exhibit more continuous and densely packed canopies. 
These denser canopies are indicative of greater stand density, a structural trait known to promote stand productivity [53]. Together, the three CSC predictors in the best deciduous model—CanRug_v1, Rumple_10m, and EntVar_v2_8m—capture distinct but complementary aspects of canopy structure.
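
To make the rumple metric concrete, the ratio of canopy surface area to projected ground area can be sketched on a gridded surface. This is a simplified illustration, not the lidR rumple_index implementation used in the study: heights are treated as values at grid nodes spaced `res` metres apart, each grid cell is split into two triangles, and the function names are hypothetical.

```python
import math


def triangle_area(p, q, r):
    """Area of a 3D triangle from the cross product of two edge vectors."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    cx = uy * vz - uz * vy
    cy = uz * vx - ux * vz
    cz = ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)


def rumple_from_grid(heights, res):
    """Surface area of the triangulated grid divided by its projected area."""
    rows, cols = len(heights), len(heights[0])
    surface = 0.0
    for i in range(rows - 1):
        for j in range(cols - 1):
            a = (j * res, i * res, heights[i][j])
            b = ((j + 1) * res, i * res, heights[i][j + 1])
            c = (j * res, (i + 1) * res, heights[i + 1][j])
            d = ((j + 1) * res, (i + 1) * res, heights[i + 1][j + 1])
            surface += triangle_area(a, b, c) + triangle_area(b, d, c)
    projected = (rows - 1) * (cols - 1) * res * res
    return surface / projected


# Hypothetical 10 m grids: a flat canopy top and one with a tall emergent crown.
flat = [[20.0, 20.0, 20.0], [20.0, 20.0, 20.0], [20.0, 20.0, 20.0]]
rough = [[20.0, 20.0, 20.0], [20.0, 32.0, 20.0], [20.0, 20.0, 20.0]]

print(rumple_from_grid(flat, 10.0), rumple_from_grid(rough, 10.0))
```

A perfectly flat surface gives a ratio of 1, and the ratio grows as height differences between neighbouring nodes increase relative to the grid spacing, which is why the metric depends on the resolution at which it is derived.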

4.2. Ecological Significance of Strong CSC Predictors of Evergreen Forest NPP

The inclusion of two top rugosity metrics in the best evergreen model is consistent with the finding that structural complexity in evergreen conifer forests is primarily driven by features near the canopy surface [54]. Top rugosity v1 and v2 both emerged as strong negative NPP predictors in evergreen forests, indicating an inverse relationship between canopy height variability and plot-scale NPP. Among the two, TopRug_v2 consistently ranked higher than TopRug_v1 across multiple sets, including in the best model. This suggests that the spatial dependence captured by the transect-wise variant (v2) offers greater predictive power than the overall variation (v1). Within the evergreen class, both variants of top rugosity exhibited correlations > 0.8 up to the 7 m derivation scale, suggesting stable predictive power up to this grid resolution. TopRug_MoranI derived at the 7 m scale emerged as the only strong positive NPP predictor in evergreen forests, reinforcing the inverse relationship between canopy height variability and plot-scale NPP, as higher Moran’s I values correspond to lower horizontal variability. The negative relationship between CSC and NPP in evergreen conifer forests, evidenced by all the metrics in the best evergreen model, aligns with Hickey et al. [55], who found that younger, denser pine stands utilized light more efficiently for biomass production than older, structurally complex forests. This pattern is exemplified by the NPP differences (Table S1) between WREF—an old-growth forest with many stands over 400 years old [56]—and TALL, which consists primarily of younger, regenerated stands. WREF plots exhibited substantially higher top rugosity (mean TopRug_v2_1m = 9.4) compared to TALL and RMNP (mean TopRug_v2_1m = 5.3 and 4.6, respectively), supporting Kane et al.’s [41] finding that conifer stand surface roughness increases with age. 
This age-related rise in structural complexity appears inversely related to conifer forest production, as demonstrated by our results and those of Hickey et al. [55]. However, the contrasting relationship between NPP and top rugosity across the evergreen sites may also be driven by differences in species composition. WREF is dominated by late-successional conifers with conical crowns, resulting in high surface rugosity, whereas the more species-rich stands in TALL (see Table 1) likely produce smoother canopy surfaces by more effectively filling available space with diverse crown architectures. This suggests that TALL’s species composition enables more efficient resource utilization through greater volume occupation, ultimately leading to higher production.
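
The interpretation of TopRug_MoranI can be illustrated with a small sketch of Moran's I on a gridded height raster. The spatial weights are an assumption made here for illustration (binary rook contiguity, meaning cells sharing an edge), since the weighting scheme is not restated in this section, and the grids are hypothetical.

```python
def morans_i(grid):
    """Moran's I for a 2D grid using binary rook-contiguity weights.

    Assumption: neighbours are the cells sharing an edge, each with weight 1.
    The grid must not be constant, otherwise the denominator is zero.
    """
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    num = 0.0
    weight_sum = 0
    for i in range(rows):
        for j in range(cols):
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols:
                    num += (grid[i][j] - mean) * (grid[ni][nj] - mean)
                    weight_sum += 1
    return (n / weight_sum) * (num / denom)


# Hypothetical maximum-height grids (m): heights clustered in two blocks
# versus heights alternating between every pair of neighbouring cells.
clustered = [[20.0, 20.0, 30.0, 30.0] for _ in range(4)]
alternating = [[20.0 if (i + j) % 2 == 0 else 30.0 for j in range(4)]
               for i in range(4)]

print(morans_i(clustered), morans_i(alternating))
```

Both grids contain the same set of height values, but the clustered grid yields a positive Moran's I and the alternating grid a negative one, so the statistic reflects how heights are arranged among neighbouring cells rather than their overall spread.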

4.3. Explaining the Differences in CSC–NPP Relationships Between Deciduous and Evergreen Forests

The poor predictive performance for combined forest types can be attributed to fundamental differences in crown structure and leaf phenology between conifers and deciduous broadleaf trees, which reflect different strategies for light interception and, consequently, influence productivity [57]. The physiognomy of deciduous broadleaf trees, with their wide leaves and diverse crown architectures, contrasts sharply with that of needle-leaved, pyramidal-shaped conifers, leading to markedly different canopy structures and, potentially, different structural indicators that drive NPP. Consequently, contrasting forest types may lack shared structural attributes that consistently relate to primary production in both. This finding highlights the need for stratifying forests based on forest type when using 3D structural metrics for production estimation.

The differences in CSC-NPP relationships observed between the two forest types can be attributed to differing growth strategies and varying species diversity. Evergreen conifer forests, which exhibit a more homogeneous canopy structure with few species occupying the overstory [58], follow a quantity-based growth strategy in which leaf density and concentration at the top canopy surface maximizes light interception, possibly at the cost of optimal light use [55]. This likely explains the negative CSC–NPP relationships observed in the evergreen stands, where plots with greater structural homogeneity were more productive. In contrast, deciduous broadleaf forests, typically more diverse in tree species, follow an optimization-based growth strategy in which a structurally varied leaf distribution appears to drive light use efficiency, and thus NPP [7]. This is congruent with the positive CSC–NPP relationships observed in these plots.

4.4. Implications, Limitations, and Future Directions

Our findings have promising implications for temperate forest management and ecosystem modeling. The CSC metrics identified as strong NPP predictors can be leveraged to locate and monitor highly productive temperate forest stands via comprehensive ALS surveys, guiding land-use policies and management strategies that enhance carbon sequestration rates. Given the high cross-validation accuracy of the best deciduous model, we recommend that forest managers use its CSC metrics to assess production in temperate deciduous stands.

The strong predictive performance of the forest type-specific models across a range of ecoclimatic conditions underscores their biome-wide applicability, marking a notable advancement over previous LiDAR-based efforts to predict stand-scale production, which were largely limited to single-site studies [6,44,59]. The robust CSC–NPP relationships observed at the plot scale offer new opportunities to analyze structure–production interactions across fine-scale climatic, edaphic, and biotic gradients, providing valuable insights for localized forest management.

Although we observed strong CSC-NPP relationships in undisturbed temperate forests, these relationships may vary with factors such as disturbance history or forest type. Disturbance events such as logging and insect-induced tree mortality have been shown to significantly alter forest structural complexity [44,60] and productivity [61,62], potentially disrupting the observed links between CSC metrics and NPP. Further research is required to determine how the type, severity, and timing of disturbance events influence CSC-NPP relationships over varying temporal scales. Additionally, the applicability of these relationships to other biomes and forest types remains uncertain, as evidenced by the divergent CSC-NPP patterns observed between deciduous and evergreen plots in this study. Despite these limitations, our results demonstrate that, in undisturbed temperate forests, CSC metrics derived from high-density ALS point clouds offer a reliable means of estimating stand-scale production without relying on auxiliary climatic or biotic variables, which often introduce additional uncertainty. For example, Holopainen et al. [21] found that a 5-year error in stand age can reduce the accuracy of site index models by 5–15%. Our methodological framework provides a foundation for extending this research beyond temperate forests. Future studies should investigate whether ALS-derived 3D CSC metrics, including those developed in this study, can serve as robust proxies for forest production in tropical, boreal, and other non-temperate biomes.

5. Conclusions

Based on our results, we draw the following conclusions:

(1): Plot-level NPP in both deciduous and evergreen forests can be reliably estimated using a linear combination of three ALS-derived CSC metrics when modeled separately by forest type.
(2): The best-performing NPP model for deciduous plots outperformed that for evergreen plots, indicating stronger biome-wide CSC-NPP relationships in deciduous forests.
(3): ALS-derived 3D CSC metrics did not yield robust NPP estimation models when the two forest types were combined, suggesting that the structural attributes influencing NPP differ between deciduous and evergreen forests.
(4): The accuracy of NPP predictions was sensitive to the spatial resolution at which some CSC metrics were derived, highlighting the importance of scale when linking canopy structure to primary production.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs17162817/s1, Figure S1: Spatial layout of observational sampling at NEON terrestrial sites [63]; Figure S2: Mapped locations of the seven NEON temperate forest sites selected for this study [64]; Figure S3: Correlation matrix of 3D CSC metrics derived at 1 m resolution; Figure S4: Loadings plot of the top candidate model (deciduous class); Figure S5: Loadings plot of the top candidate model (evergreen class); Figure S6: Top row: Canopy height models (CHMs) of the four plots with the highest EntVar_v2_8m values among those with below-mean NPP in the deciduous dataset. Bottom row: CHMs of the four plots with the highest NPP values among those with below-mean EntVar_v2_8m in the deciduous dataset; Table S1: Summary statistics of field-derived NPP values, in megagrams per hectare per year, and the dominant canopy species [31] at each site.

Author Contributions

Conceptualization, T.S.; methodology, T.S.; software, T.S.; validation, T.S. and B.A.; formal analysis, T.S. and B.A.; investigation, T.S. and B.A.; resources, T.S., B.A., C.G., K.K. and J.v.A.; data curation, T.S.; writing—original draft preparation, T.S.; writing—review and editing, B.A., C.G., J.v.A. and K.K.; visualization, T.S.; supervision, J.v.A.; project administration, K.K., J.v.A. and C.G.; funding acquisition, K.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Aeronautics and Space Administration (NASA) through the Decadal Survey Incubation (DSI) program, grant number 80NSSC22K1097.

Data Availability Statement

The code for generating ALS-derived 3D CSC metrics and modelling NPP using these metrics is available at https://github.com/tahriribraq/AOP-CSC-NPP (accessed on 24 June 2025). The preprocessed plot point clouds used in the study are also provided in this repository.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Pretzsch, H. Forest dynamics, growth, and yield: A review, analysis of the present state, and perspective. In Forest Dynamics, Growth and Yield: From Measurement to Model; Springer: Berlin/Heidelberg, Germany, 2008; pp. 1–39.
2. Buotte, P.C.; Law, B.E.; Ripple, W.J.; Berner, L.T. Carbon sequestration and biodiversity co-benefits of preserving forests in the western United States. Ecol. Appl. 2020, 30, e02039.
3. Forest Owner Carbon and Climate Education (FOCCE). Carbon Accounting in Forest Management. 2023. Available online: https://extension.psu.edu/carbon-accounting-in-forest-management (accessed on 3 July 2024).
4. Coops, N.C. Characterizing Forest Growth and Productivity Using Remotely Sensed Data. Curr. For. Rep. 2015, 1, 195–205.
5. Gough, C.M.; Atkins, J.W.; Fahey, R.T.; Hardiman, B.S. High Rates of Primary Production in Structurally Complex Forests. Ecology 2019, 100, e02864.
6. Hardiman, B.S.; Gough, C.M.; Halperin, A.; Hofmeister, K.L.; Nave, L.E.; Bohrer, G.; Curtis, P.S. Maintaining high rates of carbon storage in old forests: A mechanism linking canopy structure to forest function. For. Ecol. Manag. 2013, 298, 111–119.
7. Hardiman, B.S.; Bohrer, G.; Gough, C.M.; Vogel, C.S.; Curtis, P.S. The Role of Canopy Structural Complexity in Wood Net Primary Production of a Maturing Northern Deciduous Forest. Ecology 2011, 92, 1818–1827.
8. Atkins, J.W.; Fahey, R.T.; Hardiman, B.S.; Gough, C.M. Forest canopy structural complexity and light absorption relationships at the subcontinental scale. J. Geophys. Res. Biogeosci. 2018, 123, 1387–1405.
9. Ahl, D.E.; Gower, S.T.; Mackay, D.S.; Burrows, S.N.; Norman, J.M.; Diak, G.R. Heterogeneity of light use efficiency in a northern Wisconsin forest: Implications for modeling net primary production with remote sensing. Remote Sens. Environ. 2004, 93, 168–178.
10. Ishii, H.T.; Tanabe, S.-I.; Hiura, T. Exploring the relationships among canopy structure, stand productivity, and biodiversity of temperate forest ecosystems. For. Sci. 2004, 50, 342–355.
11. Duursma, R.A.; Makela, A. Summary models for light interception and light-use efficiency of non-homogeneous canopies. Tree Physiol. 2007, 27, 859–870.
12. Niinemets, Ü. Photosynthesis and resource distribution through plant canopies. Plant Cell Environ. 2007, 30, 1052–1071.
13. Walcroft, A.S.; Brown, K.J.; Schuster, W.S.; Tissue, D.T.; Turnbull, M.H.; Griffin, K.L.; Whitehead, D. Radiative transfer and carbon assimilation in relation to canopy architecture, foliage area distribution and clumping in a mature temperate rainforest canopy in New Zealand. Agric. For. Meteorol. 2005, 135, 326–339.
14. Niinemets, Ü. Optimization of foliage photosynthetic capacity in tree canopies: Towards identifying missing constraints. Tree Physiol. 2012, 32, 505–509.
15. Fahey, R.T.; Atkins, J.W.; Gough, C.M.; Hardiman, B.S.; Nave, L.E.; Tallant, J.M.; Nadelhoffer, K.J.; Vogel, C.; Scheuermann, C.M.; Stuart-Haëntjens, E.; et al. Defining a spectrum of integrative trait-based vegetation canopy structural types. Ecol. Lett. 2019, 22, 2049–2059.
16. Jarron, L.R.; Coops, N.C.; MacKenzie, W.H.; Tompalski, P.; Dykstra, P. Detection of sub-canopy forest structure using airborne LiDAR. Remote Sens. Environ. 2020, 244, 111770.
17. LaRue, E.A.; Hardiman, B.S.; Elliott, J.M.; Fei, S. Structural diversity as a predictor of ecosystem function. Environ. Res. Lett. 2019, 14, 114011.
18. Liu, X.; Feng, Y.; Hu, T.; Luo, Y.; Zhao, X.; Wu, J.; Maeda, E.E.; Ju, W.; Liu, L.; Guo, Q.; et al. Enhancing ecosystem productivity and stability with increasing canopy structural complexity in global forests. Sci. Adv. 2024, 10, eadl1947.
19. Hayek, M.N.; Wehr, R.; Longo, M.; Hutyra, L.R.; Wiedemann, K.; Munger, J.W.; Bonal, D.; Saleska, S.R.; Fitzjarrald, D.R.; Wofsy, S.C. A novel correction for biases in forest eddy covariance carbon balance. Agric. For. Meteorol. 2018, 250–251, 90–101.
20. Skovsgaard, J.P.; Vanclay, J.K. Forest site productivity: A review of the evolution of dendrometric concepts for even-aged stands. Forestry 2008, 81, 13–31.
21. Holopainen, M.; Vastaranta, M.; Haapanen, R.; Yu, X.; Hyyppä, J.; Kaartinen, H.; Viitala, R.; Hyyppä, H. Site-type estimation using airborne laser scanning and stand register data. Photogramm. J. Finl. 2010, 22, 16–32.
22. Maselli, F.; Mari, R.; Chiesi, M. Use of lidar data to simulate forest net primary production. Int. J. Remote Sens. 2012, 34, 2487–2501.
23. Tompalski, P.; Coops, N.C.; White, J.C.; Wulder, M.A.; Pickell, P.D. Estimating Forest Site Productivity Using Airborne Laser Scanning Data and Landsat Time Series. Can. J. Remote Sens. 2015, 41, 232–245.
24. Kampe, T.U.; Johnson, B.R.; Kuester, M.; Keller, M. NEON: The first continental-scale ecological observatory with airborne remote sensing of vegetation canopy biochemistry and structure. J. Appl. Remote Sens. 2010, 4, 043510.
25. Riegl LMS-Q780 Data Sheet, Riegl. 2015. Available online: https://www.rieglusa.com/wp-content/uploads/lms-q780-datasheet.pdf (accessed on 30 October 2024).
26. Fotis, A.T.; Morin, T.H.; Fahey, R.T.; Hardiman, B.S.; Bohrer, G.; Curtis, P.S. Forest structure in space and time: Biotic and abiotic determinants of canopy complexity and their effects on net primary productivity. Agric. For. Meteorol. 2018, 250, 181–191.
27. Gough, C.M.; Atkins, J.W.; Fahey, R.T.; Curtis, P.S.; Bohrer, G.; Hardiman, B.S.; Hickey, L.J.; Nave, L.E.; Niedermaier, K.M.; Clay, C.; et al. Disturbance has variable effects on the structural complexity of a temperate forest landscape. Ecol. Indic. 2022, 140, 109004.
28. NEON (National Ecological Observatory Network). Site Management and Event Reporting (DP1.10111.001), RELEASE-2024. Available online: https://data.neonscience.org/data-products/DP1.10111.001/RELEASE-2024 (accessed on 15 July 2024).
29. Correcting Land Cover Maps for NEON Field Sites. Available online: https://www.neonscience.org/impact/observatory-blog/correcting-land-cover-maps-neon-field-sites (accessed on 2 May 2025).
30. Swanson, R. TOS Protocol and Procedure: PLT–Plot Establishment and Maintenance; NEON.DOC.001025; NEON (National Ecological Observatory Network): Boulder, CO, USA, 2023.
31. NEON. Vegetation Structure (DP1.10098.001); RELEASE-2024; National Ecological Observatory Network (NEON): Boulder, CO, USA, 2024.
32. Chojnacky, D.C.; Heath, L.S.; Jenkins, J.C. Updated generalized biomass equations for North American tree species. Forestry 2014, 87, 129–151.
33. Musinsky, J.; Goulden, T.; Wirth, G.; Leisso, N.; Krause, K.; Haynes, M.; Chapman, C. Spanning scales: The airborne spatial and temporal sampling design of the National Ecological Observatory Network. Methods Ecol. Evol. 2022, 13, 1866–1884.
34. NEON (National Ecological Observatory Network). Discrete Return LiDAR Point Cloud; DP1.30003.001; RELEASE-2024; NEON (National Ecological Observatory Network): Boulder, CO, USA, 2024; Available online: https://data.neonscience.org/data-products/DP1.30003.001/RELEASE-2024 (accessed on 18 July 2024).
35. LAStools. Efficient LiDAR Processing Software, version 230901; Academic; rapidlasso GmbH: Gilching, Germany, 2023; Available online: http://rapidlasso.com/LAStools (accessed on 30 October 2024).
36. All_NEON_TOS_Plots_V10. Available online: https://data.neonscience.org/documents/-/document_library_display/kV4WWrbEEM2s/view_file/3411434 (accessed on 28 April 2025).
37. Roussel, J.; Auty, D.; Coops, N.C.; Tompalski, P.; Goodbody, T.R.; Meador, A.S.; Bourdon, J.; de Boissieu, F.; Achim, A. lidR: An R package for analysis of Airborne Laser Scanning (ALS) data. Remote Sens. Environ. 2020, 251, 112061.
38. Roussel, J.; Auty, D. Airborne LiDAR Data Manipulation and Visualization for Forestry Applications; R Package Version 4.1.2. 2024. Available online: https://cran.r-project.org/package=lidR (accessed on 24 October 2024).
39. Moran, P.A.P. Notes on Continuous Stochastic Phenomena. Biometrika 1950, 37, 17–23.
40. Atkins, J.W.; Bohrer, G.; Fahey, R.T.; Hardiman, B.S.; Morin, T.H.; Stovall, A.E.L.; Gough, C.M. Quantifying vegetation and canopy structural complexity from TLS data using the forestr R package. Methods Ecol. Evol. 2018, 9, 2057–2066.
41. Kane, V.R.; Bakker, J.D.; McGaughey, R.J.; Lutz, J.A.; Gersonde, R.F.; Franklin, J.F. Examining conifer canopy structural complexity across forest ages and elevations with LiDAR data. Can. J. For. Res. 2010, 40, 774–787.
42. Simonson, W.D.; Allen, H.D.; Coomes, D.A. Applications of airborne lidar for the assessment of animal species diversity. Methods Ecol. Evol. 2014, 5, 719–729.
43. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2023; Available online: https://www.R-project.org/ (accessed on 24 October 2024).
44. Alveshere, B.C.; Siddiqui, T.; Krause, K.; van Aardt, J.A.; Gough, C.M. Hemlock woolly adelgid infestation influences canopy structural complexity and its relationship with primary production in a temperate mixed forest. For. Ecol. Manag. 2025, 586, 122698.
45. LaRue, E.A.; Rezendes, K.M.; Choi, D.H.; Wang, J.; Downing, A.G.; Fei, S.; Hardiman, B.S. Gradient surface metrics of ecosystem structural diversity and their relationship with productivity across macrosystems. Ecosphere 2025, 16, e70172.
46. Ratner, B. The correlation coefficient: Its values range between +1/−1, or do they? J. Target. Meas. Anal. Mark. 2009, 17, 139–142.
47. Wold, H. Soft modeling: The basic design and some extensions. Syst. Under Indirect. Obs. Part II 1982, 2, 36–37.
48. Wold, S.; Sjöström, M.; Eriksson, L. PLS-regression: A basic tool of chemometrics. Chemom. Intell. Lab. Syst. 2001, 58, 109–130.
49. Abdi, H. Partial least squares regression and projection on latent structure regression (PLS Regression). Wiley Interdiscip. Rev. Comput. Stat. 2010, 2, 97–106.
50. Kelloway, E.K. Using Mplus for Structural Equation Modeling: A Researcher’s Guide; Sage Publications: Thousand Oaks, CA, USA, 2014.
51. Hurvich, C.M.; Tsai, C.L. Regression and time series model selection in small samples. Biometrika 1989, 76, 297–307.
52. Concato, J.; Hartigan, J.A. P values: From suggestion to superstition. J. Investig. Med. 2016, 64, 1166–1171.
53. Morin, X.; Toigo, M.; Fahse, L.; Guillemot, J.; Cailleret, M.; Bertrand, R.; Cateau, E.; de Coligny, F.; García-Valdés, R.; Ratcliffe, S.; et al. More species, more trees: The role of tree packing in promoting forest productivity. J. Ecol. 2025, 113, 371–386.
54. de Conto, T.; Armston, J.; Dubayah, R. Characterizing the structural complexity of the Earth’s forests with spaceborne lidar. Nat. Commun. 2024, 15, 8116.
55. Hickey, L.J.; Atkins, J.; Fahey, R.T.; Kreider, M.R.; Wales, S.B.; Gough, C.M. Contrasting Development of Canopy Structure and Primary Production in Planted and Naturally Regenerated Red Pine Forests. Forests 2019, 10, 566.
56. U.S. Forest Service. Wind River Experimental Forest. 2025. Available online: https://research.fs.usda.gov/pnw/forestsandranges/locations/windriver (accessed on 21 February 2025).
57. Ishii, H.; Asano, S. The role of crown architecture, leaf phenology and photosynthetic activity in promoting complementary use of light among coexisting species in temperate forests. Ecol. Res. 2010, 25, 715–722.
58. Dormann, C.F.; Bagnara, M.; Boch, S.; Hinderling, J.; Janeiro-Otero, A.; Schäfer, D.; Schall, P.; Hartig, F. Plant species richness increases with light availability, but not variability, in temperate forests understorey. BMC Ecol. 2020, 20, 43.
59. Tompalski, P.; Coops, N.C.; White, J.C.; Goodbody, T.R.; Hennigar, C.R.; Wulder, M.A.; Socha, J.; Woods, M.E. Estimating Changes in Forest Attributes and Enhancing Growth Projections: A Review of Existing Approaches and Future Directions Using Airborne 3D Point Cloud Data. Curr. For. Rep. 2021, 7, 1–24.
60. Yi, X.; Wang, N.; Ren, H.; Yu, J.; Hu, T.; Su, Y.; Mi, X.; Guo, Q.; Ma, K. From canopy complementarity to asymmetric competition: The negative relationship between structural diversity and productivity during succession. J. Ecol. 2021, 110, 457–465.
61. Hicke, J.A.; Allen, C.D.; Desai, A.R.; Dietze, M.C.; Hall, R.J.; Hogg, E.H.; Kashian, D.M.; Moore, D.J.; Raffa, K.F.; Sturrock, R.N.; et al. Effects of biotic disturbances on forest carbon cycling in the United States and Canada. Glob. Change Biol. 2012, 18, 7–34.
62. Juchheim, J.; Ammer, C.; Schall, P.; Seidel, D. Canopy space filling rather than conventional measures of structural diversity explains productivity of beech stands. For. Ecol. Manag. 2017, 395, 19–26.
63. Thorpe, A.S.; Barnett, D.T.; Elmendorf, S.C.; Hinckley, E.L.S.; Hoekman, D.; Jones, K.D.; LeVan, K.E.; Meier, C.L.; Stanish, L.F.; Thibault, K.M. Introduction to the sampling designs of the National Ecological Observatory Network Terrestrial Observation System. Ecosphere 2016, 7, e01627.
64. NEON (National Ecological Observatory Network). Field Sites Map. Available online: https://neon.maps.arcgis.com/home/webmap/viewer.html?webmap=6396acd10aab4f0b83299911053dccfc (accessed on 31 October 2024).

Figure 1. Filtered point cloud of a 40 × 40 m plot showing vegetation and ground hits. Points are colored according to height using a blue-green-red gradient. A column from the plot point cloud gridded at 1 m resolution is illustrated on the right, with arrows indicating the column (raster) values used to derive all 3D CSC metrics. The portions of the canopy described by each column value are color-coded as follows: top/surface—black, upper—red, mid/lower—blue, entire—orange.

Figure 2. Metric values were derived at grid resolutions ranging from 1 to 10 m, as illustrated above. The derived raster values are colored using a blue-green-red gradient. The flowchart at the bottom illustrates the successive replacement of scale-sensitive (ss) metrics at resolutions where the correlation fell below R = 0.7 relative to 1 m (breakpoint). The first set consists of all metrics derived at 1 m resolution. In the second set, ss metrics in set 1 were replaced with their counterparts derived at the resolutions in which the second breakpoints occur. This was repeated for all breakpoints, yielding nine sets of 3D CSC metrics across the range of grid resolution.

Figure 3. Modelling approach combining partial least squares regression and cross-validation (PLS-CV) and recursive feature elimination (RFE) to select candidate models.

Figure 4. Progression from top candidate model to best model for (a) deciduous plots and (b) evergreen plots.

Figure 5. Predicted versus field-derived NPP for (a) deciduous and (b) evergreen plots using the respective best-performing models. Samples are color-coded by site.

Table 1. Summary of data collection years, ecoclimatic conditions, forest types, and number of qualifying tower plots at each NEON forest site. The site acronyms stand for: BART—Bartlett Experimental Forest, GRSM—Great Smoky Mountains National Park, ORNL—Oak Ridge National Laboratory, RMNP—Rocky Mountain National Park, TALL—Talladega National Forest, UNDE—University of Notre Dame Environmental Research Center, WREF—Wind River Experimental Forest.

Site	NEON Ecoclimatic Domain	AOP Data Collection Year	NPP Measurement Period	Forest Types (Plot-Level NLCD Classes)	Mean Elevation (m)	Mean Annual Temperature (°C)	Mean Annual Precipitation (mm)	Number of Qualifying Plots
BART	Northeast	2022	2018–2022	Deciduous, Evergreen, Mixed	274	6.2	1325	5 (3 deciduous, 2 mixed)
GRSM	Appalachians and Cumberland Plateau	2018	2017–2019	Deciduous, Evergreen	575	13.1	1375	11 (10 deciduous, 1 mixed)
ORNL	Appalachians and Cumberland Plateau	2018	2018–2020	Deciduous, Evergreen	344	14.4	1340	5 (all deciduous)
RMNP	Southern Rockies and Colorado Plateau	2020	2019–2022	Evergreen	2742	2.9	731	5 (all evergreen)
TALL	The Ozarks Complex	2021	2018–2021	Deciduous, Evergreen, Mixed	166	17.2	1383	13 (7 deciduous, 6 mixed)
UNDE	The Great Lakes	2020	2018–2022	Deciduous, Mixed	521	4.3	802	13 (7 deciduous, 6 mixed)
WREF	The Pacific Northwest	2019	2019–2022	Evergreen	351	9.2	2225	5 (all evergreen)

Table 2. 3D CSC metrics derived from high-density discrete return ALS point cloud data. Metric acronyms are presented parenthetically under the first column. The lidR functions used to generate rasters for each metric are provided in italics under “computational derivation.” For each raster-based metric, three variants (see footnote) were calculated to capture different aspects of horizontal variability, resulting in a total of 25 metrics.

CSC Metric	Description	Portion of Canopy	Computational Derivation
Rumple	Area of canopy surface divided by the projected ground surface.	Outer surface	The surface points (highest hit) of DTM-normalized ¹ plot point clouds were filtered at 1–10 m resolution ². The area of the triangulated surface created by the surface points was then divided by the area of the plot DTM using the rumple_index function.
Top Rugosity v1 (TopRug_v1)	Overall horizontal variation in maximum height.	Outer surface	The highest hits were gridded at 1–10 m resolution using the zmax function. V1, V2, and Moran’s I ³ [39] were computed from the grid (see footnote).
Top Rugosity v2 (TopRug_v2)	Transect-wise horizontal variation in maximum height.
Top Rugosity Moran’s I (TopRug_MoranI)	Spatial autocorrelation of maximum height.
Upper Rugosity v1 (UpperRug_v1)	Overall horizontal variation in 75th percentile height.	Upper	The 75th percentile heights of column hits were gridded at 1–10 m resolution using the zq75 function. V1, V2, and Moran’s I were computed from the grid (see footnote).
Upper Rugosity v2 (UpperRug_v2)	Transect-wise horizontal variation in 75th percentile height.
Upper Rugosity Moran’s I (UpperRug_MoranI)	Spatial autocorrelation of 75th percentile height.
Mean Rugosity v1 (MeanRug_v1)	Overall horizontal variation in mean height.	Middle	The mean heights of column hits were gridded at 1–10 m resolution using the mean(Z) function. V1, V2, and Moran’s I were computed from the grid (see footnote).
Mean Rugosity v2 (MeanRug_v2)	Transect-wise horizontal variation in mean height.
Mean Rugosity Moran’s I (MeanRug_MoranI)	Spatial autocorrelation of mean height.
Lower Rugosity v1 (LowerRug_v1)	Overall horizontal variation in 25th percentile height.	Lower	The 25th percentile heights of column hits were gridded at 1–10 m resolution using the zq25 function. V1, V2, and Moran’s I were computed from the grid (see footnote).
Lower Rugosity v2 (LowerRug_v2)	Transect-wise horizontal variation in 25th percentile height.
Lower Rugosity Moran’s I (LowerRug_MoranI)	Spatial autocorrelation of 25th percentile height.
Canopy Rugosity v1 (CanRug_v1)	Overall horizontal variation in vertical variation of density-adjusted mean vegetation height.	Entire	1. The ground points were filtered out from the DTM-normalized plot point clouds. 2. The plot point clouds were converted into n × n × 1 m voxels, where n ranged from 1 to 10 m, and the number of hits in each voxel was tallied using the voxel_metrics function. 3. For each n × n m column, the number of hits in each voxel (z-bin) was normalized by the total number of hits in the column to obtain the vegetation area index (VAI) in each z-bin. 4. The standard deviation of density-adjusted mean leaf height (StdBin) was computed for each column by applying the same equations as the PCL derivation in the ForestR package [40]. 5. V1, V2, and Moran’s I were computed from the grid (see footnote).
Canopy Rugosity v2 (CanRug_v2)	Transect-wise horizontal variation in vertical variation of density-adjusted mean vegetation height.
Canopy Rugosity Moran’s I (CanRug_MoranI)	Spatial autocorrelation of vertical variation of density-adjusted mean vegetation height.
Canopy Heterogeneity v1 (CanHet_v1)	Overall horizontal variation in standard deviation of column vegetation height.	Entire	The standard deviations of column hits were gridded at 1–10 m resolution using the zsd function. V1, V2, and Moran’s I were computed from the grid (see footnote).
Canopy Heterogeneity v2 (CanHet_v2)	Transect-wise horizontal variation in standard deviation of column vegetation height.
Canopy Heterogeneity Moran’s I (CanHet_MoranI)	Spatial autocorrelation of standard deviation of column vegetation height.
Entropy variability v1 (EntVar_v1)	Overall horizontal variation in entropy of column height distribution.	Entire	1. The ground points were filtered out from the plot point clouds. 2. The entropy of column height distribution was gridded at 1–10 m resolution using the zentropy function. 3. V1, V2, and Moran’s I were computed from the grid (see footnote).
Entropy variability v2 (EntVar_v2)	Transect-wise horizontal variation in entropy of column height distribution.
Entropy variability Moran’s I (EntVar_MoranI)	Spatial autocorrelation in entropy of column height distribution.
Percent Hits Above Mean Height variability v1 (Pz_abovemean_v1)	Overall horizontal variation in percentage of hits above mean column vegetation height.	Upper	1. The ground points were filtered out from the plot point clouds. 2. The percentage of hits above mean column height was gridded at 1–10 m resolution using the pzabovemean function. 3. V1, V2, and Moran’s I were computed from the grid (see footnote).
Percent Hits Above Mean Height variability v2 (Pz_abovemean_v2)	Transect-wise horizontal variation in percentage of hits above mean column vegetation height.
Percent Hits Above Mean Height Moran’s I (Pz_abovemean_MoranI)	Spatial autocorrelation in percentage of hits above mean column vegetation height.

V1: Standard deviation (SD) of all grid values.
V2: The mean of the SDs along the x axis (SD_x) and the mean of the SDs along the y axis (SD_y) of the grid were computed, and the two means were then averaged: V2 = [mean(SD_x) + mean(SD_y)]/2.
Moran’s I: Moran’s I of the grid (−1 = fully dispersed, 1 = fully contiguous/clustered).
¹ DTM-normalization was performed by subtracting the 1 m grid resolution plot DTM from the plot point cloud.
² Grids were derived at resolutions ranging from 1 to 10 m in 1 m increments for each metric, resulting in 10 derivations per metric.
³ Moran’s I was computed using a queen-based spatial weights matrix.
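The three grid summaries defined in the footnote can be illustrated with a short sketch. This is not the authors' lidR workflow; it is a plain-Python illustration under stated assumptions: the grid is a small rectangular list of lists with no missing cells, SD is the sample standard deviation, and the queen-based weights are binary (1 for each of the up to eight touching cells, 0 otherwise) rather than row-standardized. The function names and the example grid are hypothetical.

```python
from statistics import mean, stdev


def v1(grid):
    """V1: SD of all grid values (sample SD is an assumption)."""
    return stdev([value for row in grid for value in row])


def v2(grid):
    """V2 = [mean(SD_x) + mean(SD_y)] / 2.

    SD_x is taken here as the SD within each row and SD_y as the SD within
    each column. Because V2 averages the two means, swapping that reading
    of "x" and "y" does not change the result.
    """
    columns = list(zip(*grid))
    mean_sd_x = mean(stdev(row) for row in grid)
    mean_sd_y = mean(stdev(col) for col in columns)
    return (mean_sd_x + mean_sd_y) / 2


def morans_i_queen(grid):
    """Moran's I with binary queen contiguity (assumed weighting scheme)."""
    n_rows, n_cols = len(grid), len(grid[0])
    values = [value for row in grid for value in row]
    grand_mean = mean(values)
    denominator = sum((value - grand_mean) ** 2 for value in values)
    if denominator == 0:
        raise ValueError("Moran's I is undefined for a constant grid")

    cross_products = 0.0  # sum over i, j of w_ij * (x_i - mean) * (x_j - mean)
    weight_sum = 0        # sum of all w_ij
    for r in range(n_rows):
        for c in range(n_cols):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # a cell is not its own neighbor
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        cross_products += (grid[r][c] - grand_mean) * (
                            grid[rr][cc] - grand_mean
                        )
                        weight_sum += 1
    return (len(values) / weight_sum) * (cross_products / denominator)


if __name__ == "__main__":
    # Hypothetical 3 x 3 grid of maximum heights (m) at some grid resolution.
    heights = [
        [18.2, 19.0, 21.5],
        [17.8, 20.1, 22.3],
        [12.4, 15.6, 21.9],
    ]
    print("V1:", v1(heights))
    print("V2:", v2(heights))
    print("Moran's I:", morans_i_queen(heights))
```

In the study, each summary was computed at every grid resolution from 1 to 10 m, so a function like these would be applied once per metric raster and resolution.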

Table 3. Top three candidate models for NPP estimation in deciduous plots. Predictors are listed in descending order of importance (absolute coefficients), with up to six predictors shown. The resolution of each selected metric is appended to its acronym with an underscore.

Model Ranking	No. of Metrics; No. of PLS Components	Calibration R²; CV R²	Top Predictors (Up to Six) Prefixed by Their Standardized Coefficients	AIC_C
1	4; 3	0.76; 0.71	1.54 CanRug_v1_1m 0.85 Rumple_10m −0.81 EntVar_v2_8m −0.61 EntVar_v1_8m	18.9
2	8; 5	0.80; 0.74	1.44 CanRug_v1_1m 1.34 CanRug_v2_1m −0.97 CanHet_v2_9m −0.72 EntVar_v1_10m −0.58 EntVar_v2_10m 0.48 Rumple_10m	29.1
3	6; 2	0.70; 0.61	0.79 CanRug_v2_1m 0.79 CanRug_v1_1m −0.58 EntVar_v1_10m −0.55 EntVar_v2_10m 0.49 Rumple_10m 0.47 CanHet_MoranI_9m	31.5

Table 4. Best models for deciduous and evergreen classes. Metrics in the equations are listed in order of importance.

Forest Type	Calibration R²; CV R²	No. of PLS Components; RMSE (Mg ha⁻¹ year⁻¹); RMSEr	Model Equation (NPP in Mg ha⁻¹ year⁻¹)	AIC_C; p-Value
Deciduous	0.77; 0.71	2; 1.18; 11%	NPP = 4.69 + 1.56 CanRug_v1_1m − 1.42 EntVar_v2_8m + 0.83 Rumple_10m	15.5; <0.01
Evergreen	0.76; 0.54	1; 0.85; 13%	NPP = 3.21 − 1.05 EntVar_v2_5m − 0.80 TopRug_v2_1m + 0.52 TopRug_MoranI_7m	4.13; <0.01
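As a reading aid, the two equations in Table 4 can be evaluated directly. The sketch below only transcribes the intercepts and coefficients from the table. Whether the equations expect raw or standardized metric values is not stated in this section, so the input values used here are hypothetical placeholders and the printed numbers should not be read as real NPP estimates. The names `BEST_MODELS` and `predict_npp` are illustrative.

```python
# Intercepts and coefficients transcribed from Table 4 (NPP in Mg ha^-1 year^-1).
BEST_MODELS = {
    "deciduous": {
        "intercept": 4.69,
        "coefficients": {
            "CanRug_v1_1m": 1.56,
            "EntVar_v2_8m": -1.42,
            "Rumple_10m": 0.83,
        },
    },
    "evergreen": {
        "intercept": 3.21,
        "coefficients": {
            "EntVar_v2_5m": -1.05,
            "TopRug_v2_1m": -0.80,
            "TopRug_MoranI_7m": 0.52,
        },
    },
}


def predict_npp(forest_type, metrics):
    """Evaluate the Table 4 linear equation for one plot.

    `metrics` maps metric names (with resolution suffix) to values on
    whatever scale the model was fitted on; that scale is an assumption
    the caller must resolve from the paper's methods.
    """
    model = BEST_MODELS[forest_type]
    missing = set(model["coefficients"]) - set(metrics)
    if missing:
        raise KeyError(f"missing metrics: {sorted(missing)}")
    return model["intercept"] + sum(
        coefficient * metrics[name]
        for name, coefficient in model["coefficients"].items()
    )


if __name__ == "__main__":
    # Hypothetical metric values for a single deciduous plot.
    plot = {"CanRug_v1_1m": 0.5, "EntVar_v2_8m": -0.2, "Rumple_10m": 1.0}
    print(round(predict_npp("deciduous", plot), 2))
```

The signs make the direction of each effect explicit: in both forest types, higher transect-wise entropy variability (EntVar_v2) lowers the predicted NPP, while CanRug_v1 and Rumple raise it in deciduous plots.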

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

