Article

Modeling Commercial Height in Amazonian Forests: Accuracy of Mixed-Effects Regression Versus Random Forest

by Renato Bezerra da Silva Ribeiro 1,*, Leonardo Pequeno Reis 2, Antonio Pedro Fragoso Woycikievicz 2, Marcello Neiva de Mello 3, Afonso Henrique Moraes Oliveira 1, Carlos Tadeu dos Santos Dias 4 and Lucietta Guerreiro Martorano 5

1 Institute of Biodiversity and Forests, Federal University of Western Pará, Campus Santarém, Rua Vera Paz, S/N—Salé, Santarém 68000-000, Brazil
2 Rural Federal University of the Amazon, Campus Capitão Poço, Rua Pau Amarelo, S/N, Capitão Poço 68650-000, Brazil
3 Rural Federal University of the Amazon, Campus Capanema, Avenida Barão de Capanema, S/N, Capanema 68700-665, Brazil
4 Department of Statistics and Applied Mathematics, Federal University of Ceará, Block 910, Fortaleza 60440-900, Brazil
5 National Institute of Meteorology (INMET), Eixo Monumental, Via S-1, Sudoeste, Brasília 70680-900, Brazil
* Author to whom correspondence should be addressed.
Forests 2026, 17(1), 30; https://doi.org/10.3390/f17010030 (registering DOI)
Submission received: 19 November 2025 / Revised: 15 December 2025 / Accepted: 22 December 2025 / Published: 25 December 2025
(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)

Abstract

Accurate estimation of commercial tree height is essential for volumetric predictions in forest management plans, particularly in Amazonian forests with high species diversity. We assessed two predictive approaches for estimating commercial height, using the sum of actual commercial log lengths as the reference metric. The dataset comprised 1745 harvested trees from Annual Production Unit 8 in the Tapajós National Forest. Three commercial volume groups dominated the structural gradient: 46.1% of the trees fell into Group 1 (<6 m³), 36.7% into Group 2 (6–10 m³), and 17.2% into Group 3 (≥10 m³). The Linear Mixed-Effects Model included diameter at breast height (DBH) as a fixed effect and species as a random effect, whereas the Random Forest model used DBH and species as predictors. The mixed-effects model achieved higher accuracy (r = 0.77; RMSE = 2.95 m), while the Random Forest model performed slightly worse (r = 0.73; RMSE = 3.10 m). Species with greater commercial heights exerted a strong influence on both modelling approaches. Principal Component Analysis revealed structural separation among the three volume groups, driven by DBH, commercial height, number of logs, and log volume. The mixed-effects model provided an effective framework for predicting commercial height in heterogeneous tropical forests.

1. Introduction

The Amazon Rainforest is the largest contiguous tropical forest on the planet. It regulates atmospheric, hydrological and climatic patterns by influencing evapotranspiration and the carbon cycle, while also maintaining essential ecological processes [1,2,3]. These attributes emphasise the global importance of conserving the forest and the need for evidence-based strategies, such as sustainable forest management, to maintain forest structure and enable the responsible use of wood [4,5]. Planned exploitation can promote natural regeneration and reduce the negative impacts of unregulated extraction [6], while reduced-impact management can protect forest remnants and maintain long-term productivity [7]. In this context, advances in Brazilian forestry and measurement have produced biometric tools that are essential for monitoring and evaluating managed forests [8,9].
Of the biometric variables fundamental to forest structure planning and assessment, tree height is a particularly important attribute in forest science [10]. Despite its importance, accurately measuring or estimating tree height remains costly and time-consuming in forest inventories [11,12]. Nevertheless, it is indispensable for calculating volume [13], assessing site quality [13,14] and improving volumetric modelling [15,16]. Additionally, height contributes significantly to the estimation of above-ground biomass and carbon, which are crucial variables in climate change and forest conservation studies [17,18].
Obtaining tree height data in native forests is challenging due to the diversity of understory species and canopy density [10,19]. Heterogeneity in tree growth and the intrinsic characteristics of each species can lead to inaccuracies in field surveys [10]. Even when sophisticated equipment such as hypsometers is available, its efficiency is reduced in native forests, as noted by [14]. In practice, therefore, visual estimations of tree height remain highly imprecise in commercial forest inventories, even with extensive field experience [20]. Refs. [21,22] found that visually estimated commercial height differs by more than 15% from direct measurements of log lengths after felling. Visual estimations of this kind require intensive training, significant field experience and frequent calibration [19,20]. Nevertheless, there is a tendency to underestimate the height of taller trees, which negatively impacts timber volume estimates [14,19,23]. Furthermore, factors such as observer fatigue and mood can also influence the accuracy of height measurements [20,24].
In forest management, various height measurements are used. These include total height, which is the distance from the ground to the top of the tree [25]; dominant height, which is an indicator of stand productivity [20,26] and is calculated using methods such as those of Hart and Assmann [27]; and commercial height, which is the distance from the base to the first branches or the point of greatest timber utility [28]. In forest plantations, the structural homogeneity between diameter at breast height (DBH) and height facilitates accuracy and enables the application of hypsometric equations [14]. Measuring log lengths after felling reduces sampling errors [29,30].
In this context, this study introduces a methodological innovation: using commercial height, determined by cubage (i.e., the actual length of logs measured in the field), as the reference variable to evaluate predictive height models. This approach aims to reduce the uncertainty associated with visual height estimations. However, the relationship between this variable and diameter is more variable than the relationship between total height and diameter. This is mainly due to the definition of measurement points, which should be utilized more extensively in the timber industry and vary according to the characteristics of the exploited species. This implies low accuracy when applying traditional regression methods, so more effective methods must be sought to achieve better results.
Traditional regression approaches, whether linear or nonlinear, assume a homogeneous relationship between diameter and height across the population, as well as independence and homoscedasticity of residuals. These assumptions are rarely met in managed Amazonian forests, where trees are hierarchically structured by species, stem quality and operational units, and where commercial height exhibits high variability due to ecological and silvicultural factors. As a result, conventional regression models often fail to adequately represent the complexity of height–diameter relationships when commercial height is derived from actual log lengths, leading to biased estimates and limited generalization capacity.
In this context, Linear Mixed-Effects Models (LMM) provide a statistically coherent framework for modelling commercial height, as they explicitly account for the hierarchical structure of forest data by incorporating random effects associated with species and other grouping factors. By separating population-level trends from group-specific deviations, LMMs allow for the representation of both interspecific and intraspecific variability, improving parameter estimation and predictive reliability under heterogeneous forest conditions [31,32,33]. Complementarily, machine learning algorithms such as Random Forest offer a flexible, non-parametric alternative for height prediction, as they do not require predefined functional forms and are capable of capturing complex nonlinear relationships and interactions among predictor variables. Random Forest models are particularly robust to collinearity, noise and outliers, which are common in operational forest datasets, and have demonstrated strong predictive performance in forest and agricultural sciences [34,35,36]. In this study, Random Forest is used not only as a predictive tool but also as a benchmark to assess the potential gains in accuracy achievable when relaxing the assumptions inherent to parametric models.
This research focuses on two questions. How accurate are predictive frameworks such as mixed-effects regression and Random Forest when estimating commercial height using actual log length as reference? How much do they improve the accuracy and reliability of forest inventories in the Amazon?
Our objective was to develop and compare predictive models for commercial height using field variables measured in an Annual Production Unit (APU) under real operational conditions.

2. Materials and Methods

2.1. Study Area

The study was conducted in the Tapajós National Forest (TNF; Flona Tapajós), a Sustainable Use Conservation Unit located in western Pará State, Brazil. This protected area spans the municipalities of Santarém, Belterra and Aveiro. Established in 1974, TNF covers approximately 527,000 ha of dense ombrophilous Amazon rainforest, characterized by high biological diversity and substantial relevance for natural resource conservation [37,38]. The forest forms part of Brazil’s National System of Conservation Units and serves as a model for sustainable forest management (SFM) in the Amazon [9].
The study focused on Annual Production Unit 8 (APU 8), located within the Sustainable Forest Management Area at kilometer 67 of the BR-163 highway (Santarém–Cuiabá) (Figure 1). The Tapajós National Forest Mixed Cooperative (Coomflona) holds a non-revenue public forest concession for this area and implements forest management according to the plan approved by the Chico Mendes Institute for Biodiversity Conservation [39].
The regional climate is classified as Am3 under the Köppen system, following the adaptation proposed by [40] and the updated Amazon climate regionalization developed by [41]. In Belterra, mean annual temperature ranges from 25.5 °C to 27 °C, annual rainfall varies between 2000 mm and 2500 mm, and mean relative humidity is about 87% [42].
Vegetation is dominated by dense ombrophilous submontane forest, typical of western Pará [43]. The terrain includes flat to gently undulating surfaces, with elevations between 100 and 200 m. The predominant soils are latosols, argisols and quartzarenic neosols, characterized by medium to clayey textures, low natural fertility and good drainage [39]. The forest contains valuable timber species such as Manilkara huberi, Dinizia excelsa, Mezilaurus itauba, Handroanthus serratifolius and Hymenaea courbaril [44].
Long-term research conducted in Flona Tapajós has contributed significantly to understanding forest dynamics, biodiversity conservation and ecosystem services [9,38,45]. The site has also supported international studies on carbon storage in tropical forests and the socio-environmental sustainability of traditional communities [37,38].

2.2. Database and Analysis

Data were obtained from the forest census inventory and the post-harvest report for APU 8, required under Brazilian forest management regulations. The census inventory provided diameter at breast height (DBH), measured at 1.30 m above ground. The post-harvest report contained detailed information on log length and log volume.
Commercial height was defined as the sum of the lengths of all commercial logs measured after tree felling. Measurements were taken directly in the storage yard using a measuring tape. Diameters at both ends of each log were recorded, and actual log volume was computed using Smalian’s equation:
$$V_{\mathrm{Smalian}} = \left(\frac{g_1 + g_2}{2}\right) L$$
where $V_{\mathrm{Smalian}}$ = Smalian volume, in m³; $g_1$ and $g_2$ = cross-sectional areas of the log ends, in m²; $L$ = log length, in m.
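As a minimal sketch of the two quantities defined above, the following Python functions compute Smalian's volume per log and the commercial height as the sum of log lengths. The function names, the example tree, and the assumption that end diameters are recorded in centimetres are illustrative only and are not taken from the study.

```python
import math


def cross_section_area(diameter_cm):
    """Cross-sectional area (m^2) of a log end, from its diameter in cm (assumed unit)."""
    diameter_m = diameter_cm / 100.0
    return math.pi * diameter_m ** 2 / 4.0


def smalian_volume(d1_cm, d2_cm, length_m):
    """Smalian volume (m^3): mean of the two end areas times the log length."""
    g1 = cross_section_area(d1_cm)
    g2 = cross_section_area(d2_cm)
    return (g1 + g2) / 2.0 * length_m


def commercial_height(logs):
    """Commercial height (m): sum of the lengths of all commercial logs of a tree."""
    return sum(length_m for _, _, length_m in logs)


def tree_volume(logs):
    """Commercial volume of a tree (m^3): sum of the Smalian volumes of its logs."""
    return sum(smalian_volume(d1, d2, length_m) for d1, d2, length_m in logs)


# Hypothetical tree with two logs: (large-end diameter cm, small-end diameter cm, length m)
example_logs = [(80.0, 70.0, 6.0), (70.0, 62.0, 5.5)]
print("Commercial height (m):", commercial_height(example_logs))
print("Commercial volume (m^3):", round(tree_volume(example_logs), 3))
```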
This integration of field data provides robust estimates of harvested-tree yield and supports the evaluation of sustainable forest management practices [9,38]. A total of 2948 commercial trees from 26 species were felled on APU 8. From these, 1745 trees were selected for modeling. Selection was based on a maximum difference of ±20% between inventory-estimated and measured commercial heights, a threshold that reflects the visual nature of pre-harvest height estimates. The sum of measured log lengths was therefore used as the reference commercial height.
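The ±20% selection rule can be sketched as a simple filter. The text does not state which height serves as the denominator of the relative difference; the sketch below assumes the measured commercial height, and the field names and example records are hypothetical.

```python
def within_tolerance(estimated_m, measured_m, tolerance=0.20):
    """True when the inventory estimate lies within +/- tolerance of the measured height.

    Assumption: the relative difference is taken with respect to the measured height.
    """
    if measured_m <= 0:
        return False
    return abs(estimated_m - measured_m) / measured_m <= tolerance


def select_trees(trees, tolerance=0.20):
    """Keep only trees whose estimated and measured commercial heights agree."""
    return [
        t for t in trees
        if within_tolerance(t["estimated_hc"], t["measured_hc"], tolerance)
    ]


# Hypothetical records (heights in m)
trees = [
    {"id": 1, "estimated_hc": 11.0, "measured_hc": 10.0},  # +10%: kept
    {"id": 2, "estimated_hc": 13.0, "measured_hc": 10.0},  # +30%: dropped
    {"id": 3, "estimated_hc": 15.0, "measured_hc": 20.0},  # -25%: dropped
]
selected = select_trees(trees)
print([t["id"] for t in selected])
```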
We fitted Linear Mixed-Effects Models (LMM) and Random Forest (RF) models to analyze how each method represents the structural variability of this Amazonian forest [31]. Two modeling frameworks were evaluated: a parametric LMM and a non-parametric RF algorithm.
In the LMM, we treated species as a random effect. We compared a simpler linear model and a more complex quadratic model using a likelihood ratio test. The test indicated that the quadratic structure significantly improved model fit, so we included both DBH and DBH² as fixed predictors of commercial height. This specification captures the general DBH–height relationship within the observed DBH range while allowing species-specific deviations through the random intercept and slope. The general model is expressed as:
$$Hc_{ij} = (\beta_0 + b_{0j}) + (\beta_1 + b_{1j})\,DBH_{ij} + \beta_2\,DBH_{ij}^2 + \varepsilon_{ij}$$
in which $Hc_{ij}$ is the commercial height (m), $DBH_{ij}$ is the diameter at breast height (cm), $\beta_0$, $\beta_1$ and $\beta_2$ are fixed coefficients, $b_{0j}$ and $b_{1j}$ are species-specific random effects, and $\varepsilon_{ij}$ is the residual error, with $\varepsilon_{ij} \sim N(0, \sigma_j^2)$, for the i-th tree of the j-th species, where $\sigma_j^2$ is the residual variance adjusted for the j-th species.
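To illustrate how the fixed and random terms combine in a prediction, here is a minimal sketch of the model equation as a plain Python function. All coefficient values, the species labels and the `dbh_center` argument are hypothetical placeholders, not estimates from this study; a species absent from the random-effects table falls back to the population-level curve.

```python
def predict_commercial_height(dbh_cm, species, fixed, random_effects, dbh_center=0.0):
    """Predict commercial height (m) from the quadratic mixed-effects equation.

    fixed          -- dict with population-level coefficients b0, b1, b2
    random_effects -- dict mapping species -> (random intercept, random slope)
    dbh_center     -- value subtracted from DBH before squaring (0.0 = no centering)
    """
    b0_j, b1_j = random_effects.get(species, (0.0, 0.0))  # unseen species: fixed part only
    x = dbh_cm - dbh_center
    return (fixed["b0"] + b0_j) + (fixed["b1"] + b1_j) * x + fixed["b2"] * x ** 2


# Hypothetical coefficients, for illustration only
fixed = {"b0": 11.0, "b1": 0.14, "b2": -0.0005}
random_effects = {"species_A": (2.0, -0.01), "species_B": (-3.0, 0.02)}

print(predict_commercial_height(10.0, "species_A", fixed, random_effects))
print(predict_commercial_height(10.0, "unknown_species", fixed, random_effects))
```

Separating the species lookup from the fixed part mirrors the interpretation of the model: the population curve is shared, and each species only shifts the intercept and tilts the linear slope.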
The Random Forest model used 70% of the data for training and 30% for validation. Predictor variables included continuous DBH and categorical species. Commercial height was the response variable. Two basic hyperparameters controlled the RF model: (a) ntree, the number of trees grown from the bootstrap samples, and (b) mtry, the number of predictors randomly sampled as split candidates at each node of the trees. After some preliminary tests, we set ntree = 1000 and mtry = 1, a value which corresponds to one-third of the total number of variables, as recommended for regression by [46]. We encoded species using one-hot transformation to allow their inclusion in the model.
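The data preparation described above can be sketched with the Python standard library. This is not the authors' R workflow: the seed, the function names and the integer-percentage split are assumptions made for illustration. The `default_mtry_regression` helper reproduces the usual one-third rule for regression forests, max(floor(p/3), 1), which gives mtry = 1 for two predictors.

```python
import random


def one_hot(species_list):
    """One-hot encode a list of species labels; returns (levels, indicator rows)."""
    levels = sorted(set(species_list))
    rows = [[1 if s == level else 0 for level in levels] for s in species_list]
    return levels, rows


def train_validation_split(n, train_percent=70, seed=42):
    """Shuffle row indices and split them into training and validation sets."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility (assumed seed)
    cut = (n * train_percent) // 100
    return indices[:cut], indices[cut:]


def default_mtry_regression(n_predictors):
    """One-third rule for regression forests: max(floor(p / 3), 1)."""
    return max(n_predictors // 3, 1)


levels, rows = one_hot(["species_B", "species_A", "species_B"])
train_idx, valid_idx = train_validation_split(1745)
print(levels, rows)
print(len(train_idx), len(valid_idx))
print(default_mtry_regression(2))
```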
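We evaluated model performance using the correlation coefficient between observed and predicted values ($r_{y\hat{y}}$), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Percentage Deviation (MPD%), and Mean Bias (BIAS):
$$r_{y\hat{y}} = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2 \, \sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}}$$
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$
$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
$$MPD\% = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{y_i - \hat{y}_i}{y_i} \right) \times 100$$
$$BIAS = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i),$$
in which $y_i$ and $\hat{y}_i$ are observed and predicted commercial heights, respectively, $\bar{y}$ and $\bar{\hat{y}}$ are their means, and $n$ is the number of observations.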
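The five statistics can be computed directly from paired observed and predicted heights. The sketch below follows the definitions above, with residuals taken as observed minus predicted; the function name and the four-tree example are hypothetical.

```python
import math


def evaluate(observed, predicted):
    """Return r, RMSE, MAE, MPD% and BIAS for paired observed/predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    residuals = [o - p for o, p in zip(observed, predicted)]  # observed minus predicted

    s_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)

    return {
        "r": s_op / math.sqrt(s_oo * s_pp),
        "rmse": math.sqrt(sum(e ** 2 for e in residuals) / n),
        "mae": sum(abs(e) for e in residuals) / n,
        "mpd_pct": 100.0 * sum(e / o for e, o in zip(residuals, observed)) / n,
        "bias": sum(residuals) / n,
    }


# Hypothetical commercial heights (m)
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 15.0]
stats = evaluate(observed, predicted)
print({name: round(value, 4) for name, value in stats.items()})
```

Note that BIAS can be zero while MPD% is not, as in this example, because MPD% weights each residual by the inverse of the observed height.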
All analyses were conducted in R (version 4.4.2) [47]. We fitted the Linear Mixed-Effects Model using the nlme package (version 3.1.166) and implemented the Random Forest model and variable importance analyses using the randomForest package (version 4.7.1.2).
We verified all assumptions of the Linear Mixed-Effects Model (LMM). We assessed residual normality using Q–Q plots, the Shapiro–Wilk test, and visual inspection of skewness and heavy-tail behavior. We evaluated homoscedasticity by inspecting residuals versus fitted values and by examining species-level residual patterns. To address potential collinearity between DBH and DBH², we centered DBH at its overall mean before computing the squared term. We quantified remaining collinearity using the Variance Inflation Factor (VIF), which indicated acceptable levels for all predictors. All model diagnostics showed satisfactory behavior after these adjustments, supporting the suitability of the quadratic LMM within the observed DBH range.
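The effect of centering on collinearity can be illustrated with a small numerical sketch. With only two predictors, the VIF of each reduces to 1 / (1 − r²), where r is their Pearson correlation. The DBH values below are hypothetical and symmetric around their mean, so centering removes the correlation entirely; with real, skewed diameter distributions the reduction is typically large but not complete.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical DBH values (cm)
dbh = [50.0, 60.0, 70.0, 80.0, 90.0, 100.0, 110.0, 120.0]
dbh_mean = sum(dbh) / len(dbh)
dbh_centered = [d - dbh_mean for d in dbh]

vif_raw = vif_two_predictors(dbh, [d ** 2 for d in dbh])
vif_centered = vif_two_predictors(dbh_centered, [d ** 2 for d in dbh_centered])
print("VIF with raw DBH and DBH^2:", round(vif_raw, 2))
print("VIF with centered DBH and its square:", round(vif_centered, 2))
```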

3. Results

The performance statistics obtained during model fitting and validation appear in Table 1.
The analysis showed that the Linear Mixed-Effects Model surpassed the Random Forest in all performance metrics and demonstrated superior predictive ability for estimating the commercial height of harvested trees in APU 8. The correlation between observed and predicted values reached 0.77 for the mixed-effects model and 0.73 for the Random Forest. This difference indicates that the mixed-effects model captured the natural variability of the data more effectively, especially the structural effects introduced by species and stem quality classes.
Regarding precision, the mixed-effects model produced lower mean errors in both absolute terms (MAE = 2.33 m) and squared terms (RMSE = 2.95 m) compared to the Random Forest (MAE = 2.44 m and RMSE = 3.10 m). These results confirm that the mixed-effects model generated predictions that were more stable and closer to the observed measurements. This level of precision is crucial for applications such as harvest planning and timber yield estimation.
The Mean Percentage Deviation (MPD%) showed underestimation in both adjustments, with a smaller magnitude for the mixed-effects model (−2.62%) than for the Random Forest (−2.76%). The Linear Mixed-Effects Model (LMM), which incorporated species as a random effect, showed a bias of 0.002, indicating no systematic tendency to overestimate or underestimate. Similarly, the Random Forest algorithm produced a bias of −0.05, which is also very close to zero, revealing its robust performance even in the face of high floristic heterogeneity. Thus, both models produced unbiased predictions for commercial height estimation.
The estimated coefficients of the mixed-effects model confirmed a consistent quadratic–linear relationship between the predictor variables and the commercial tree heights in APU 8. The fixed effect associated with DBH was positive (0.142190, standard error 0.0328839), and the coefficient for DBH² was negative (−0.000499, standard error 0.0001653). The positive linear term and the negative quadratic term describe a concave growth curve, in which height increases with DBH but at a decreasing rate as the diameter becomes larger. The complete set of fixed effects appears in Table 2.
The results presented in Table 3 show that the random effects associated with the intercept and the linear slope within species captured the strong structural heterogeneity of the managed forest in APU 8. The model intercept was 11.196970 (standard error 1.7269491), representing the expected commercial height for the baseline DBH condition. This value expresses the central tendency of the dataset, while the random effects revealed species-specific deviations driven by morphological and architectural traits. The standard deviation of the random effects was 3.65 for the intercept, indicating the dispersion of species intercepts around the model intercept, and 0.027 for the linear slope. In addition, the negative correlation (−0.627) between the two shows that species with larger intercepts tend to have smaller slopes, meaning that species reaching greater average heights tend to exhibit lower rates of height increase as DBH enlarges.
The random intercepts and slopes highlighted marked differences among species. H. courbaril, C. guianensis, and C. catenaeformis showed the highest positive coefficients, all above 4, with notable contributions from H. courbaril (6.7646) and C. guianensis (5.1793). These species possess intrinsic structural characteristics that promote the formation of long and straight boles, which are highly valued for industrial timber production. In contrast, species such as L. pisonis (−6.0406), V. maxima (−5.5428), and P. psilostachya (−4.3135) exhibited pronounced negative coefficients. These patterns reflect morphological limitations, including crown expansion, early bifurcation, or stem irregularities, all of which reduce their potential to produce commercially desirable heights, even in individuals with large diameters.
Figure 2 presents the relationship between the observed and predicted commercial heights obtained from the Linear Mixed-Effects Model, together with the Willmott index (d), which quantifies the global agreement between predictions and observations. The Linear Mixed-Effects Model achieved a Willmott index of 0.861, indicating moderate predictive efficiency and showing that the model captures the overall structure of the height–diameter relationship in the dataset. The scatter plot shows a clear positive association between observed and predicted values, demonstrating that the model identifies the general tendency of increasing commercial height with increasing DBH.
However, the wide dispersion of points around the 1:1 line, particularly among trees with low and intermediate commercial heights, reveals substantial prediction uncertainty. This dispersion suggests that the Random Forest model loses precision for individuals that deviate from the central structure of the dataset. The larger spread of points at both extremes of observed height indicates that the model does not fully represent the structural variability among species, nor the effects of bole form and crown architecture on height allocation. This graphical behavior is consistent with the numerical performance metrics, which showed higher RMSE and MAE for the Random Forest compared to the Linear Mixed-Effects Model.
Taken together, the graphical evidence and the Willmott index confirm that, although the Random Forest captures the general trend of commercial height, it underperforms for extreme values and for species with strong architectural differentiation. These limitations reinforce the superior predictive accuracy of the Linear Mixed-Effects Model, which better represents the biological and structural gradients that drive commercial height in heterogeneous tropical forests.
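For reference, the Willmott index of agreement can be computed as sketched below. The text does not state which variant of the index was used; this sketch assumes the original form, d = 1 − Σ(Pᵢ − Oᵢ)² / Σ(|Pᵢ − Ō| + |Oᵢ − Ō|)², where O and P are observed and predicted values and Ō is the observed mean. The example data are hypothetical.

```python
def willmott_d(observed, predicted):
    """Willmott index of agreement (original form, assumed): 1 means perfect agreement."""
    mean_o = sum(observed) / len(observed)
    squared_error = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    potential_error = sum(
        (abs(p - mean_o) + abs(o - mean_o)) ** 2
        for o, p in zip(observed, predicted)
    )
    return 1.0 - squared_error / potential_error


# Hypothetical commercial heights (m)
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 15.0]
print(round(willmott_d(observed, predicted), 4))
```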
Figure 3 shows the distribution of commercial heights across all species and highlights the structural heterogeneity of the managed forest in APU 8. Figure 3a presents species-specific boxplots that display the full variability observed in commercial tree height. These distributions reveal marked differences among species, with some species showing narrow ranges of commercial height and others showing broad dispersion, which reflects contrasting architectural forms, growth strategies, and bole quality.
Figure 3b–d refine this interpretation by presenting commercial height distributions within three commercial volume groups. Figure 3b shows trees with volumes lower than 6 m³, which represent the largest share of individuals and show the greatest height dispersion relative to their commercial value. Figure 3c presents trees with intermediate volumes, from 6 to 10 m³, and shows a more concentrated distribution of commercial height, indicating a stronger relationship between height and diameter in this volume range. Figure 3d shows the largest trees, with volumes of 10 m³ or more, which display more consistent commercial heights but appear in much lower density.
Figure 3e provides a quantitative summary of the dataset and shows the proportion of trees in each commercial volume class. Group 1 (<6 m³) represents 46.1% of all individuals, Group 2 (6–10 m³) represents 36.7%, and Group 3 (≥10 m³) accounts for 17.2%. These proportions give a clearer understanding of the forest’s structural profile.
Together, the panels offer a comprehensive view of how commercial height varies across species and commercial volume groups. They highlight the predominance of smaller-volume trees in the forest and show how species-specific architecture influences the distribution of commercial height.
Figure 4 presents the species-specific curves estimated by the Linear Mixed-Effects Model. Figure 4a overlays the fitted curves on the observed commercial heights for all species, allowing direct comparison between predicted responses and empirical data. The combined visualization highlights the strong structural heterogeneity of the managed forest in APU 8, as each fitted trajectory follows a distinct shape and slope.
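The assignment of trees to the three commercial volume groups, and the resulting shares, can be sketched as follows. The boundary handling (a tree of exactly 10 m³ assigned to Group 3) follows the "≥10 m³" definition given in the Abstract; the function names and example volumes are hypothetical.

```python
from collections import Counter


def volume_group(volume_m3):
    """Commercial volume group: 1 (<6 m^3), 2 (6 to <10 m^3) or 3 (>=10 m^3)."""
    if volume_m3 < 6.0:
        return 1
    if volume_m3 < 10.0:
        return 2
    return 3


def group_shares(volumes_m3):
    """Percentage of trees falling into each volume group."""
    counts = Counter(volume_group(v) for v in volumes_m3)
    n = len(volumes_m3)
    return {group: 100.0 * counts.get(group, 0) / n for group in (1, 2, 3)}


# Hypothetical tree volumes (m^3)
volumes = [2.5, 5.9, 6.0, 9.9, 10.0]
print(group_shares(volumes))
```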
Figure 4b displays the fitted curves separately for each species. This representation makes clear that commercial height does not follow a monotonic pattern across species. For several species, such as H. courbaril, C. guianensis, and M. huberi, the model produces steep and consistently increasing curves, indicating substantial gains in commercial height with increasing DBH. In contrast, other species exhibit curves that plateau or even decrease after reaching a maximum point, as observed in V. maxima, P. psilostachya, and L. pisonis.
It is important to clarify that this decreasing pattern does not represent a biological decline in height. Commercial height reflects the sum of merchantable log lengths, which depends on operational criteria such as cutting decisions, bole defects, branching, and standards of log utilization. Because these operational factors can reduce the number or length of usable logs at larger diameters, the commercial height curve may decrease even though total tree height continues to increase biologically. The quadratic form of the model naturally captures this behavior, with the maximum commercial height occurring at the vertex of the curve, which can be obtained analytically from the first derivative.
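The vertex mentioned above follows from setting the first derivative of the species-level curve to zero. The short derivation below uses the symbols of the model equation; the numerical value at the end is only an illustration that assumes the reported fixed effects apply to mean-centered DBH in cm and that the random slope is zero (the population-level curve).

```latex
\begin{align*}
\frac{\partial Hc_{ij}}{\partial DBH_{ij}}
  &= (\beta_1 + b_{1j}) + 2\,\beta_2\,DBH_{ij} = 0
  \quad\Longrightarrow\quad
  DBH^{*}_{j} = -\,\frac{\beta_1 + b_{1j}}{2\,\beta_2}, \\[4pt]
Hc^{\max}_{j}
  &= (\beta_0 + b_{0j}) - \frac{(\beta_1 + b_{1j})^{2}}{4\,\beta_2},
  \qquad \text{a maximum because } \beta_2 < 0 . \\[4pt]
\text{Population level } (b_{1j} = 0):\quad
DBH^{*} &= \frac{0.142190}{2 \times 0.000499} \approx 142.5 \text{ cm on the centered scale.}
\end{align*}
```

Because the random slope enters the numerator, species with strongly negative slope deviations reach their vertex at smaller diameters, which is consistent with curves that plateau or decline within the observed DBH range.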
Overall, the species-specific panels show that each species presents a distinct allometric trajectory shaped by both biological attributes (architecture, crown form, bole taper) and operational factors that define commercial utilization. The mixed-effects model successfully accommodates this variability, confirming its ability to represent species-level structural patterns. This capability strengthens its applicability in forest management, particularly in harvest planning and timber yield estimation, where species differences directly influence commercial volume distribution.
Figure 5 presents the PCA biplot that summarizes the multivariate structure of the dataset and illustrates how trees cluster according to commercial volume classes. The first principal component (PC1), which explains most of the total variability (65.3%), is jointly driven by DBH, commercial height, log volume, and log production (i.e., the number of commercial logs obtained from each tree). Trees positioned on the positive side of PC1 generally have larger diameters, higher log volumes, and greater log production, whereas negative PC1 scores correspond to smaller trees with reduced commercial yield. The second principal component (PC2) captures additional variation that is primarily associated with commercial height. The orientation of the commercial height vector shows a positive contribution to both PC1 and PC2, indicating that taller trees tend to occupy intermediate to high structural positions within the PCA space.
Three distinct clusters appear in accordance with the commercial volume groups. Group 1 (<6 m³), shown in red, lies on the left side of the biplot and includes trees with smaller DBH, fewer commercial logs, and lower commercial heights. Group 2 (6–10 m³), shown in blue, occupies the central region and represents intermediate structural conditions, with moderate values for DBH, commercial height, and log production. Group 3 (≥10 m³), shown in green, appears on the right side of the PCA space and comprises trees with the largest diameters, highest log volumes, and the greatest log production.
The arrangement of individuals along PC1 and PC2 demonstrates that DBH, commercial height, and log production jointly define the main structural gradient of harvested trees in APU 8. The PCA provides a clear visualization of how these variables interact and how they determine the commercial volume classes observed in the forest.

4. Discussion

Previous studies have shown that mixed-effects models perform strongly in forest modeling [48,49,50], particularly when data present hierarchical organization and substantial structural variability [51,52]. Mixed-effects models respond effectively to variation among species and sites without requiring species-specific calibrations [53]. This advantage is especially relevant for tropical forests, where taxonomic diversity and structural heterogeneity create conditions that challenge traditional modeling approaches. Including random effects improves the accuracy of height–diameter equations in uneven-aged forests, a common condition across the Brazilian Amazon [54,55]. These findings align with the results of [56], in which treating species as a random effect enhanced allometric predictions in highly diverse ecosystems.
Although the Random Forest (RF) algorithm showed lower predictive accuracy than the mixed-effects model in this study, it demonstrated robustness and flexibility. When investigating the potential and accuracy of the relationships between metrics derived from LiDAR and forest inventory data for predicting above-ground biomass in Amazonian forests under selective cutting regimes, ref. [57] obtained better results using a Generalized Linear Model (GLM) than with different machine learning techniques, including Random Forest.
Random Forest can handle nonlinear relationships, interactions and multicollinearity, which are frequent characteristics of ecological data [34,58]. In their study of tropical forest, ref. [59] confirmed the effectiveness of the Random Forest algorithm for estimating forest structural parameters using field and remote sensing data in Paragominas, Brazil. Their results support our conclusion that Random Forest can accurately model forest attributes in complex, biodiverse environments. However, there are limitations in predicting extreme values, as demonstrated by the dispersion patterns in Figure 4b of this study.
Ref. [60] provided a comprehensive review of the strengths of this method for ecological applications. Similarly, refs. [61,62] emphasized the predictive advantages of Random Forest in complex settings where parametric assumptions are difficult to meet. In the present study, the algorithm captured the general trend of commercial height, but the dispersion pattern in Figure 2 confirms lower precision at the extremes and reflects its higher RMSE and MAE values.
Interpretability also plays a central role when choosing a predictive approach for forest management: the choice between LMM and Random Forest should consider not only predictive accuracy but also how interpretable the model is. Machine-learning algorithms often lack transparency regarding the biological and ecological mechanisms involved in predictions [63]. Mixed-effects models, in contrast, provide explicit estimates of fixed and random components, which allow researchers and managers to understand how species and other structural factors influence the modeled attribute [64]. This interpretability is essential for applied forestry, where management decisions depend on understanding not only prediction outcomes but also the underlying ecological processes. Ref. [65] suggested that hybrid modeling strategies may combine the interpretability of statistical models with the flexibility of machine learning, which could lead to even better predictive performance while maintaining ecological transparency.
The decision to center DBH before computing the quadratic term improved both the numerical stability and the interpretability of the mixed-effects model. Centering reduces the collinearity between DBH and DBH², produces more stable coefficient estimates, and ensures that the parameters retain biological meaning. This procedure also prevents unrealistic model behavior, such as implausible intercepts or curvature patterns that can arise when uncentered polynomial terms are used.
Studies in ecological regression demonstrate that centering and rescaling continuous predictors enhance model robustness and improve the interpretation of regression coefficients in both simple and multilevel frameworks [66,67,68]. By incorporating centered DBH, the present study aligns with these best practices and provides species-specific curves that are more coherent and analytically reliable.
The practical implications of these findings for SFM in the Amazon are significant. Accurate predictions of commercial height can enhance forest inventory precision, improve operational planning, and optimize timber yield estimations. This contributes to reducing waste and environmental impacts. Integrating advanced modeling techniques, such as mixed-effects models and machine learning, is a promising way to modernize forest management and ensure exploitation aligns with sustainability objectives.
This methodological approach strengthens the relationship between diameter and commercial height modelled in this study, providing a solid basis for future studies seeking to represent the allometric structure or commercial attributes of heterogeneous tropical forests. However, we recognize that further studies are needed to validate the methodology in environments with different floristic compositions and to improve it for use in other Amazonian environments. Furthermore, future studies should continue to explore techniques such as machine learning, which are capable of handling the complex data structures found in this study.

5. Conclusions

We identified three structural groups among the species, each defined by distinct patterns of stem form, crown architecture and growth strategies. The first group included species with slender individuals and lower commercial heights, which tend to respond more strongly to competition. The second group consisted of species with intermediate structural conformation and a more stable diameter and height relationship. The third group comprised species of larger stature, whose individuals often reach upper canopy positions and exhibit greater commercial height. This grouping showed that the hypsometric structure of the forest depends primarily on interspecific differences rather than on individual variability alone.
The Principal Component Analysis supported this interpretation by demonstrating that DBH, slenderness ratio and total height contributed most to distinguishing species. The organization of species in the factorial plane revealed consistent structural gradients and confirmed that the stand is not homogeneous. These multivariate results indicate that growth allocation strategies and stem morphology explain much of the variation observed in commercial height.
The evaluation of the predictive models emphasized the influence of this structural complexity. The Random Forest model captured the general pattern of increasing height with increasing diameter, but it produced wide dispersion across the species groups and showed higher RMSE and MAE values. Its Willmott index of 0.861 indicated only moderate agreement with observed data and revealed that the model could not represent the structural variability of the stand with sufficient precision.
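As a reference for the agreement statistics mentioned above, the following sketch computes RMSE, MAE and the Willmott index of agreement. It assumes the standard definitions, with the Willmott index taken as d = 1 − Σ(P − O)² / Σ(|P − Ō| + |O − Ō|)², where O are observed values, P predicted values and Ō the observed mean. The height values are hypothetical and the function names are chosen only for illustration.

```python
import math

def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)

def willmott_d(obs, pred):
    """Willmott index of agreement (1 means perfect agreement)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
              for o, p in zip(obs, pred))
    return 1.0 - num / den

# Hypothetical observed and predicted commercial heights in metres.
observed = [10.0, 12.0, 14.0, 16.0, 18.0]
predicted = [11.0, 12.0, 13.0, 17.0, 18.0]

print("RMSE:", round(rmse(observed, predicted), 3))
print("MAE: ", round(mae(observed, predicted), 3))
print("d:   ", round(willmott_d(observed, predicted), 3))
```

Because RMSE squares the residuals, it is never smaller than MAE, and a widening gap between the two points to a few large errors, such as those at the extremes of the height range.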
The Linear Mixed-Effects Model showed better performance because it incorporated variation among species. This model accounted for differences in height allocation and stem form. As a result, it reduced prediction errors and generated more consistent estimates across the full diameter range. Its behavior also agreed with the structural patterns revealed by the cluster analysis and the Principal Component Analysis, which demonstrates that hierarchical models are more suitable for forests with high species diversity and complex structure.
The combined evidence from structural grouping, the multivariate evaluation and the predictive modeling shows that reliable estimation of commercial height in managed Amazonian forests requires explicit representation of interspecific variability. Mixed-effects models provide more accurate estimates for volume calculation, biomass assessment and planning in concession areas. These findings support the use of hierarchical modeling approaches as a priority in continuous forest inventories and in decision-making processes for forest management in the Amazon.

Author Contributions

Conceptualization, R.B.d.S.R. and L.G.M.; methodology, R.B.d.S.R. and L.G.M.; statistical analysis, R.B.d.S.R. and L.G.M.; software implementation, L.P.R., M.N.d.M. and C.T.d.S.D.; validation of statistical outputs and biological consistency, R.B.d.S.R., L.G.M. and A.H.M.O.; investigation, R.B.d.S.R., L.G.M., L.P.R., M.N.d.M. and C.T.d.S.D.; resources, A.P.F.W. and A.H.M.O.; data organization, processing, and curation, R.B.d.S.R. and L.G.M.; writing (original draft preparation), R.B.d.S.R. and L.G.M.; writing (review and editing), R.B.d.S.R., L.G.M., L.P.R., M.N.d.M. and C.T.d.S.D.; visualization and graphical development, L.P.R., M.N.d.M. and C.T.d.S.D.; supervision, R.B.d.S.R. and L.G.M.; project administration, L.G.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors express their sincere appreciation to Cooperativa Mista da Flona Tapajós (COOMFLONA) for providing the essential field data that made this study possible. The authors also honor João Ricardo Vasconcellos Gama (in memoriam) for his meaningful initial contributions and for the inspiration he continues to provide through his lifelong dedication to forest science.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Malhi, Y.; Roberts, J.T.; Betts, R.A.; Killeen, T.J.; Li, W.; Nobre, C.A. Climate change, deforestation and the fate of the Amazon. Science 2008, 319, 169–172.
  2. Aragão, L.E.O.C.; Malhi, Y.; Barbier, N.; Lima, A.; Shimabukuro, Y.; Anderson, L.; Costa, M. Environmental change and the carbon balance of Amazonian forests. Biol. Rev. 2014, 89, 913–931.
  3. Zemp, D.C.; Schleussner, C.F.; Barbosa, H.M.J.; Rammig, A. Deforestation effects on Amazon forest resilience. Geophys. Res. Lett. 2017, 44, 6182–6190.
  4. Lovejoy, T.E.; Nobre, C. Amazon tipping point: Last chance for action. Sci. Adv. 2019, 5, eaba2949.
  5. Vanclay, J.K. Modelling Forest Growth and Yield: Applications to Mixed Tropical Forests; CAB International: Wallingford, UK, 1994.
  6. Putz, F.E.; Zuidema, P.A.; Pinard, M.A.; Boot, R.G.A.; Sayer, J.A.; Sheil, D.; Sist, P.; Elias, M.; Vanclay, J.K. Improved tropical forest management for carbon retention. PLoS Biol. 2008, 6, e166.
  7. Sist, P.; Ferreira, F.N. Sustainability of reduced-impact logging in the Eastern Amazon. For. Ecol. Manag. 2007, 243, 199–209.
  8. Campos, J.C.C.; Leite, H.G. Mensuração Florestal: Perguntas e Respostas, 5th ed.; Editora UFV: Viçosa, MG, Brazil, 2017; p. 636.
  9. Putz, F.E.; Zuidema, P.A.; Synnott, T.; Peña-Claros, M.; Pinard, M.A.; Sheil, D.; Vanclay, J.K.; Sist, P.; Gourlet-Fleury, S.; Griscom, B.; et al. Sustaining conservation values in selectively logged tropical forests: The attained and the attainable. Conserv. Lett. 2012, 5, 296–303.
  10. Wang, Y.; Lehtomäki, M.; Liang, X.; Pyörälä, J.; Kukko, A.; Jaakkola, A.; Liu, J.; Feng, Z.; Chen, R.; Hyyppä, J. Is field-measured tree height as reliable as believed—A comparison study of tree height estimates from field measurement, airborne laser scanning and terrestrial laser scanning in a boreal forest. ISPRS J. Photogramm. Remote Sens. 2019, 147, 132–145.
  11. Mugasha, W.A.; Bollandsås, O.M.; Eid, T. Relationships between diameter and height of trees in natural tropical forest in Tanzania. South. For. 2013, 75, 221–237.
  12. Marques, P.A.; Romarco, M.L.; Cardoso, J.F.; Vilas-Boas, M.N.; Azevedo, L.R.; Silva, A.V.S. Estimativa de altura por meio de modelos hipsométricos de um povoamento de Schizolobium amazonicum Huber ex Ducke (Fabaceae) em Minas Gerais. Série Técnica IPEF 2023, 26, 554–558.
  13. Jayaraman, K.; Zakrzewski, W.T. Practical approaches to calibrating height-diameter relationships for natural sugar maple stands in Ontario. For. Ecol. Manag. 2001, 148, 169–177.
  14. Silva, G.F.; Curto, R.A.; Soares, C.P.B.; Piassi, L.C. Avaliação de métodos de medição de altura em florestas naturais. Rev. Árvore 2012, 36, 341–348.
  15. Silva, L.B.D.; Morais, V.A.; Caetano, M.G.; Bernardes, L.F.G.M. Equações para estimativa volumétrica de espécies arbóreas da Amazônia. Rev. De Ciências Agroambientais 2020, 18, 16–26.
  16. Silva, I.C.O.; Garlet, J.; Morais, V.A.; Araújo, E.J.G.; Silva, J.R.O.; Curto, R.A. Equations and form factor by species increase the precision and accuracy for estimating tree volume in the Amazon. Floresta 2022, 52, 268–276.
  17. Rutishauser, E.; Noor’an, F.; Laumonier, Y.; Halperin, J.; Rufi’ie, H.K.; Verchot, L. Generic allometric models including height best estimate forest biomass and carbon stocks in Indonesia. For. Ecol. Manag. 2013, 307, 219–225.
  18. Romero, F.M.B.; Jacovine, L.A.G.; Ribeiro, S.C.; Torres, C.M.M.E.; Silva, L.F.; Gaspar, R.O.; Rocha, S.J.S.S.; Staudhammer, C.L.; Fearnside, P.M. Allometric equations for volume, biomass and carbon in commercial stems harvested in a managed forest in the southwestern Amazon: A case study. Forests 2020, 11, 874.
  19. Curto, R.A.; Silva, G.F.; Soares, C.P.B.; Martins, L.T.; David, H.C. Métodos de estimação de altura de árvores em Floresta Estacional Semidecidual. Floresta 2013, 45, 105–116.
  20. Silva, J.C.; Mendonça, A.R.; Silva, G.F.; Curto, R.A.; Figueiredo, L.T.M.; Silva, M.L.M. Métodos de medição da altura comercial de árvores na região Amazônica. Sci. For. 2019, 47, 588–598.
  21. Gomes, K.M.A.; Silva-Ribeiro, R.B.; Gama, J.R.V.; Andrade, D.F.C. Eficiência na estimativa volumétrica de madeira na Floresta Nacional do Tapajós. Nativa 2018, 6, 170–176.
  22. Cardoso, R.M.; Miguel, E.P.; Souza, H.J.; Souza, A.N.; Nascimento, R.G.M. Wood volume is overestimated in the Brazilian Amazon: Why not use generic volume prediction methods in tropical forest management? J. Environ. Manag. 2024, 350, 119593.
  23. Frutuoso, L.M.S.; Almeida, D.M.; Ucella-Filho, J.C.M.; Barbosa-Júnior, G.S.; Canto, J.L. Métodos de medição de altura em fragmento de Floresta Estacional Decidual. Nativa 2020, 8, 610–614.
  24. Curto, R.A.; Loureiro, G.H.; Môra, R.; Miranda, R.O.V.; Péllico-Neto, S.; Silva, G.F. Relações hipsométricas em floresta estacional semidecidual. Rev. De Ciências Agrárias 2014, 57, 57–66.
  25. Machado, S.A.; Figueiredo-Filho, A. Dendrometria. Curitiba; Universidade Federal do Paraná: Curitiba, Brazil, 2003; p. 309.
  26. Scolforo, J.R. Biometria Florestal, 1st ed.; UFLA: Lavras, Brazil, 2006; p. 352.
  27. Oliveira, X.M.; Mayrinck, R.C.; Silva, G.C.C.; Ferraz-Filho, A.C.; Mello, J.M. Modelo de estimativa de volume e carbono por hectare para fragmentos de cerrado sensu stricto em Minas Gerais. Enciclopédia Biosf. 2016, 13, 801–811.
  28. Barcik, L.Z.; Ruiz, E.C.Z.; Mussio, C.F.; Garrett, A.T.A. Relações hipsométricas para altura comercial de Araucaria angustifolia (Bertol.) Kuntze em fragmentos de Floresta Ombrófila Mista no Paraná. Braz. J. Anim. Environ. Res. 2023, 6, 679–691.
  29. Hiramatsu, N.A. Equações de Volume Comercial para Espécies Nativas na Região do Vale do Jari, Amazônia Oriental. Master’s Thesis, Universidade Federal do Paraná, Curitiba, Brazil, 2008; p. 107.
  30. Silva-Ribeiro, R.B.; Gama, J.R.V.; Melo, L.O. Seccionamento para cubagem e escolha de equações de volume para a Floresta Nacional do Tapajós. Cerne 2014, 20, 605–612.
  31. Sharma, R.P.; Vacek, Z.; Vacek, S. Nonlinear mixed-effects height–diameter model for mixed-species forests in the central part of the Czech Republic. J. For. Sci. 2016, 62, 470–484.
  32. Zhou, X.; Kutchartt, E.; Hernández, J.; Corvalán, P.; Promis, A.; Zwanzig, M. Determination of optimal tree height models and calibration designs for Araucaria araucana and Nothofagus pumilio in mixed stands affected to different levels by anthropogenic disturbance in South-Central Chile. Ann. For. Sci. 2023, 80, 18.
  33. Teshome, M.; Braz, E.M.; Torres, C.M.M.E.; Raptis, D.I.; Mattos, P.P.; Temesgen, H.; Rubio-Camacho, E.A.; Sileshi, G.W. Mixed-effects height prediction model for Juniperus procera trees from a dry Afromontane forest in Ethiopia. Forests 2024, 15, 443.
  34. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  35. Marçal, M.F.M.; Souza, Z.M.; Tavares, R.L.M.; Farhate, C.V.V.; Oliveira, S.R.M.; Galindo, F.S. Predictive models to estimate carbon stocks in agroforestry systems. Forests 2021, 12, 1240.
  36. Lima, E.S.; Souza, Z.M.; Oliveira, S.R.M.; Montanari, R.; Farhate, C.V.V. Random forest model to predict the height of Eucalyptus. Eng. Agrícola 2022, 42, e20210153.
  37. Nepstad, D.C.; de Carvalho, C.R.; Davidson, E.A.; Jipp, P.H.; Lefebvre, P.A.; Negreiros, G.H.; da Silva, E.D.; Stone, T.A.; Trumbore, S.E.; Vieira, S. The role of deep roots in the hydrological and carbon cycles of Amazonian forests and pastures. Nature 1994, 372, 666–669.
  38. Barlow, J.; Lennox, G.D.; Ferreira, J.; Berenguer, E.; Lees, A.C.; Mac Nally, R.; Thomson, J.R.; Ferraz, S.F.D.B.; Louzada, J.; Oliveira, V.H.F.; et al. Anthropogenic disturbance in tropical forests can double biodiversity loss from deforestation. Nature 2016, 535, 144–147.
  39. Instituto Chico Mendes de Conservação da Biodiversidade (ICMBio). Plano de Manejo da Floresta Nacional do Tapajós—Volume I: Diagnóstico, 1st ed.; Ministério do Meio Ambiente: Brasília, Brazil, 2019; p. 316.
  40. Martorano, L.G.; Pereira, L.C.; Cesar, E.G.M.; Pereira, I.C.B. Estudos Climáticos do Estado do Pará, Classificação Climática (Köppen) e Deficiência Hídrica (Thornthwaite, Mather); SUDAM; Embrapa-SNLCS: Belém, PA, Brazil, 1993; p. 53.
  41. Martorano, L.G.; Brienza-Júnior, S.; Lisboa, L.S.S.; Moraes, J.R.S.; Aparecido, L.E.O.; Dias, C.T.S. A methodological proposal for topoclimatic zoning of native forest species in the Brazilian Amazon. Theor. Appl. Climatol. 2025, 156, 156–249.
  42. Martorano, L.G.; Soares, W.B.; Moraes, J.R.S.C.; Nascimento, W.; Aparecido, L.E.O.; Villa, P.M. Climatology of air temperature in Belterra: Thermal regulation ecosystem services provided by the Tapajós National Forest in the Amazon. Rev. Bras. De Meteorol. 2021, 36, 327–337.
  43. Instituto Brasileiro de Geografia e Estatística (IBGE). Manual Técnico da Vegetação Brasileira, 2nd ed.; Ministério do Planejamento, Orçamento e Gestão: Brasília, Brazil, 2012; p. 272.
  44. Andrade, D.F.C.; Braga, C.R.; Silva, J.R.; Chaves, A.R.S. Do mil ao milhão: Estudo de caso do manejo florestal comunitário na Floresta Nacional do Tapajós. Biodiversidade Bras. 2022, 12, 5–17.
  45. Barreto, P.; Pinto, A.; Brito, B.; Hayashi, S. Quem é o dono da Amazônia? Uma Análise do Desmatamento e da Ocupação em Terras Públicas e Privadas na Região Amazônica; IMAZON: Belém, PA, Brazil, 2016; p. 108.
  46. Liaw, A.; Wiener, M. Classification and regression by randomForest. R News 2002, 2, 18–22. Available online: https://journal.r-project.org/articles/RN-2002-022 (accessed on 21 December 2025).
  47. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2025; Available online: https://www.R-project.org (accessed on 5 April 2025).
  48. Vismara, E.S. Avaliação da Construção e Aplicação de Modelos Florestais de Efeitos Fixos e Efeitos Mistos sob o Ponto de Vista Preditivo. Ph.D. Thesis, Universidade de São Paulo, Piracicaba, Brazil, 2013; p. 107.
  49. Chen, Q.; Lu, D.; Keller, M.; dos-Santos, M.N.; Bolfe, E.L.; Feng, Y.; Wang, C. Modeling and Mapping Agroforestry Aboveground Biomass in the Brazilian Amazon Using Airborne Lidar Data. Remote Sens. 2016, 8, 21.
  50. Lussetti, D.; Kuljus, K.; Ranneby, B.; Ilstedt, U.; Falck, J.; Karlsson, A. Using linear mixed models to evaluate stand level growth rates for dipterocarps and Macaranga species following two selective logging methods in Sabah, Borneo. For. Ecol. Manag. 2019, 437, 372–379.
  51. Di Cosmo, L.; Giuliani, D.; Dickson, M.M.; Gasparini, P. An individual-tree linear mixed-effects model for predicting the basal area increment of major forest species in Southern Europe. For. Syst. 2020, 29, e019.
  52. Réjou-Méchain, M.; Flores, O.; Bourland, N.; Doucet, J.L.; Fétéké, R.F.; Pasquier, A.; Hardy, O.J. Spatial aggregation of tropical trees at multiple spatial scales. J. Ecol. 2011, 99, 1373–1381.
  53. Eerikäinen, K. A multivariate linear mixed-effects model for the generalization of sample tree heights and crown ratios in the Finnish National Forest Inventory. For. Sci. 2009, 55, 480–493.
  54. Vibrans, A.C.; Moser, P.; Oliveira, L.Z.; Maçaneiro, J.P. Height-Diameter models for three subtropical forest types in Southern Brazil. Ciênc. Agrotec. 2015, 39, 205–215.
  55. Fu, L.; Lei, X.; Sharma, R.P.; Li, H.; Zhu, G.; Hong, L.; You, L.; Duan, G.; Lei, Y.; Li, Y.; et al. Comparing height-age and height-diameter modelling approaches for estimating site productivity of natural uneven-aged forests. Forestry 2018, 91, 419–433.
  56. Wang, T.Y.; Lam, T.Y. Modelling height–diameter relationship of fifteen tree species planted on reclaimed agricultural lands with random species effects. Trop. For. 2022, 1053, e012013.
  57. Schuh, M.; Favarin, J.A.S.; Marchesan, J.; Alba, E.; Berra, E.F.; Pereira, R.S. Machine learning and generalized linear model techniques to predict aboveground biomass in Amazon rainforest using LiDAR data. J. Appl. Remote Sens. 2020, 14, 034518.
  58. Chai, Z.; Zhao, C. Enhanced Random Forest with concurrent analysis of static and dynamic nodes for industrial fault classification. IEEE Trans. Ind. Inform. 2020, 16, 54–66.
  59. Marchesan, J.; Alba, E.; Schuh, M.S.; Favarin, J.A.S.; Pereira, R.S. Aboveground biomass estimation in a tropical forest with selective logging using Random Forest and LiDAR data. Floresta 2020, 50, 1873–1882.
  60. Cutler, D.R.; Edwards, T.C.; Beard, K.H.; Cutler, A.; Hess, K.T.; Gibson, J.; Lawler, J.J. Random forests for classification in ecology. Ecology 2007, 88, 2783–2792.
  61. Biau, G.; Scornet, E. A random forest guided tour. Test 2016, 25, 197–227.
  62. Simon, S.M.; Glaum, P.; Valdovinos, F.S. Interpreting random forest analysis of ecological models to move from prediction to explanation. Sci. Rep. 2023, 13, 3881.
  63. Tavares, R.L.M.; Oliveira, S.R.M.; Barros, F.M.M.; Farhate, C.V.V.; Souza, Z.M.; Scala-Junior, N.L. Prediction of soil CO2 flux in sugarcane management systems using the random forest approach. Sci. Agric. 2018, 75, 281–287.
  64. Babanezhad, M. Different Algorithms in Mixed Effect Models Estimation Approach. J. Appl. Math. Bioinform. 2012, 2, 23–34.
  65. Miranda, E.N.; Barbosa, B.H.G.; Silva, S.H.G.; Monti, C.A.U.; Tng, D.Y.P.; Gomide, L.R. Variable selection for estimating individual tree height using genetic algorithm and random forest. For. Ecol. Manag. 2022, 504, 119828.
  66. Schielzeth, H. Simple Means to Improve the Interpretability of Regression Coefficients. Methods Ecol. Evol. 2010, 1, 103–113.
  67. Gelman, A. Scaling Regression Inputs by Dividing by Two Standard Deviations. Stat. Med. 2008, 27, 2865–2873.
  68. Harrison, X.A.; Donaldson, L.; Correa-Cano, M.E.; Evans, J.; Fisher, D.N.; Goodwin, C.E.D.; Robinson, B.S.; Hodgson, D.J.; Inger, R. A Brief Introduction to Mixed Effects Modelling and Multi-Model Inference in Ecology. PeerJ 2018, 6, e4794.
Figure 1. Location of the study area within the Tapajós National Forest, showing the Annual Production Unit (APU), Belterra, Brazil. Source: Authors, 2025.
Figure 2. Scatterplot of the relationship between observed and predicted commercial heights generated by the Linear Mixed-Effects Model, with predictive agreement quantified by the Willmott index (d = 0.861). Source: Authors, 2025.
Figure 3. Species-level distribution of commercial height, shown as overall boxplots (a), boxplots by commercial volume group (b–d), and a donut chart of the proportional representation of each volume class (e). Group 1 (<6 m³), Group 2 (6–10 m³), and Group 3 (>10 m³) account for 46.1%, 36.7%, and 17.2% of the sampled trees, respectively. Source: Authors, 2025.
Figure 4. Scatterplots and fitted lines of the species-specific relationships between DBH and commercial height predicted by the Linear Mixed-Effects Model, shown as combined curves (a) and individual species panels (b). Source: Authors, 2025.
Figure 5. Biplot of Principal Component Analysis showing the multivariate gradients of harvested trees and their clustering into commercial volume classes. Source: Authors, 2025.
Table 1. Performance statistics of the Linear Mixed-Effects Model and the Random Forest model for commercial height estimation.

| Statistics | Linear Mixed-Effects Model | Random Forest |
|---|---|---|
| r_{yŷ} | 0.77 | 0.73 |
| RMSE | 2.95 | 3.10 |
| MAE | 2.33 | 2.44 |
| MPD% | −2.62 | −2.76 |
| BIAS | 0.002 | −0.05 |
Table 2. Estimated coefficients and standard errors for the fixed effects of the Linear Mixed-Effects Model.

| Type of Effect | Group or Variable | Parameter or Levels | Estimate | Standard Error |
|---|---|---|---|---|
| Fixed Effect | Intercept | β0 | 11.196970 | 1.7269491 |
| | DBH | β1 | 0.142190 | 0.0328839 |
| | DBH² | β2 | −0.000499 | 0.0001653 |
Table 3. Species-specific random effects and adjusted coefficients estimated by the Linear Mixed-Effects Model.

Random Effects

| Species | Representation | b0 (Rand. Intercept) | b1 (Rand. Slope) | (β0 + b0ⱼ) | (β1 + b1ⱼ) |
|---|---|---|---|---|---|
| Alexa grandiflora Ducke | Ag | −1.5227 | −0.0127 | 9.6743 | 0.1295 |
| Astronium lecointei Ducke | Al | 2.7863 | 0.0062 | 13.9832 | 0.1484 |
| Apuleia moralis Spruce ex Benth. | Am | −3.3846 | 0.0134 | 7.8124 | 0.1556 |
| Brosimum acutifolium Huber | Ba | −0.8260 | 0.0086 | 10.3709 | 0.1508 |
| Bagassa guianensis Aubl. | Bg | 2.0093 | −0.0452 | 13.2063 | 0.0970 |
| Buchenavia huberi Ducke | Bh | −1.6343 | −0.0034 | 9.5627 | 0.1388 |
| Cedrelinga catenaeformis Ducke | Cc | 4.6445 | −0.0260 | 15.8414 | 0.1162 |
| Couratari guianensis Aubl. | Cg | 5.1793 | −0.0180 | 16.3763 | 0.1242 |
| Cedrela odorata L. | Co | −1.2278 | 0.0043 | 9.9692 | 0.1465 |
| Diplotropis purpurea (Rich.) Amshoff | Dp | 1.1351 | −0.0081 | 12.3321 | 0.1340 |
| Hymenaea courbaril L. | Hc | 6.7646 | −0.0044 | 17.9616 | 0.1378 |
| Hymenolobium petraeum Ducke | Hp | −1.2825 | 0.0010 | 9.9145 | 0.1432 |
| Hymenaea parvifolia Huber | Hpa | 2.8247 | −0.0028 | 14.0216 | 0.1394 |
| Handroanthus serratifolius (Vahl) Nichols. | Hs | 1.8504 | 0.0106 | 13.0473 | 0.1528 |
| Lecythis lurida (Miers) S.A. Mori | Ll | −1.5838 | 0.0158 | 9.6132 | 0.1580 |
| Lecythis pisonis Cambess. | Lp | −6.0406 | 0.0235 | 5.1564 | 0.1656 |
| Manilkara huberi (Ducke) Chevalier | Mh | −2.2909 | 0.0280 | 8.9061 | 0.1702 |
| Mezilaurus itauba (Meisn.) Taub. ex Mez | Mi | −0.8588 | 0.0216 | 10.3381 | 0.1638 |
| Ocotea baturitensis Vattimo | Ob | 3.3957 | −0.0149 | 14.5927 | 0.1272 |
| Parkia multijuga Benth. | Pm | −1.0481 | 0.0057 | 10.1488 | 0.1479 |
| Pseudopiptadenia psilostachya (Benth.) G.P. Lewis & L. Rico | Pp | −4.3135 | −0.0076 | 6.8834 | 0.1346 |
| Schizolobium amazonicum Huber ex Ducke | Sam | 0.9222 | 0.0030 | 12.1192 | 0.1452 |
| Swartzia laurifolia Benth. | Sl | −0.5173 | 0.0075 | 10.6797 | 0.1497 |
| Trattinnickia rhoifolia Willd. | Tr | 0.3874 | −0.0256 | 11.5844 | 0.1166 |
| Vochysia maxima Ducke | Vm | −5.5428 | 0.00063 | 5.6542 | 0.1428 |
| Vatairea paraensis Ducke | Vp | 0.1743 | 0.0190 | 11.3713 | 0.1612 |
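To show how the coefficients in Tables 2 and 3 combine into a species-specific prediction, the sketch below assumes the model has the form h = (β0 + b0ⱼ) + (β1 + b1ⱼ)·DBHc + β2·DBHc², where DBHc is DBH minus a centering constant. The centering constant is not reported in this section, so `ASSUMED_DBH_MEAN` is a hypothetical placeholder, as are the function name and the units (DBH in cm, height in m). Only three species are included for brevity.

```python
# Fixed effects as reported in Table 2.
BETA0 = 11.196970   # intercept
BETA1 = 0.142190    # linear term for centered DBH
BETA2 = -0.000499   # quadratic term for centered DBH

# Species random effects (b0, b1) as reported in Table 3, for three species.
RANDOM_EFFECTS = {
    "Ag": (-1.5227, -0.0127),  # Alexa grandiflora
    "Hc": (6.7646, -0.0044),   # Hymenaea courbaril
    "Lp": (-6.0406, 0.0235),   # Lecythis pisonis
}

def predict_height(species, dbh, dbh_mean):
    """Predicted commercial height for one tree of the given species.

    dbh_mean is the centering constant; it is an assumption here because
    its value is not given in this section.
    """
    b0, b1 = RANDOM_EFFECTS[species]
    x = dbh - dbh_mean  # centered DBH
    return (BETA0 + b0) + (BETA1 + b1) * x + BETA2 * x ** 2

# Hypothetical centering constant in cm (NOT a value from the study).
ASSUMED_DBH_MEAN = 80.0

for code in RANDOM_EFFECTS:
    height = predict_height(code, 95.0, ASSUMED_DBH_MEAN)
    print(code, round(height, 2))
```

At a DBH equal to the centering constant the prediction reduces to the adjusted intercept (β0 + b0ⱼ), which is why centering makes the species intercepts directly interpretable as heights of an average-sized tree.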
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
