Article

L2: Accurate Forestry Time-Series Completion and Growth Factor Inference

Linlu Jiang, Meng Yang, Benye Xi, Weiliang Meng and Jie Duan

1 School of Information Science and Technology, Beijing Forestry University (BFU), Beijing 100083, China
2 Engineering Research Center for Forestry-Oriented Intelligent Information Processing of National Forestry and Grassland Administration, Beijing Forestry University (BFU), Beijing 100083, China
3 Ministry of Education Key Laboratory of Silviculture and Conservation, Beijing Forestry University (BFU), Beijing 100083, China
4 State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
5 School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 101408, China
* Author to whom correspondence should be addressed.
Forests 2025, 16(6), 895; https://doi.org/10.3390/f16060895
Submission received: 17 March 2025 / Revised: 29 April 2025 / Accepted: 22 May 2025 / Published: 26 May 2025
(This article belongs to the Special Issue Application of Machine-Learning Methods in Forestry)

Abstract

In forestry data management and analysis, data integrity and analytical accuracy are of critical importance. However, existing techniques face a dual challenge: first, sensor failures, data transmission interruptions, and human errors make missing data prevalent in forestry datasets; second, the multidimensional heterogeneity and environmental complexity of forestry systems not only increase the difficulty of missing value estimation but also significantly affect the accuracy of resolving the latent correlations among data. To solve these problems, we propose the L2 model, using a Populus tomentosa plantation as the experimental object. The L2 model consists of a completion model and a predictive model. The L2 completion model integrates low-rank tensor completion with truncated nuclear norm minimization (LRTC-TNN) to capture global consistency and local trends, and combines a long short-term memory network with a convolutional neural network (LSTM-CNN) to extract temporal and spatial features, enabling accurate reconstruction of missing values in forestry time-series data. We also optimize the LRTC-TNN model to handle multi-class data and incorporate a self-attention mechanism into the LSTM-CNN framework to improve performance under complex missing-data conditions. The L2 prediction model adopts a dual attention mechanism (a temporal attention mechanism and a feature attention mechanism) on top of LSTM to construct a stem diameter prediction model, achieving high-precision prediction of stem diameter variation. We then further analyze the effects of various factors on stem diameter using SHAP (Shapley Additive Explanations). Experimental results demonstrate that our L2 model significantly improves data completion accuracy while preserving the original structure and key characteristics of the data. Moreover, it enables a more precise analysis of the factors affecting stem diameter, providing a robust foundation for advanced forestry data analysis and informed decision making.

1. Introduction

Forestry is an important pillar of the national economy: it not only helps maintain ecosystems but also promotes long-term economic development and improves social well-being.
The integration of data into forestry resource management is crucial. Accurate and efficient management of forest resources can be achieved by collecting and analyzing a wide range of forest resource data, including forest area, species distribution, and stocking. These data enable decision makers to assess current conditions, monitor trends, and identify the potential development and utilization value of resources, leading to more scientific and sustainable resource use strategies [1].
In addition, data are indispensable in forestry monitoring. Technologies such as remote sensing and unmanned aerial vehicle inspections provide real-time insights into forest growth, pest outbreaks, and disease occurrence [2]. Comprehensive analyses of historical and current data allow for the prediction of trends in forestry resources and provide a solid basis for the development of long-term forestry development plans [3].
In this study, a Populus tomentosa forest was used as the benchmark object for experimental research. The research process is divided into two main stages: a data processing stage and a data analysis stage. The contributions of our work are as follows:
  • We propose the L2 completion model, which integrates low-rank tensor completion (LRTC) with long short-term memory (LSTM) networks to enhance the completeness and accuracy of forestry time-series data. This model effectively optimizes and completes time-series data for Populus tomentosa forest farms, significantly improving data integrity and reliability.
  • We introduce a feature attention layer and a time-series attention layer into the LSTM framework to construct our L2 stem diameter prediction model, and we use the SHAP (Shapley Additive Explanations) analysis algorithm [4] to quantitatively analyze the effects of various forestry factors on the growth and development of Populus tomentosa.
  • We conduct comprehensive experiments using the control variable method and a module stacking strategy, validating our model’s significant advantages in effectiveness and stability.

2. Related Work

Data processing optimization is of key significance in ensuring the efficiency and accuracy of forestry data management, and the accurate completion of large amounts of missing data is particularly important: extensive missing data significantly degrade the accuracy and reliability of a forestry system, weakening its value in practical applications.
In recent years, scholars have proposed a variety of solutions to the missing data problem. Kabir et al. [5] explored in depth the application of single mean imputation and multiple imputation methods for filling in missing water-flow data. Yu et al. [6] implemented a matrix-completion operation for traffic data using the Schatten p-norm. Zhu et al. [7] introduced a tensor nuclear norm method for audio signal recovery that reconstructs multi-channel missing data. Pan et al. [8] developed a multi-task parallel neural network based on the transformer.
Despite these results in several fields, research on the accurate completion of forestry data is lacking. LRTC [9] is commonly used to fill multidimensional data, while LSTM [10], as a special variant of the RNN, can effectively capture long-term dependencies in time-series data. In view of this, we propose an L2 completion method based on the improved LRTC-TNN model and an LSTM-CNN neural network, aiming to achieve accurate completion of multivariate forestry data.
In the field of forestry time-series data analysis, traditional methods based on experience and basic mathematical analysis are not precise enough; building computational models from the data and quantitatively analyzing the features within those models is an effective route to more precise analysis. The KNN algorithm [11] is a classic and effective approach. The ARIMA-LSTM hybrid moving-average model [12] is an effective stock-price forecasting algorithm that combines the strengths of the LSTM and ARIMA models in an integrated forecasting framework. Zhang et al. [13] developed a CEEMDAN-BO-BiLSTM hybrid model for monthly average temperature forecasting. Nikpour et al. [14] introduced Gelato, a model based on the Informer architecture, for multivariate air-pollution forecasting.
To analyze forestry data more accurately, we develop an LSTM-based predictive model [15] within our L2 framework and augment it with multiple attention mechanisms to accurately model stem diameter. Our method maintains high performance on complex time-series relationships and is particularly suitable for forestry application scenarios that demand high accuracy. Once the model is built, its features need to be analyzed; given the accuracy requirements and the time-series nature of the data, SHAP (SHapley Additive exPlanations) analysis [4] is employed for this in-depth analysis.

3. Method

By observing the multivariate twinned dataset of the Populus tomentosa woodland, the missing data can be categorized into two types: (i) small-range gaps that occur frequently; these entries are only weakly affected by other channels and can be filled efficiently using an improved LRTC-TNN method that captures both global and local trends; and (ii) large-range gaps that occur infrequently (category C); these entries are strongly affected by other channels and can be completed more accurately using an LSTM-CNN fusion model incorporating a self-attention mechanism capable of mining time-series features.
To address the missing data problem, a two-step preprocessing approach is employed. First, the DBSCAN clustering algorithm is utilized to identify patterns within the data and to effectively quantify and categorize the missing entries [16]. Based on the quantification rule, the data are divided into two groups: category C and the remaining data. Next, the LRTC-TNN model is constructed and optimized to complete the original data, and the output of this large-scale completion is fed to the LSTM-CNN-Attention model, which refines the category C portion to obtain the final, accurately processed data.
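The categorization step can be sketched as follows. This is a minimal illustration of our reading of the procedure: runs of consecutive missing values are extracted per channel, and DBSCAN separates common short gaps from the rare long gaps that form category C; the function names and the eps/min_samples values are illustrative, not the paper's settings.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def missing_runs(series: np.ndarray):
    """Return (start, length) for each run of consecutive NaNs in a channel."""
    runs, start = [], None
    for i, is_nan in enumerate(np.isnan(series)):
        if is_nan and start is None:
            start = i
        elif not is_nan and start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, len(series) - start))
    return runs

def category_c_gaps(series: np.ndarray, eps: float = 5.0, min_samples: int = 3):
    """Cluster gap lengths with DBSCAN; outlier runs (label -1) are category C."""
    runs = missing_runs(series)
    if not runs:
        return []
    lengths = np.array([[length] for _, length in runs], dtype=float)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(lengths)
    return [run for run, label in zip(runs, labels) if label == -1]
```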
After data refinement, to support better decision making from the forestry data, we take stem diameter as the growth index of Populus tomentosa and construct our L2 prediction model with LSTM and a dual-attention mechanism. We then introduce an interpretable analysis method based on SHAP to systematically quantify each feature's contribution to growth, thereby identifying the factors that affect Populus tomentosa growth and laying a foundation for later in-depth decision making. Figure 1 shows the pipeline of our L2 model.

3.1. Data Processing Optimization

3.1.1. Improved LRTC-TNN Model in L2

Notations. Following [17], we use $X$ to denote a matrix, $x_i \in \mathbb{R}^{n}$ to denote a vector, and $\chi \in \mathbb{R}^{n_1 \times \cdots \times n_d}$ to represent a $d$-order tensor, with $x_{i_1 \cdots i_d}$ denoting its individual elements. The Frobenius norms for a matrix $X$ and a tensor $\chi$ are given by $\| X \|_F = \sqrt{\sum_{i,j} x_{i,j}^2}$ and $\| \chi \|_F = \sqrt{\sum_{i_1, i_2, \ldots, i_d} x_{i_1, i_2, \ldots, i_d}^2}$.
For a tensor $\chi$, its mode-$k$ unfolding, denoted $\chi_{(k)}$, is a matrix in $\mathbb{R}^{n_k \times (\prod_{l \neq k} n_l)}$, where mode-$k$ corresponds to rearranging the tensor elements into a matrix along the $k$th mode. The folding operator $\operatorname{fold}_k$ reverses this process, converting a matrix back into a higher-order tensor along mode $k$. This transformation is expressed as $\operatorname{fold}_k(\chi_{(k)}) = \chi$.
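For concreteness, mode-k unfolding and folding can be implemented in a few lines of NumPy; this is a generic sketch of the operators defined above, not code from the paper.

```python
import numpy as np

def unfold(tensor: np.ndarray, k: int) -> np.ndarray:
    """Mode-k unfolding: move axis k to the front, then flatten the rest."""
    return np.moveaxis(tensor, k, 0).reshape(tensor.shape[k], -1)

def fold(matrix: np.ndarray, k: int, shape: tuple) -> np.ndarray:
    """Inverse of unfold: reshape, then move axis k back into place."""
    full_shape = (shape[k],) + tuple(s for i, s in enumerate(shape) if i != k)
    return np.moveaxis(matrix.reshape(full_shape), 0, k)

chi = np.arange(24.0).reshape(2, 3, 4)          # toy third-order tensor
assert np.allclose(fold(unfold(chi, 1), 1, chi.shape), chi)
```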
Non-convex low-rank tensor completion model. Low-rank tensor completion (LRTC) [9] is a family of tensor-completion techniques based on the low-rank assumption of partially observed input tensors, analogous to low-rank matrix completion. In the context of forestry multivariate data completion, we focus on modeling a third-order tensor that represents forestry data in a structured form of C × N × T , where C denotes the cycle period.
Given a third-order tensor $\gamma \in \mathbb{R}^{C \times N \times T}$ with partial observations, the LRTC model is formulated as:
$$\min_{\chi}\ \operatorname{rank}(\chi) \quad \text{s.t.} \quad P_{\Omega}(\chi) = P_{\Omega}(\gamma), \tag{1}$$
where $\chi \in \mathbb{R}^{C \times N \times T}$ is the tensor to be recovered, $\Omega$ represents the set of observed indices, and $\operatorname{rank}(\cdot)$ denotes the tensor rank, which extends the matrix rank concept to higher-order tensors.
The operator $P_{\Omega}: \mathbb{R}^{C \times N \times T} \to \mathbb{R}^{C \times N \times T}$ represents the orthogonal projection onto the observed set $\Omega$, while $P_{\bar{\Omega}}$ denotes the projection onto the complementary set, satisfying $P_{\Omega}(\chi) + P_{\bar{\Omega}}(\chi) = \chi$.
Solving the rank minimization problem in Equation (1) is NP-hard due to the discrete nature of the tensor rank function. To address this challenge, recent research has explored convex relaxations of the rank minimization problem. One such approach replaces the rank function with the nuclear norm [18], leading to the reformulated minimization problem shown in Equation (2).
$$\min_{\chi}\ \sum_{k=1}^{3} \alpha_k \| \chi_{(k)} \|_{*} \quad \text{s.t.} \quad P_{\Omega}(\chi) = P_{\Omega}(\gamma), \tag{2}$$
where $\alpha_k$ is a weight parameter satisfying $\alpha_k > 0$ for $k = 1, 2, 3$.
In this objective function, the nuclear norm of any matrix $X$ is defined as $\| X \|_{*} = \sum_i \sigma_i(X)$, where $\sigma_i(X)$ denotes the $i$th largest singular value of $X$.
LRTC-TNN. For the kernel-based approach, we adopt the truncated nuclear norm (TNN) method [19], which outperforms traditional techniques. Unlike conventional methods that may suffer from information loss, TNN selectively retains larger singular values while appropriately processing smaller ones.
To enhance the accuracy of forestry data completion, we incorporate TNN as a replacement for traditional approaches. To improve the generalizability of the model, we begin by preprocessing the data. Specifically, we replace missing values in γ with a constant con (set to −100 or −1 based on the overall characteristics of the forestry dataset) and normalize the values of γ to the range [ 0 , 1 ] . The transformation is formulated as follows:
$$\min(\gamma) = P_{\min}(\gamma), \quad \max(\gamma) = P_{\max}(\gamma), \quad \gamma^{(i)} = \frac{\gamma^{(i)} - \min(\gamma)}{\max(\gamma) - \min(\gamma)} \tag{3}$$
Here, $\gamma^{(i)}$ represents the value of each element in $\gamma$, while $P_{\min}(\gamma)$ and $P_{\max}(\gamma)$ denote the minimum and maximum value operations, respectively. This preprocessing step enhances the generalizability of the algorithm for multivariate forestry data.
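A minimal sketch of this preprocessing step, assuming missing entries arrive as NaNs and using illustrative function names:

```python
import numpy as np

def preprocess(gamma: np.ndarray, con: float = -100.0):
    """Min-max normalize observed entries to [0, 1]; mark missing ones with con."""
    observed = ~np.isnan(gamma)
    lo, hi = np.nanmin(gamma), np.nanmax(gamma)
    scaled = (gamma - lo) / (hi - lo)           # Equation (3)
    scaled[~observed] = con                     # flag missing entries
    return scaled, observed, (lo, hi)

def denormalize(scaled: np.ndarray, lo: float, hi: float) -> np.ndarray:
    """Map completed values back to the original scale."""
    return scaled * (hi - lo) + lo
```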
Next, we introduce the truncated nuclear norm (TNN) and replace the nuclear norm in Equation (2) with TNN, resulting in the following optimization problem:
$$\min_{\omega, \chi^{1}, \chi^{2}, \chi^{3}}\ \sum_{k=1}^{3} \alpha_k \| \chi^{k}_{(k)} \|_{r_k,*} \quad \text{s.t.} \quad \chi^{k} = \omega,\ k = 1, 2, 3; \quad P_{\Omega}(\omega) = P_{\Omega}(\gamma) \tag{4}$$
In this formulation, $\| \cdot \|_{r_k,*}$ represents the truncated nuclear norm of the tensor along mode $k$, where the truncation parameter is defined as $r_k = \lceil \theta \cdot \min\{ n_k, \prod_{l \neq k} n_l \} \rceil$, $k \in \{1, 2, 3\}$, $\lceil \cdot \rceil$ denotes the ceiling function, and $\theta$ is a universal rate parameter that controls the truncation level for the mode-$k$ tensor unfolding, ensuring $1 \le r_k \le \min\{ n_k, \prod_{l \neq k} n_l \}$. The rate parameter $\theta$ is automatically assigned based on the truncation requirements of each tensor mode.
Furthermore, $\omega$ is introduced to store the observed forestry data, which is then propagated into the tensor via $\chi^{k} = \omega$ for $k = 1, 2, 3$. In Equation (4), $\chi^{k}$ ($k = 1, 2, 3$) is associated with the TNN framework, while $\omega$ serves as a link between the reconstructed tensor and the partially observed data $\gamma$.
Algorithm solution. The alternating direction method of multipliers (ADMM) [20] is employed to solve Equation (4). ADMM reformulates the tensor completion problem into three iterative subproblems, as shown in Equation (5), following the update sequence $\chi_k^{l+1}\ (k = 1, 2, 3) \to \omega^{l+1} \to \tau_k^{l+1}$. When $\rho_1 = \rho_2 = \rho_3 = \rho$, $\tau_k^{l+1}$ is updated according to Equation (6). The complete algorithmic framework is given in Algorithm 1.
$$\begin{aligned} \chi_k^{l+1} &= \operatorname{fold}_k\left( U \operatorname{diag}(\sigma(\chi_{(k)})) V^{T} \right), \\ \sigma_i(\chi_{(k)}) &= \begin{cases} \left[ \sigma_i\left( \omega_{(k)}^{l} - \tfrac{1}{\rho_k} \tau_{k(k)}^{l} \right) - \tfrac{\alpha_k}{\rho_k} \right]_{+}, & i > r_k \\ \sigma_i\left( \omega_{(k)}^{l} - \tfrac{1}{\rho_k} \tau_{k(k)}^{l} \right), & \text{otherwise} \end{cases} \\ \omega^{l+1} &= \frac{1}{\sum_{k=1}^{3} \rho_k} \sum_{k=1}^{3} \left( \rho_k \chi_k^{l+1} + \tau_k^{l} \right), \\ \tau_k^{l+1} &:= \tau_k^{l} + \rho_k \left( \chi_k^{l+1} - \omega^{l+1} \right) \end{aligned} \tag{5}$$
$$\tilde{\tau}^{l+1} := \tilde{\tau}^{l} + \rho \left( \tilde{\chi}^{l+1} - \tilde{\omega}^{l+1} \right) \tag{6}$$
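The core of the $\chi$-update in Equation (5) is a singular-value shrinkage that leaves the top $r_k$ singular values untouched and soft-thresholds the rest. A minimal NumPy sketch of this operator, applied to a mode-k unfolding:

```python
import numpy as np

def truncated_svt(matrix: np.ndarray, r: int, tau: float) -> np.ndarray:
    """Keep the top-r singular values; soft-threshold the tail by tau."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    s[r:] = np.maximum(s[r:] - tau, 0.0)        # shrink only sigma_i with i > r
    return (U * s) @ Vt                         # rebuild the low-rank matrix
```

In the update of Equation (5), `matrix` would be the mode-k unfolding of $\omega^{l} - \tau_k^{l}/\rho_k$ and `tau` would be $\alpha_k/\rho_k$.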
Then, we adopt a dual-criterion approach based on the primal and dual residuals to determine the termination condition of the ADMM iterations [21]. Specifically, the primal residual is defined as $r^{(n)} = \| \chi^{(n)} - \omega^{(n)} \|_F$, and the dual residual as $s^{(n)} = \rho^{(n)} \| \omega^{(n)} - \omega^{(n-1)} \|_F$. The ADMM iteration is terminated when both of the following conditions are satisfied: $r^{(n)} \le \sqrt{N}\, \varepsilon^{\mathrm{pri}}$ and $s^{(n)} \le \sqrt{N}\, \varepsilon^{\mathrm{dual}}$, where $n$ denotes the iteration number, $N$ is the total number of elements in the tensor, $\varepsilon^{\mathrm{pri}}$ and $\varepsilon^{\mathrm{dual}}$ are the preset tolerance thresholds, and $\rho$ is the penalty parameter.
To enhance convergence, adaptive adjustment rules are applied to ρ . If the primal and dual residuals are imbalanced, ρ is dynamically updated according to the strategy described in Equation (7). To prevent numerical instability, an upper bound is imposed on the maximum allowable reduction in ρ .
$$\rho^{(n+1)} = \begin{cases} 2\rho^{(n)}, & \text{if } r^{(n)} > 10\, s^{(n)} \\ 0.5\rho^{(n)}, & \text{if } s^{(n)} > 10\, r^{(n)} \\ \rho^{(n)}, & \text{otherwise} \end{cases} \tag{7}$$
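A sketch of the stopping test and the adaptive penalty rule (the residual criteria and Equation (7)); the variable names are illustrative:

```python
import numpy as np

def admm_step_control(chi, omega, omega_prev, rho, eps_pri, eps_dual):
    """Return (converged, new_rho) from the primal/dual residual balance."""
    n_elem = omega.size
    r = np.linalg.norm(chi - omega)                # primal residual
    s = rho * np.linalg.norm(omega - omega_prev)   # dual residual
    converged = (r <= np.sqrt(n_elem) * eps_pri and
                 s <= np.sqrt(n_elem) * eps_dual)
    if r > 10 * s:          # primal residual lagging: increase rho
        rho *= 2.0
    elif s > 10 * r:        # dual residual lagging: decrease rho
        rho *= 0.5
    return converged, rho
```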
Algorithm 1 Ip-LRTC-TNN
Require: $\gamma \in \mathbb{R}^{C \times N \times T}$ (reshaped from $\gamma \in \mathbb{R}^{D \times T}$), $\chi \in \mathbb{R}^{C \times N \times T}$
Ensure: $con \leftarrow -100$; initialize $\alpha$, $\rho$, $\theta$, $\varepsilon$, $max\_iter$
for each attribute $y$ in $\gamma$ do
     if $y$ is missing then replace $y$ with $con$
     end if
end for
$\tilde{\chi} \in \mathbb{R}^{3 \times C \times N \times T} \leftarrow 0$
$\tilde{\tau} \in \mathbb{R}^{3 \times C \times N \times T} \leftarrow 0$
$\omega \leftarrow \gamma$
$It \leftarrow 0$
while true do
     $\rho \leftarrow \min\{\rho \times 1.05, \rho_{max}\}$
     for each $k$ in $\{1, 2, 3\}$ do update $\tilde{\chi}[k]$ from $\omega$ and $\tilde{\tau}$
     end for
     update $\omega$ from $\tilde{\chi}$ and $\tilde{\tau}$
     update $\tilde{\tau}$ from $\tilde{\chi}$ and $\omega$
     $\chi \leftarrow \operatorname{einsum}(\tilde{\chi})$
     calculate $tol$ from $\gamma$ and $\chi$
     $It \leftarrow It + 1$
     if ($tol < \varepsilon$) or ($It \ge max\_iter$) then break
     end if
end while
return $\chi$
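Putting the pieces together, a compact Python sketch of the Algorithm 1 loop, reusing the unfold/fold and truncated_svt helpers above; the hyperparameter values and the relative-change tolerance are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def ip_lrtc_tnn(gamma, alpha=(1/3, 1/3, 1/3), rho=1e-4, theta=0.1,
                eps=1e-4, max_iter=200, rho_max=1e2):
    """LRTC-TNN completion of a 3rd-order tensor with NaNs as missing entries."""
    mask = ~np.isnan(gamma)
    omega = np.where(mask, gamma, 0.0)
    chis = [omega.copy() for _ in range(3)]
    taus = [np.zeros_like(omega) for _ in range(3)]
    prev = omega.copy()
    for _ in range(max_iter):
        rho = min(rho * 1.05, rho_max)
        for k in range(3):                      # chi-updates, Equation (5)
            m = unfold(omega - taus[k] / rho, k)
            r_k = int(np.ceil(theta * min(m.shape)))
            chis[k] = fold(truncated_svt(m, r_k, alpha[k] / rho), k, omega.shape)
        omega = sum(rho * c + t for c, t in zip(chis, taus)) / (3 * rho)
        omega[mask] = gamma[mask]               # enforce observed entries
        for k in range(3):                      # tau-updates, Equation (6)
            taus[k] += rho * (chis[k] - omega)
        tol = np.linalg.norm(omega - prev) / (np.linalg.norm(prev) + 1e-12)
        if tol < eps:
            break
        prev = omega.copy()
    return omega
```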

3.1.2. LSTM-CNN-Attention in L2

To address the periodic characteristics of forestry data, we employ the LSTM-CNN model to complete the category C data and integrate a self-attention mechanism to enhance the model's ability to capture complex patterns. The overall framework of the model is shown in Figure 2.
The input layer receives multivariate matrices selected from forestry datasets. The CNN layer comprises two convolutional layers with kernel sizes of 5 × 5 and 3 × 3, respectively, employing 128 filters in the first layer and 64 in the second. This configuration—identified as the optimal hyperparameter combination through grid search [22]—achieved an average cross-validation accuracy of approximately 0.89. The choice of 5 × 5 and 3 × 3 kernels is inspired by the cascade architecture of VGGNet [23], which strategically stacks smaller convolutional filters to maintain the receptive field while reducing the number of parameters. A 2 × 2 pooling layer follows to reduce dimensionality, and a ReLU activation function is applied to enhance nonlinear feature representation [24]. Extracted features are then passed to a two-layer LSTM network, a deep-learning model designed for processing sequential data [15]. However, relying solely on LSTM may not be sufficient to capture complex dependencies, so a self-attention mechanism is introduced to dynamically assign weights, allowing the model to identify key information more effectively [25].
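A sketch of this backbone in PyTorch, with the layer sizes quoted above (5 × 5 then 3 × 3 kernels, 128 then 64 filters, 2 × 2 pooling, ReLU, two LSTM layers); the input shape, hidden size, and output head are assumptions for illustration:

```python
import torch
import torch.nn as nn

class CnnLstm(nn.Module):
    def __init__(self, n_features: int = 6, hidden: int = 64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 128, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv2d(128, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # 2x2 pooling
        )
        self.lstm = nn.LSTM(input_size=64 * (n_features // 2),
                            hidden_size=hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_features)

    def forward(self, x):                          # x: (batch, time, features)
        z = self.cnn(x.unsqueeze(1))               # -> (batch, 64, time/2, feat/2)
        z = z.permute(0, 2, 1, 3).flatten(2)       # -> (batch, time/2, channels)
        out, _ = self.lstm(z)
        return self.head(out[:, -1])               # estimate the target step
```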
In the self-attention mechanism, the input data are first linearly mapped into query ($Q$), key ($K$), and value ($V$) matrices: $q_n = W_q \alpha_n$, $k_n = W_k \alpha_n$, $v_n = W_v \alpha_n$, where $\alpha$ represents the input forestry data sequence, and $W_q$, $W_k$, $W_v$ are weight matrices. The attention weights are then computed to measure the importance of different time steps, formulated as $A = \operatorname{Softmax}\left( \frac{Q K^{T}}{\sqrt{d_k}} \right) V$, where $d_k$ denotes the mapping dimension [26]. Finally, the predicted values are obtained through a weighted summation of the value matrix. The structure of the attention mechanism is shown in Figure 3.
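A minimal single-head version of this scaled dot-product self-attention (multi-head and masking details are omitted):

```python
import torch
import torch.nn.functional as F

def self_attention(x: torch.Tensor, w_q, w_k, w_v) -> torch.Tensor:
    """x: (batch, time, dim); w_q/w_k/w_v: (dim, d_k) projection matrices."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)   # QK^T / sqrt(d_k)
    return F.softmax(scores, dim=-1) @ v                      # weighted values
```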

3.2. Forestry Data Analysis

To capture the periodicity in forestry data, we integrate temporal and feature attention mechanisms with the LSTM model in L2 for stem diameter prediction. The model structure is shown in Figure 4.
The feature attention layer transforms the multiple input features into matrix form. Let $x$ be the feature matrix of environmental and growth data for Populus tomentosa at time $t$, and $M$ represent the number of feature types [27]. The input feature vector is $x_t = [x_{t,1}, x_{t,2}, \ldots, x_{t,M}]$. The neural network calculates the feature weight vector $e_t$ as $e_t = W_e \sigma(W_e x_t + b_e)$, where $e_t = [e_{t,1}, e_{t,2}, \ldots, e_{t,M}]$ is the attention weight vector for each feature at time $t$, and $\sigma$ is the sigmoid activation function. The Softmax function normalizes the weights, yielding the feature attention weight matrix $\alpha_t$, where each $\alpha_{t,M}$ is calculated as $\alpha_{t,M} = \frac{\exp(e_{t,M})}{\sum_{i=1}^{M} \exp(e_{t,i})}$. The new weighted feature vector $y_t$ is obtained by multiplying the feature attention weights with the input vector $x_t$, and is passed to the LSTM layer for further training.
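A sketch of this feature attention layer in PyTorch; the two-linear-layer scoring network mirrors $e_t = W_e \sigma(W_e x_t + b_e)$, and the dimensions are illustrative:

```python
import torch
import torch.nn as nn

class FeatureAttention(nn.Module):
    def __init__(self, n_features: int):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(n_features, n_features), nn.Sigmoid(),
            nn.Linear(n_features, n_features),
        )

    def forward(self, x):                      # x: (batch, time, n_features)
        alpha = torch.softmax(self.score(x), dim=-1)   # feature weights alpha_t
        return alpha * x                       # weighted feature vector y_t
```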
The LSTM layer consists of two LSTM layers that process the sequential data. The temporal attention layer [28] improves the model's ability to understand the relationship between the target and historical data, enhancing prediction accuracy over time. The input to the temporal attention module is the hidden state of the LSTM at time $t$, denoted $h_t = [h_{t,1}, h_{t,2}, \ldots, h_{t,M}]$. After applying the ReLU activation function, the attention weight vector for each historical moment is computed as $l_t = \operatorname{ReLU}(\sigma(W_d h_t + b_d))$ [29]. The Softmax function normalizes the attention weights to generate the temporal attention vector $\beta_t = [\beta_{t,1}, \beta_{t,2}, \ldots, \beta_{t,n}]$, where $\beta_{t,\tau}$ is calculated as $\beta_{t,\tau} = \frac{\exp(l_{t,\tau})}{\sum_{i=1}^{n} \exp(l_{t,i})}$. The final time-related state $z_t$ is obtained by weighting the historical states with the temporal attention vector: $z_t = \beta_t h_t$. This approach captures the influence of past data on the prediction.
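A corresponding sketch of the temporal attention layer; reading $z_t = \beta_t h_t$ as a weighted sum over the historical hidden states is our assumption:

```python
import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    def __init__(self, hidden: int):
        super().__init__()
        self.proj = nn.Linear(hidden, 1)

    def forward(self, h):                      # h: (batch, time, hidden)
        l = torch.relu(self.proj(h))           # per-step scores l_t
        beta = torch.softmax(l, dim=1)         # normalize over time steps
        return (beta * h).sum(dim=1)           # time-weighted state z_t
```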

4. Experiments

All experiments were implemented in Python and executed on a high-performance dual-socket server. The system is powered by an AMD EPYC processor (2.7 GHz base frequency, up to 4.1 GHz boost, 64 cores per socket), supported by 1 TB of RAM, and an NVIDIA RTX 4070 GPU with 12 GB of VRAM.

4.1. Dataset

We use a Populus tomentosa forest as the subject of our experiment, collecting relevant data through sensors. The datasets include (i) growth-related data: trunk diameter (DIA), soil moisture (SM), and liquid flow (LF); and (ii) meteorological data: temperature (TEMP), radiation (RAD), and relative humidity (RH).
The dataset has missing rates of over 30% in places, with some series containing consecutive gaps exceeding 50%; overall, more than 15% of the dataset consists of missing entries. During training, the required dataset is constructed by selecting appropriate data based on both growth-related and meteorological parameters.

4.2. Ablation Study

Our ablation study analyzes the data processing optimization component, evaluating each module's contribution to model performance through systematic combination experiments. As shown in Table 1 and Table 2, we tested combinations of four core modules (the improved LRTC-TNN module, the convolutional neural network, the long short-term memory (LSTM) network, and the attention mechanism), using our complete L2 model as the reference. Simplified models were constructed by removing individual modules (indicated by a '0' for the missing module) to assess the impact of each component on prediction performance.
During training, we employ three key regularization strategies to enhance generalization and prevent overfitting: (i) dropout is applied with a rate of 0.3 before the CNN's fully connected layer, and an additional dropout layer with a rate of 0.5 is placed after the LSTM layer; (ii) weight decay with a coefficient of $1 \times 10^{-4}$ is incorporated into the optimizer to regularize the parameters of the fully connected layer; (iii) an early stopping mechanism is implemented, whereby training is halted if the validation loss fails to improve for three consecutive epochs.
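A sketch of how these three strategies combine in a training loop; for brevity, weight decay is applied to all parameters here rather than only to the fully connected layer, and the dropout rates live inside the model definition:

```python
import torch

def train(model, train_loader, val_loader, loss_fn, epochs=100):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
    best, stale = float("inf"), 0
    for _ in range(epochs):
        model.train()
        for x, y in train_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        model.eval()
        with torch.no_grad():
            val = sum(loss_fn(model(x), y).item() for x, y in val_loader)
        if val < best:
            best, stale = val, 0
        else:
            stale += 1
            if stale >= 3:                    # early stopping after 3 epochs
                break
```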
The experimental results show that the complete L2 model outperforms its ablation variants across most metrics (p < 0.05). The improved LRTC-TNN (+Lr) module effectively addresses small-scale missing data in the original dataset while enhancing the model's feature learning capability. The attention mechanism plays a crucial role in multidimensional feature fusion. Additionally, a moderate expansion of the convolutional layers (+2C) and an increase in the number of LSTM layers (+2Ls) enhance short-term feature extraction, further validating the effectiveness of the model architecture.
In the forestry data analysis, an ablation study was conducted on the stem diameter prediction model, which consists of a feature attention layer, an LSTM layer, and a temporal attention layer. The experiment employed a module stacking strategy, sequentially introducing each component to evaluate its impact on overall model performance. This modular design approach allows precise identification of each module’s contribution and provides a strong theoretical foundation for model optimization and enhancement.
Table 3 presents the experimental results in an intuitive format. "ALL" represents our complete L2 model, integrating all three key modules as the study's core architecture. "BL" denotes the baseline model, containing only a single LSTM layer and serving as a reference to highlight performance improvements. "+F" indicates the inclusion of the feature attention mechanism, while "+T" represents the addition of the temporal attention mechanism. This setup allows a clear observation of how each component enhances model performance.
The experiments yield the following insights: first, the feature attention mechanism significantly improves prediction accuracy by dynamically weighting key feature dimensions, achieving a performance gain of approximately 39.44%; second, among the LSTM configurations, the two-layer stacked structure offers the best balance, demonstrating superior deep feature extraction and gradient propagation stability compared with single-layer and three-layer architectures; lastly, the temporal attention layer effectively mitigates long-range dependency attenuation, further boosting time-series prediction accuracy by about 10.64%. The hybrid attention architecture, combining feature and temporal attention mechanisms, optimizes feature selection and temporal modeling in a coordinated manner, which is why this structure performs best.

4.3. Comparative Analysis

For data processing optimization, we selected models capable of time-series completion, including traditional approaches, such as linear interpolation, exponential interpolation, and the KNN method [11], as well as recent advanced methods from the past two years, including BTMF [30], BiLSTM-GRU [31], and Transformers [8]. These models were then compared with our L2 model.
The comparison results are presented in Table 4 and Table 5. The findings reveal the following. (i) Traditional interpolation methods perform significantly worse when the missing rate is high and consecutive data points are absent. This limitation arises from their inability to capture latent patterns in time-series data, making it particularly challenging to model long-term dependencies in non-stationary sequences. (ii) Modern time-series completion methods vary considerably in effectiveness across datasets; for instance, BiLSTM-GRU performs well in completing LF but struggles with DTR. (iii) Our L2 hybrid architecture consistently delivers the best overall results across all test scenarios, demonstrating stable performance and outperforming the other methods in key metrics such as RMSE and MASE.
In the forestry data analysis, we selected a range of classical statistical models and cutting-edge deep-learning models from the past two years for comparison with our L2 model. These include RNN [32], the ARIMA-LSTM hybrid model [12], BO-BiLSTM [13], and Informer [14]. To ensure a comprehensive evaluation of prediction accuracy, we again employed RMSE and MASE as the metrics.
As shown in Table 6, the comparative results reveal significant performance variations among the models in time-series forecasting, and our L2 model achieves the lowest RMSE and MASE values, indicating superior predictive performance. Our L2 model, specifically designed for stem diameter prediction, demonstrates a clear advantage in capturing complex temporal patterns, outperforming the alternative approaches.

4.4. SHAP Interpretability

Building on the successful development of our L2 prediction model, we employ SHAP interpretability analysis based on game theory to quantitatively assess feature importance [4]. By computing the Shapley value for each feature in the prediction process, the model's output is equitably attributed to its input features, treating each feature as a player in the game, with the predicted value representing the total payoff. SHAP effectively measures feature contributions by evaluating their marginal effect across all possible feature combinations. The analysis results are illustrated in Figure 5.
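As an illustration, per-feature importance can be computed with the shap library roughly as follows; `model`, `background`, `samples`, and the feature-name list are placeholders, and the averaging over samples and time steps is our assumption:

```python
import numpy as np
import shap

explainer = shap.GradientExplainer(model, background)   # background: tensor batch
sv = np.asarray(explainer.shap_values(samples))

# Mean |SHAP| per input feature, averaged over all leading axes
importance = np.abs(sv).mean(axis=tuple(range(sv.ndim - 1)))
for name, score in sorted(zip(["SM", "RH", "TEMP", "RAD", "LF"], importance),
                          key=lambda item: -item[1]):
    print(f"{name}: {score:.4f}")
```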
According to the absolute value distribution of feature contributions, soil moisture plays the most crucial role in stem diameter prediction, while relative humidity and precipitation exert a secondary influence. In contrast, wind speed has a relatively minor impact. This quantitative analysis highlights soil moisture dynamics as the primary driver of forest stem growth, suggesting that forestry managers should prioritize precise management strategies based on soil moisture monitoring.

5. Conclusions

Forestry datasets often face severe challenges due to frequent sensor failures, leading to significant data gaps. Simultaneously, an in-depth understanding of tree growth requires a more rigorous and comprehensive approach to accurately identify key influencing factors. To address these challenges, this work focuses on effectively completing missing forestry time-series data and systematically analyzing critical growth-related features through a novel modeling framework.
Our L2 model is specifically designed for forestry time-series data and achieves robust, precise data recovery under complex missing conditions. It integrates the strengths of LRTC-TNN in preserving global consistency and local trends with the powerful temporal feature extraction capabilities of the LSTM-CNN architecture. By embedding a self-attention mechanism within the LSTM-CNN framework, our model further enhances its ability to capture and reconstruct intricate missing patterns in forestry data.
Building upon this, we develop a trunk diameter prediction model that incorporates our L2 method alongside a dual-attention mechanism, leveraging key Populus tomentosa growth factors. This model not only capitalizes on LSTM's ability to process sequential data but also strengthens feature selection and temporal modeling through the attention mechanism. Furthermore, SHAP analysis provides valuable insights into the relative importance of different environmental and meteorological factors in poplar growth.
Comprehensive ablation studies and comparative experiments validate the effectiveness and stability of our L2 model. Compared with other models, ours significantly improves completion and prediction accuracies, with greater stability across all types of forestry data, proving its superiority in forestry time-series completion and prediction. These findings offer a solid foundation for improving forestry data analysis and provide theoretical support for precision forest management strategies.

Author Contributions

Conceptualization, L.J. and M.Y.; methodology, L.J., M.Y. and W.M.; software, L.J.; validation, L.J., W.M. and J.D.; formal analysis, L.J.; investigation, M.Y.; resources, B.X.; data curation, J.D.; writing—original draft preparation, L.J.; writing—review and editing, W.M.; visualization, M.Y.; supervision, W.M.; project administration, B.X.; funding acquisition, B.X. and W.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Nos. 32271983, 62376271, U22B2034, 62262043, 62172416 and 62365014), the National Key Research and Development Program of China (Grant No. 2021YFD2201203), Beijing Natural Science Foundation L241056, and Shenzhen S&T programme (No. CJGJZD20240729141906008).

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Qiu, H.; Zhang, H.; Lei, K.; Zhang, H.; Hu, X. Forest digital twin: A new tool for forest management practices based on Spatio-Temporal Data, 3D simulation Engine, and intelligent interactive environment. Comput. Electron. Agric. 2023, 215, 108416. [Google Scholar] [CrossRef]
  2. Buonocore, L.; Yates, J.; Valentini, R. A Proposal for a Forest Digital Twin Framework and Its Perspectives. Forests 2022, 13, 498. [Google Scholar] [CrossRef]
  3. Li, W.; Yang, M.; Xi, B.; Huang, Q. Framework of Virtual Plantation Forest Modeling and Data Analysis for Digital Twin. Forests 2023, 14, 683. [Google Scholar] [CrossRef]
  4. Wang, X.; Ke, Z.; Liu, W.; Zhang, P.; Cui, S.A.; Zhao, N.; He, W. Compressive Strength Prediction of Basalt Fiber Reinforced Concrete Based on Interpretive Machine Learning Using SHAP Analysis. Iran. J. Sci. Technol. Trans. Civ. Eng. 2024, 49, 2461–2480. [Google Scholar] [CrossRef]
  5. Kabir, G.; Tesfamariam, S.; Hemsing, J.; Sadiq, R. Handling incomplete and missing data in water network database using imputation methods. Sustain. Resilient Infrastruct. 2020, 5, 365–377. [Google Scholar] [CrossRef]
  6. Yu, J.; Stettler, M.E.; Angeloudis, P.; Hu, S.; Chen, X.M. Urban network-wide traffic speed estimation with massive ride-sourcing GPS traces. Transp. Res. Part C Emerg. Technol. 2020, 112, 136–152. [Google Scholar] [CrossRef]
  7. Zhu, L.; Yang, L. Audio completion method based on tensor analysis. J. Phys. Conf. Ser. 2024, 2849, 012097. [Google Scholar] [CrossRef]
  8. Pan, J.; Zhong, S.; Yue, T.; Yin, Y.; Tang, Y. Multi-Task Foreground-Aware Network with Depth Completion for Enhanced RGB-D Fusion Object Detection Based on Transformer. Sensors 2024, 24, 2374. [Google Scholar] [CrossRef]
  9. Liao, T.; Wu, Z.; Chen, C.; Zheng, Z.; Zhang, X. Tensor completion via convolutional sparse coding with small samples-based training. Pattern Recognit. 2023, 141, 109624. [Google Scholar] [CrossRef]
  10. Cheng, F.; Peng, L.; Zhu, H.; Zhou, C.; Dai, Y.; Peng, T. A Defect Data Compensation Model for Infrared Thermal Imaging Based on Bi-LSTM with Attention Mechanism. JOM 2024, 76, 3028–3038. [Google Scholar] [CrossRef]
  11. Zhou, X.; Wang, H.; Xu, C.; Peng, L.; Xu, F.; Lian, L.; Deng, G.; Ji, S.; Hu, M.; Zhu, H.; et al. Application of kNN and SVM to predict the prognosis of advanced schistosomiasis. Parasitol. Res. 2022, 121, 2457–2460. [Google Scholar] [CrossRef] [PubMed]
  12. Saleti, S.; Panchumarthi, L.Y.; Kallam, Y.R.; Parchuri, L.; Jitte, S. Enhancing Forecasting Accuracy with a Moving Average-Integrated Hybrid ARIMA-LSTM Model. SN Comput. Sci. 2024, 5, 704. [Google Scholar] [CrossRef]
  13. Zhang, X.; Ren, H.; Liu, J.; Zhang, Y.; Cheng, W. A monthly temperature prediction based on the CEEMDAN–BO–BiLSTM coupled model. Sci. Rep. 2024, 14, 808. [Google Scholar] [CrossRef] [PubMed]
  14. Nikpour, P.; Shafiei, M.; Khatibi, V. Gelato: A new hybrid deep learning-based Informer model for multivariate air pollution prediction. Environ. Sci. Pollut. Res. 2024, 31, 29870–29885. [Google Scholar] [CrossRef]
  15. Tang, Y.; Yu, F.; Pedrycz, W.; Li, F.; Ouyang, C. Oriented to a multi-learning mode: Establishing trend-fuzzy-granule-based LSTM neural networks for time series forecasting. Appl. Soft Comput. 2024, 166, 112195. [Google Scholar] [CrossRef]
  16. Yang, W.; Zhang, Z.; Zhao, Y.; Gu, Y.; Huang, L.; Zhao, J. CABGSI: An efficient clustering algorithm based on structural information of graphs. J. Radiat. Res. Appl. Sci. 2024, 17, 101040. [Google Scholar] [CrossRef]
  17. Liu, J.; Musialski, P.; Wonka, P.; Ye, J. Tensor Completion for Estimating Missing Values in Visual Data. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 208–220. [Google Scholar] [CrossRef]
  18. Chen, X.; Yang, J.; Sun, L. A nonconvex low-rank tensor completion model for spatiotemporal traffic data imputation. Transp. Res. Part C 2020, 117, 102673. [Google Scholar] [CrossRef]
  19. Pandey, S. An Improved Analysis of TNN-Based Managed and Unverified Learning Approaches for Optimum Threshold Resolve. SN Comput. Sci. 2023, 4, 719. [Google Scholar] [CrossRef]
  20. Hien, L.T.K.; Papadimitriou, D. An inertial ADMM for a class of nonconvex composite optimization with nonlinear coupling constraints. J. Glob. Optim. 2024, 89, 927–948. [Google Scholar] [CrossRef]
  21. Jin, Z.F.; Fan, Y.; Shang, Y.; Ding, W. A dual symmetric Gauss-Seidel technique-based proximal ADMM for robust fused lasso estimation. Numer. Algorithms 2024, 98, 1337–1360. [Google Scholar] [CrossRef]
  22. Gholizadeh, M.; Saeedi, R.; Bagheri, A.; Paeezi, M. Machine learning-based prediction of effluent total suspended solids in a wastewater treatment plant using different feature selection approaches: A comparative study. Environ. Res. 2024, 246, 118146. [Google Scholar] [CrossRef] [PubMed]
  23. Linck, I.; Gómez, A.T.; Alaghband, G. SVG-CNN: A shallow CNN based on VGGNet applied to intra prediction partition block in HEVC. Multimed. Tools Appl. 2024, 83, 73983–74001. [Google Scholar] [CrossRef]
  24. Chagnon, J.; Hagenbuchner, M.; Tsoi, A.C.; Scarselli, F. On the effects of recursive convolutional layers in convolutional neural networks. Neurocomputing 2024, 591, 127767. [Google Scholar] [CrossRef]
  25. Wang, Y.; Zhang, Z.; Wang, Y.; You, L.; Wei, G. Modeling and structural optimization design of switched reluctance motor based on fusing attention mechanism and CNN-BiLSTM. Alex. Eng. J. 2023, 80, 229–240. [Google Scholar] [CrossRef]
  26. Zhang, S.; Xie, L. Advancing neural network calibration: The role of gradient decay in large-margin Softmax optimization. Neural Netw. 2024, 178, 106457. [Google Scholar] [CrossRef]
  27. Mao, Y.; Cheng, Y.; Shi, C. A Job Recommendation Method Based on Attention Layer Scoring Characteristics and Tensor Decomposition. Appl. Sci. 2023, 13, 9464. [Google Scholar] [CrossRef]
  28. Tang, Y.; Wang, Y.; Liu, C.; Yuan, X.; Wang, K.; Yang, C. Semi-supervised LSTM with historical feature fusion attention for temporal sequence dynamic modeling in industrial processes. Eng. Appl. Artif. Intell. 2023, 117, 105547. [Google Scholar] [CrossRef]
  29. Ramadevi, B.; Kasi, V.R.; Bingi, K. Fractional ordering of activation functions for neural networks: A case study on Texas wind turbine. Eng. Appl. Artif. Intell. 2024, 127, 107308. [Google Scholar] [CrossRef]
  30. Liu, H.; Li, L. Missing Data Imputation in GNSS Monitoring Time Series Using Temporal and Spatial Hankel Matrix Factorization. Remote Sens. 2022, 14, 1500. [Google Scholar] [CrossRef]
  31. Çelebi, M.; Öztürk, S.; Kaplan, K. An emotion recognition method based on EWT-3D–CNN–BiLSTM-GRU-AT model. Comput. Biol. Med. 2024, 169, 107954. [Google Scholar] [CrossRef] [PubMed]
  32. Xiaohui, H.; Yuan, J.; Jie, T. MAPredRNN: Multi-attention predictive RNN for traffic flow prediction by dynamic spatio-temporal data fusion. Appl. Intell. 2023, 53, 19372–19383. [Google Scholar]
Figure 1. The pipeline of our L2 model. Accurate completion is carried out by the L2 completion model; an L2 stem diameter prediction model is then constructed, and the factors affecting the growth of Populus tomentosa are obtained and analyzed through SHAP analysis.
Figure 2. The framework of the LSTM-CNN and attention modules in our L2 model. The input layer applies a shape transformation based on a time step of 72; the convolutional layers use 128 and 64 filters with a pooling size of 2; and the LSTM block uses two LSTM layers.
Figure 3. Attention mechanism. "seq" denotes the input fixed-length sequence divided from the dataset; "linear" denotes linear mapping; "at" denotes the similarity value obtained by the dot product of q with the keys k at other positions for the sequence x_i (i between 0 and n); "st" denotes the attention weight obtained via softmax; "wt" denotes the final weighted sum with v that yields the prediction; and "b" denotes the prediction sequence that integrates all predicted values.
Figure 4. The operation of the LSTM and dual attention mechanism in our L2 model. x is the stem diameter feature matrix; α is the computed feature attention weight matrix; • is the weighting operation; h denotes the feature vector layer of each backward LSTM layer; h_l denotes the hidden layer obtained by the LSTM; β is the vector of attention weights at each moment after activation and normalization; and z denotes the temporal information state obtained by weighting β with h_l.
Figure 5. SHAP analysis shows that soil moisture contributed more than 16%, more than half of the other contributions, while wind speed contributed the least.
Table 1. The ablation study on RMSE shows that our L2 model achieves the best performance across most indices. '&' indicates module division. '+2L+A' includes 2 LSTMs and Attention, while '+0C' to '+3C' represent 0–3 convolutional layers. Additionally, '+1L' and '+3L' test 1 and 3 LSTMs (0 LSTMs not tested), '+0Lr' in '+2C+A+2Ls' removes the LRTC-TNN module, and '+0A' in '+2C+2L' removes Attention.

| Method | DIA (µm)↓ | SM (%)↓ | DTR (°C)↓ | LF (cm/s)↓ | TEMP (°C)↓ | RAD (W/m²)↓ | RH (%)↓ |
|---|---|---|---|---|---|---|---|
| +0Lr (+2C+A+2Ls) | 428.894504 | 1.317698 | 26.168093 | 0.015324 | 18.112322 | 431.690125 | 52.421295 |
| +0C (+Lr&+2L+A) | 359.140364 | 0.728776 | 17.662264 | 0.007894 | 6.112322 | 313.118481 | 27.512348 |
| +1C (+Lr&+2L+A) | 349.417584 | 0.451570 | 6.662314 | 0.004892 | 5.636185 | 264.615021 | 10.757118 |
| +3C (+Lr&+2L+A) | 354.854314 | 0.554539 | 12.530076 | 0.004969 | 6.077727 | 223.376731 | 11.173975 |
| +1L (+Lr&+2C+A) | 304.816727 | 0.602778 | 12.806879 | 0.005034 | 6.016144 | 231.242948 | 11.106624 |
| +3L (+Lr&+2C+A) | 302.975994 | 0.578611 | 3.980785 | 0.004998 | 5.974214 | 195.353134 | 10.067917 |
| +0A (+Lr&+2C+2L) | 502.550728 | 0.587159 | 21.142221 | 0.004866 | 6.045146 | 229.411539 | 14.266718 |
| L2 (+Lr&+2C+A+2Ls) | 283.448347 | 0.365627 | 3.010184 | 0.004157 | 3.734599 | 205.375285 | 7.712643 |
Table 2. The ablation study on MASE similarly shows that our L2 model achieves the best performance across most indices.

| Method | DIA↓ | SM↓ | DTR↓ | LF↓ | TEMP↓ | RAD↓ | RH↓ |
|---|---|---|---|---|---|---|---|
| +0Lr (+2C+A+2Ls) | 0.051337 | 0.771263 | 0.232126 | 0.002175 | 1.625194 | 3.375217 | 2.284953 |
| +0C (+Lr&+2L+A) | 0.051337 | 0.405597 | 0.197705 | 0.001496 | 0.449549 | 1.730600 | 0.729766 |
| +1C (+Lr&+2L+A) | 0.036108 | 0.021362 | 0.022602 | 0.000549 | 0.254312 | 0.842831 | 0.298929 |
| +3C (+Lr&+2L+A) | 0.038955 | 0.025341 | 0.023183 | 0.000678 | 0.278713 | 0.375205 | 0.285411 |
| +1L (+Lr&+2C+A) | 0.031656 | 0.025969 | 0.503459 | 0.000881 | 0.304089 | 0.421864 | 0.382649 |
| +3L (+Lr&+2C+A) | 0.032828 | 0.024768 | 0.022896 | 0.000648 | 0.270819 | 0.328273 | 0.285227 |
| +0A (+Lr&+2C+2L) | 0.047601 | 0.025075 | 0.024332 | 0.000817 | 0.296214 | 0.480572 | 0.409644 |
| L2 (+Lr&+2C+A+2Ls) | 0.032449 | 0.017634 | 0.021184 | 0.000488 | 0.172258 | 0.354213 | 0.109663 |
Table 3. The ablation study on RMSE and MASE shows that our L2 model achieves the best performance. The baseline model (BL) represents a single-layer LSTM architecture. '+F' denotes the feature attention layer, '+T' denotes the temporal attention layer, and 'ALL' denotes the L2 model. In addition, '+2Ls' was added to test the optimal number of LSTM layers.

| Method | BL | BL+F | BL+F+T | BL+F+T+2Ls | ALL (BL+F+T+Ls) |
|---|---|---|---|---|---|
| RMSE | 329.918288 | 208.372894 | 181.225549 | 172.334553 | 167.373654 |
| MASE | 0.019750 | 0.011900 | 0.010633 | 0.010056 | 0.009315 |
Table 4. Comparative experiments on RMSE show that our L2 model achieves the best performance and stability in most cases. Rows denote forestry data types, and columns denote the RMSE for each model: our L2 model, linear interpolation (Lerp), exponential interpolation (ExpInterp), KNN [11], BTMF [30], BiLSTM-GRU [31], and Transformers [8].

| Data | Ours | Lerp | ExpInterp | KNN | BTMF | BiLSTM-GRU | Transformers |
|---|---|---|---|---|---|---|---|
| DIA↓ | 283.448347 | 392.970190 | 392.968706 | 460.784986 | 344.282280 | 332.356781 | 335.611604 |
| SM↓ | 0.365627 | 0.521159 | 0.521178 | 0.481712 | 0.451570 | 0.373310 | 0.420748 |
| DTR↓ | 3.010184 | 5.280655 | 5.717643 | 4.246914 | 9.532788 | 14.945592 | 8.217452 |
| LF↓ | 0.415709 | 0.510056 | 0.508245 | 0.7879 | 0.501073 | 0.406184 | 0.496458 |
| TEMP↓ | 3.734599 | 5.311704 | 5.307645 | 4.998622 | 5.636185 | 4.711132 | 9.337577 |
| RAD↓ | 205.375285 | 253.323734 | 254.571491 | 254.269126 | 233.928561 | 232.286006 | 242.198772 |
| RH↓ | 7.712643 | 7.803939 | 7.827227 | 9.337577 | 8.215242 | 8.114660 | 9.337577 |
Table 5. Comparative experiments on MASE also show that our L2 model achieves the best performance and stability in most cases.

| Data | Ours | Lerp | ExpInterp | KNN | BTMF | BiLSTM-GRU | Transformers |
|---|---|---|---|---|---|---|---|
| DIA↓ | 0.032449 | 0.042523 | 0.042525 | 0.049802 | 0.039562 | 0.036900 | 0.035601 |
| SM↓ | 0.017634 | 0.030785 | 0.030819 | 0.0270778 | 0.020389 | 0.017676 | 0.019759 |
| DTR↓ | 0.021184 | 0.026343 | 0.025908 | 0.040273 | 0.023344 | 0.020025 | 0.023918 |
| LF↓ | 0.000488 | 0.000504 | 0.000551 | 0.000492 | 0.000962 | 0.000783 | 0.000851 |
| TEMP↓ | 0.172258 | 0.209752 | 0.196836 | 0.221353 | 0.254312 | 0.2011534 | 0.229836 |
| RAD↓ | 0.304213 | 0.342371 | 0.341824 | 0.356714 | 0.323918 | 0.324967 | 0.343871 |
| RH↓ | 0.109663 | 0.109925 | 0.111515 | 0.110851 | 0.123816 | 0.121719 | 0.110742 |
Table 6. The comparative experiment on RMSE and MASE shows that our L2 model predicts best. Columns denote the prediction models, and rows denote the RMSE and MASE of each model.

| Method | Ours | RNN [32] | ARIMA-LSTM Hybrid [12] | BO-BiLSTM [13] | Informer [14] |
|---|---|---|---|---|---|
| RMSE | 167.373654 | 397.583270 | 223.580558 | 191.700792 | 195.645380 |
| MASE | 0.009315 | 0.0185713 | 0.0122556 | 0.010274 | 0.010690 |

