Article

Prediction Method of Canopy Temperature for Potted Winter Jujube in Controlled Environments Based on a Fusion Model of LSTM–RF

1
College of Horticulture, North West Agriculture and Forestry University, Xianyang 712100, China
2
Key Laboratory of Horticultural Engineering in Northwest Facilities, Ministry of Agriculture, Xianyang 712100, China
3
Northwest A&F University & Xi’an Jiaotong University Agricultural Equipment Research Institute, Xianyang 712100, China
*
Author to whom correspondence should be addressed.
Horticulturae 2026, 12(1), 84; https://doi.org/10.3390/horticulturae12010084
Submission received: 9 December 2025 / Revised: 7 January 2026 / Accepted: 9 January 2026 / Published: 12 January 2026
(This article belongs to the Section Fruit Production Systems)

Abstract

The canopy temperature of winter jujube serves as a direct indicator of plant water status and transpiration efficiency, making its accurate prediction a critical prerequisite for effective water management and optimized growth conditions in greenhouse environments. This study developed a data-driven model to forecast canopy temperature. The model serially integrates a Long Short-Term Memory (LSTM) network and a Random Forest (RF) algorithm, leveraging their complementary strengths in capturing temporal dependencies and robust nonlinear fitting. A three-stage framework comprising temporal feature extraction, multi-source feature fusion, and direct prediction was implemented to enable reliable nowcasting. Data acquisition and preprocessing were tailored to the greenhouse environment, involving multi-sensor data and thermal imagery processed with Robust Principal Component Analysis (RPCA) for dimensionality reduction. Key environmental variables were selected through Spearman correlation analysis. Experimental results demonstrated that the proposed LSTM–RF model achieved superior performance, with a determination coefficient (R2) of 0.974, mean absolute error (MAE) of 0.844 °C, and root mean square error (RMSE) of 1.155 °C, outperforming benchmark models including standalone LSTM, RF, Transformer, and TimesNet. SHAP (SHapley Additive exPlanations)-based interpretability analysis further quantified the influence of key factors, including the “thermodynamic state of air” driver group and latent temporal features, offering actionable insights for irrigation management. The model establishes a reliable, interpretable foundation for real-time water stress monitoring and precision irrigation control in protected winter jujube production systems.

1. Introduction

Winter jujube (Ziziphus jujuba Mill. ‘Dongzao’), the most widely cultivated fresh jujube variety, is very popular among consumers because of its thin skin, crispy flesh, and sweet flavor [1]. Demand for winter jujube, a high-value horticultural product, has increased steadily in both domestic and international markets [2]. Protected cultivation has gradually become the dominant production mode for winter jujube; this cultivation strategy not only ensures yield stability and improves fruit quality but also enhances the market competitiveness of winter jujube. Nevertheless, the achievement of stable, high-quality production under protected cultivation is highly dependent on scientific irrigation management, in which canopy temperature serves as a pivotal indicator for guiding precise irrigation practices.
Specifically, canopy temperature functions as a vital indicator of plant water status and physiological activity, with direct consequences for crop development [3], yield, and product quality [4,5,6,7]. Effective irrigation management requires real-time canopy temperature monitoring to identify water stress [8]. However, the enclosed greenhouse environment exhibits high humidity, limited air circulation, and dynamically coupled environmental factors that generate complex nonlinear interactions governing canopy temperature. These interactions establish strong temporal dependencies among variables, rendering real-time canopy temperature prediction a persistent challenge to implementing proactive stress mitigation and precision irrigation in controlled horticultural production.
Currently, machine learning and deep learning techniques have been extensively applied to model and predict crop growth environments [9]. Early attempts to model canopy temperature often relied on conventional machine learning or static neural networks using limited input features, such as basic meteorological data [10,11,12,13]. Despite their utility, such models often struggle to capture the complex temporal dynamics and feature interactions characteristic of greenhouse environments [14], resulting in limited accuracy under dynamic conditions.
Recognizing the temporal nature of the problem, subsequent studies have employed advanced deep learning models capable of capturing long-term dependencies, such as LSTM, GRU, and their variants [15,16,17,18,19,20,21]. These models have shown success in related agricultural prediction tasks (e.g., canopy area temperature [16], yield [18], greenhouse air temperature [20], and CO2 [21]). However, their direct application to predicting canopy temperature in facilities like winter jujube greenhouses is not straightforward. A critical gap exists: these models either do not fully account for the unique, strong coupling between multi-scale environmental variables (e.g., air temperature, humidity, radiation, and soil moisture) specific to protected cultivation, or they lack the design to extract fine-grained temporal features from the short-term, rapid fluctuations characteristic of canopy temperature. Consequently, their generalization capability and prediction accuracy for this specific task are often suboptimal.
To bridge this gap, this study focuses on achieving high-precision prediction of canopy temperature for facility-cultivated winter jujube. We posit that an effective model must simultaneously excel at learning complex temporal patterns and understanding the relative importance of heterogeneous input features. Therefore, we propose a novel integrated forecasting framework that synergistically combines a Long Short-Term Memory (LSTM) network and a Random Forest (RF) model. The LSTM component is designed to capture temporal dependencies and dynamic patterns in sequential data, while the RF component complements it by robustly handling heterogeneous multi-source inputs and evaluating feature importance, thereby enhancing model interpretability and robustness against overfitting. The key innovations of this study are summarized in three aspects:
(1)
We explore the integration of RF with LSTM for greenhouse crop canopy temperature prediction, a combination that has not been widely applied in this context;
(2)
A comprehensive comparison and analysis of multiple deep learning algorithms, demonstrating the superior predictive performance of the proposed model;
(3)
The identification and evaluation of key features influencing canopy temperature dynamics in protected winter jujube cultivation, providing a novel approach for managing water stress and optimizing irrigation strategies.

2. Materials and Methods

As shown in Figure 1, the main processes of this study include data collection, data preprocessing, model construction, model training, hyperparameter optimization, and interpretation.

2.1. Experimental Data

The experimental site is located at Rougu Nongyuan Agricultural Park in Yangling District, Xianyang City, Shaanxi Province. This region has a temperate continental monsoon climate with distinct seasonal changes, as reported by the Environmental Meteorological Data Service Platform. As shown in Figure 2, the studied greenhouse is a large-span plastic structure oriented east–west. The trees were grown in cylindrical, prefabricated pots made of PET-polyethylene composite material, each with a volume of 0.1413 m3, a diameter of 60 cm, and a height of 50 cm, and were irrigated on schedules based on growth stage. Specifically, irrigation management followed the water demand characteristics of winter jujube, with a stage-specific protocol setting lower field capacity thresholds at 60% during shoot elongation and leaf expansion, and at 75% during both the flowering/fruit setting and fruiting stages, while maintaining a uniform upper threshold of 90% throughout the growth cycle.
Thermal infrared images of potted winter jujube canopies were automatically collected using an FLIR infrared thermal camera (FLIR AX8, FLIR Systems Inc., Wilsonville, OR, USA). This camera employs an uncooled microbolometer detector to measure long-wave radiation within the 7.5–13 μm spectral range and convert it into dynamic temperature data, with a thermal accuracy of ±2 °C or ±2% of the reading and a resolution of 320 × 240 pixels. The emissivity (ε) setting was fixed at 0.97 for all measurements, based on established values for plant leaves [22,23]. Images were captured at 30 min intervals from a viewing angle of 30° above the canopy top and at a fixed distance of 1.2 m from the canopy, ensuring full coverage and consistent viewing geometry. The camera was permanently mounted inside the greenhouse. This fixed setup, combined with the controlled greenhouse environment, which buffered against external wind and sky radiation fluctuations, minimized variations in atmospheric attenuation and background thermal conditions, thereby enhancing the temporal consistency of the measurements.
To ensure spatial consistency of the measurement area over time, a region of interest (ROI) was carefully defined for analysis. The fixed camera setup captured the entire canopy area in both visible-light and thermal images simultaneously. This enabled the use of visible-light images for precise boundary identification. For each major growth stage, the ROI encompassing sun-exposed leaves was delineated by referencing the co-registered visible-light image of that specific period. This approach ensured that the selected ROI accurately represented the active, sun-exposed portion of the canopy despite changes in leaf density and canopy size over the growth cycle. The fixed camera position guaranteed that the spatial context remained stable, allowing for reliable comparison of these defined ROIs across different time points. The acquired images were transmitted in real time and stored via a data cable.
Temperature values were extracted from upper, sun-exposed leaves within the canopy images. As temperature data averaged over a canopy region or measurement area are more representative than measurements from individual leaves [24], an effort was made to select as many temperature data points as possible within the effective region. As shown in Figure 3, ROIs containing individual leaves or leaf-covered areas were delineated either by manual comparison with visible-light images or through automated selection of the central portion of the thermal images, which is predominantly covered by foliage [22,25,26].
To process the spatially distributed data within each ROI and extract a robust, representative temperature value, this study proposes a data dimensionality reduction method that employs an integrated two-step procedure based on a Gaussian spatial kernel function [27] and Robust Principal Component Analysis (RPCA) [28].
Step 1: Denoising via Gaussian–RPCA. The raw canopy temperature field matrix (T) within the delineated ROI was decomposed into a low-rank component (L) and a sparse component (S) by solving the optimization problem defined in Equation (1). This convex optimization problem was solved using the Alternating Direction Method of Multipliers (ADMM) [29]. Key parameters were set as follows: the regularization parameter λ was set to 0.1 (Figure S1), a value determined via cross-validation; the Gaussian spatial kernel parameter σ was defined as σ = r/2, where r is the equivalent radius (in pixels) of the ROI, calculated from its spatial dimensions; the kernel was centered on the canopy’s centroid (μ). This process effectively separated the stable background thermal signal (L) from transient noise and outliers (S), such as sensor artifacts or sporadic reflections.
$$\min_{L,S}\ \|L\|_* + \lambda \|S\|_1 + \sum_{(i,j)\in\Omega} G_{i,j}\,\|T_{ij} - L_{ij} - S_{ij}\|_F^2 \tag{1}$$
where $T$ is the original canopy temperature field matrix derived from thermal imaging data, with dimensions corresponding to the spatial resolution of the image pixels; $T_{ij}$ is the temperature value at pixel location $(i, j)$; $L$ is the low-rank matrix representing the stable, slowly varying background component of the temperature field, such as the overall thermal distribution of the canopy; $S$ is a sparse matrix capturing outliers, anomalies, or noise in the temperature field, such as localized hot spots, cold spots, or sensor artifacts; and $\Omega$ is the set of all valid pixel coordinates $(i, j)$ within the canopy region of interest (ROI), used to define the spatial domain of the temperature field. Norms and regularization terms: $\|L\|_*$ is the nuclear norm of matrix $L$, defined as the sum of its singular values; this norm is used as a convex surrogate to promote a low-rank structure in $L$. $\|S\|_1$ is the $\ell_1$-norm of matrix $S$, defined as the sum of the absolute values of all its elements; this norm encourages sparsity in $S$. $\lambda$ is a regularization parameter that balances the trade-off between the low-rank component $L$ and the sparse component $S$ in the decomposition. The spatial kernel function $G_{i,j} = \exp\left(-\frac{\|(i,j) - \mu\|^2}{2\sigma^2}\right)$ is radially weighted with the canopy centroid $\mu$ as the origin, and its width is determined by the equivalent radius $r$; in this way, a low-rank representation $L$ of the temperature field is obtained.
Step 2: Target extraction via sampling averaging. Canopy temperature is directly derived from the denoised low-rank matrix (L). The Canopy Temperature Index (CTI) defined in Equation (2) is essentially the spatial average of L over the ROI. To balance computational efficiency and resistance to local anomalies, we use uniform sampling of 100 points followed by averaging as an engineering estimate of CTI [23]. This operation provides a computationally efficient, approximately unbiased estimate of the full-pixel average and serves as a robust estimator of CTI.
$$CTI = \frac{1}{|\Omega|}\sum_{(i,j)\in\Omega} L_{i,j} \tag{2}$$
where $L_{i,j}$ is the value of the low-rank matrix $L$ at position $(i, j)$ and represents the temperature after denoising, and $|\Omega|$ is the total number of pixels in the region. The spatial coordinate attributes are stripped away while retaining the thermal radiation principal component.
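The kernel weighting and the averaging step can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the denoised field `L` is replaced by a constant placeholder, the ROI is assumed to be circular, and the function names `gaussian_kernel` and `canopy_temperature_index` are hypothetical.

```python
import numpy as np

def gaussian_kernel(shape, centroid, radius):
    """G[i, j] = exp(-||(i, j) - mu||^2 / (2 sigma^2)) with sigma = radius / 2."""
    sigma = radius / 2.0
    rows, cols = np.indices(shape)
    dist2 = (rows - centroid[0]) ** 2 + (cols - centroid[1]) ** 2
    return np.exp(-dist2 / (2.0 * sigma ** 2))

def canopy_temperature_index(L, roi_mask, n_samples=None, seed=0):
    """Mean of the denoised field over the ROI, optionally from a uniform sample of ROI pixels."""
    values = L[roi_mask]
    if n_samples is not None and n_samples < values.size:
        rng = np.random.default_rng(seed)
        values = rng.choice(values, size=n_samples, replace=False)
    return float(values.mean())

# Placeholder denoised field at the camera resolution (240 rows x 320 columns), in degrees C.
L = np.full((240, 320), 25.0)
centroid, radius = (120, 160), 60            # assumed circular ROI, in pixels
rows, cols = np.indices(L.shape)
roi = (rows - centroid[0]) ** 2 + (cols - centroid[1]) ** 2 <= radius ** 2

G = gaussian_kernel(L.shape, centroid, radius)
cti_full = canopy_temperature_index(L, roi)                    # full-pixel average
cti_sampled = canopy_temperature_index(L, roi, n_samples=100)  # 100-point estimate
```

With σ = r/2, the kernel weight falls from 1 at the centroid to exp(−2) at the ROI edge, so pixels near the canopy center dominate the data-fidelity term.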
This allowed continuous monitoring of canopy temperature for potted winter jujube in the greenhouse from the sprouting and leaf expansion stage through to fruit maturity. Figure 4 presents the collected temperature data (due to the long fruiting stage, only part of the data is shown). As shown in the figure, the temperature data exhibit clear periodic characteristics, displaying repetitive patterns over time. This temporal periodicity makes the data suitable for time-series forecasting.
Environmental parameter collection devices included sensors for monitoring greenhouse temperature and humidity, solar radiation, wind speed, and soil temperature and humidity. The specific parameters and details of these sensors are summarized in Table 1.
The environmental information collection sensors are connected to a serial server via an RS485 bus and communicate using the Modbus-RTU protocol. The serial server converts the collected serial data stream into TCP/IP protocol data packets, which are then transmitted wirelessly to the data host through a wireless bridge. This setup enables real-time data transmission, providing a foundation for subsequent data analysis and model training.
In this study, data were collected from 1 April to 7 September, resulting in a total of 7680 sets of data, encompassing greenhouse environmental data, soil environmental data, and thermal imaging of the crop canopy. All variables were recorded at 30 min intervals.
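The reported record count follows directly from the collection period and the sampling interval, as the short check below shows. The calendar year is an assumption (the text does not state it); because the period contains no February, the day count is the same for any year.

```python
from datetime import date

SAMPLES_PER_DAY = 24 * 60 // 30  # one record every 30 min

# Assumed year; only the month and day come from the text.
start, end = date(2025, 4, 1), date(2025, 9, 7)
n_days = (end - start).days + 1          # inclusive of both endpoints
n_records = n_days * SAMPLES_PER_DAY
print(n_days, n_records)                 # 160 7680
```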

2.2. Data Preprocessing

2.2.1. Data Cleaning

Sensor data collected during the monitoring process are susceptible to various uncertainties arising from environmental fluctuations and sensor malfunctions, which may lead to the presence of outliers and missing values alongside meaningful patterns [30]. Therefore, before model development, it is necessary to detect and handle abnormal data points in the raw dataset. An outlier is defined as an observation that significantly deviates from the majority of data points within a dataset. As illustrated in Figure 5, such anomalies can distort the underlying temporal patterns and adversely affect model performance if not properly addressed.
In this study, the Isolation Forest (IF) algorithm was applied to detect and remove outliers. This method isolates anomalies by constructing an ensemble of random trees, facilitating automated identification with minimal manual intervention. The algorithm demonstrates robust performance against noise and outliers, effectively distinguishing true anomalies even under significant data contamination. Based on the principle that anomalies are few and distinct, IF achieves high detection accuracy without requiring extensive parameter tuning or distributional assumptions. Its computational efficiency and scalability make it particularly suitable for processing complex environmental time-series data. Ultimately, 330 sets of data records (approximately 4.3% of the total 7680 sets) were identified as outliers and removed.
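A minimal sketch of this cleaning step with scikit-learn's `IsolationForest` is shown below. The synthetic two-column data and the `contamination` value are assumptions chosen only to mirror the roughly 4.3% removal rate; the settings actually used in the study are not stated in the text.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Synthetic stand-in for sensor records: air temperature (deg C) and relative humidity (%).
X = np.column_stack([rng.normal(25, 3, 1000), rng.normal(60, 8, 1000)])
X[:10] = [80.0, 5.0]  # inject a few obviously faulty readings

# contamination is an assumed setting, not a value reported by the authors.
detector = IsolationForest(n_estimators=100, contamination=0.043, random_state=0)
labels = detector.fit_predict(X)          # +1 = inlier, -1 = outlier
X_clean = X[labels == 1]
outlier_fraction = float(np.mean(labels == -1))
```

Removing rows from a time series leaves gaps, so any later windowing step needs to respect the original timestamps rather than the row positions of `X_clean`.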

2.2.2. Data Normalization

Environmental variables often differ significantly in both unit of measurement and numerical magnitude. Such disparities can lead to imbalanced parameter weighting during model training, where features with larger numerical ranges dominate those with smaller scales, thereby distorting the model’s ability to accurately capture the underlying relationships among variables and compromising its stability and convergence [31]. To address this issue, normalization was applied to the environmental data to ensure all input features lie within a comparable range.
$$\text{Normalized value}\,(Y_i) = \frac{Y_i - Y_{min}}{Y_{max} - Y_{min}} \tag{3}$$
The data were normalized to a range of 0 to 1 using Equation (3), where Y_min and Y_max are the smallest and largest values of each variable in the training set. All data were sorted chronologically after preprocessing. A forward time-series split was applied: the first 80% served as the training set, and the remaining 20% served as an independent test set to evaluate the model’s generalization to future data. To prevent data leakage, the normalization scaler was fitted exclusively on the training set. The fitted scaler was then applied to transform both the training and the independent test set, ensuring that no future information from the test set influenced the scaling parameters.
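The split-then-scale order described above can be sketched as follows. This is an illustrative NumPy version with a hypothetical helper class `MinMaxScaler01`; the toy array stands in for the chronologically sorted feature matrix.

```python
import numpy as np

def chronological_split(X, train_fraction=0.8):
    """Forward split: the earliest rows train, the latest rows test."""
    cut = int(len(X) * train_fraction)
    return X[:cut], X[cut:]

class MinMaxScaler01:
    """Min-max scaling whose parameters come only from the data passed to fit()."""
    def fit(self, X):
        self.min_ = X.min(axis=0)
        self.range_ = X.max(axis=0) - self.min_
        self.range_[self.range_ == 0] = 1.0   # guard against constant columns
        return self

    def transform(self, X):
        return (X - self.min_) / self.range_

X = np.arange(20, dtype=float).reshape(10, 2)   # toy data, rows already in time order
X_train, X_test = chronological_split(X)
scaler = MinMaxScaler01().fit(X_train)          # fitted on the training rows only
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Because the scaler never sees the test rows, scaled test values can fall outside [0, 1] when later observations exceed the training range, which is expected behavior rather than an error.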

2.2.3. Correlation Analysis

Canopy temperature variation results from both internal plant physiology and external environmental drivers including soil temperature, soil moisture, greenhouse air conditions, and vapor pressure deficit. To optimize model inputs and minimize noise from weakly associated variables, Spearman’s rank correlation coefficient was applied to evaluate monotonic relationships between environmental factors and canopy temperature. Variables demonstrating high correlation coefficients were retained as model inputs. The coefficient interpretation followed the classification shown in Table 2 [16].
Spearman’s correlation analysis was performed between canopy temperature and environmental variables, including solar radiation, air humidity, air temperature, wind speed, vapor pressure deficit (VPD), soil temperature, and soil moisture. As illustrated in Figure 6, soil temperature, VPD, and air temperature demonstrated strong to very strong correlations with canopy temperature, while solar radiation, air humidity, and soil moisture showed moderate correlations. Wind speed exhibited negligible correlation.
To simplify model architecture and reduce computational load, only variables showing strong to moderate correlations were retained as predictors. Consequently, air temperature, air humidity, soil temperature, solar radiation, VPD, and soil moisture were selected as model input features.
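The screening step can be sketched with `scipy.stats.spearmanr`. The data below are synthetic, the helper `select_by_spearman` is hypothetical, and the cut-off of 0.4 is an assumed stand-in for the lower bound of the "moderate" class in Table 2, which is not reproduced here.

```python
import numpy as np
from scipy.stats import spearmanr

def select_by_spearman(features, target, threshold):
    """Keep features whose absolute Spearman rho with the target reaches the threshold."""
    selected = {}
    for name, values in features.items():
        rho, _ = spearmanr(values, target)
        if abs(rho) >= threshold:
            selected[name] = float(rho)
    return selected

rng = np.random.default_rng(0)
air_temp = rng.uniform(10, 40, 500)
features = {
    "air_temp": air_temp,
    "wind_speed": rng.uniform(0, 2, 500),   # unrelated to the target by construction
}
canopy_temp = np.exp(air_temp / 10.0)       # monotonic but nonlinear in air_temp

selected = select_by_spearman(features, canopy_temp, threshold=0.4)  # assumed cut-off
```

Because Spearman's coefficient works on ranks, the strictly monotonic but nonlinear relationship in this toy example still yields a coefficient of 1.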

2.3. Model Architecture and Construction

A hybrid model combining Long Short-Term Memory (LSTM) networks with Random Forest (RF) was developed to improve winter jujube canopy temperature prediction. This LSTM–RF model processes time-series data using an LSTM network to generate hidden representations, which are then fed into an RF for final regression prediction.
RF enhances accuracy and reduces overfitting by integrating multiple decision trees, while LSTM captures long-term dependencies in sequential data via its gated architecture. Their combination improves generalization and robustness for complex time-series analysis while simplifying training through feature selection.
As shown in Figure 7, the model architecture performs precise canopy temperature prediction through multi-level feature fusion and ensemble learning. Three feature types are integrated during the fusion stage:
  • Original features from the raw dataset;
  • Lagged features constructed using the sliding window method;
  • LSTM-extracted deep features revealing long-term dependencies.
This multi-source fusion constructs a comprehensive feature space for representing complex temporal behavior. RF then serves as the ensemble framework, training multiple decision trees in parallel using bootstrap sampling and feature subset partitioning to mitigate overfitting.
Key hyperparameters, including n_estimators, max_depth, and min_samples_split, are optimized via grid search with cross-validation to balance generalization and precision. Final predictions are generated through mean aggregation of all decision tree outputs, integrating diverse feature perspectives while smoothing individual prediction biases. This synergistic architecture effectively captures temporal dynamics while overcoming single-model limitations for accurate canopy temperature forecasting.

2.3.1. Lagged Feature Construction

A sliding window approach was applied to capture the continuity and dynamic dependencies in environmental time-series data for model input construction. Cross-correlation function (CCF) analysis identified the optimal temporal window size by quantifying correlations between environmental variables and canopy temperature across multiple time lags. As illustrated in Figure 8, the correlation strength exhibited distinct patterns depending on the environmental variable: for air humidity, air temperature, and VPD, it generally increased and then decreased with lag duration, while for soil temperature, solar radiation, and soil moisture, it generally decreased. In all cases, the peak correlation occurred before 100 min, reflecting the delayed cumulative response of canopy temperature to environmental drivers, driven by plant thermal inertia and physiological regulation. Based on the 30 min sampling interval and the peak correlation, a three-step lag window was selected.
To strictly prevent temporal data leakage during window construction, we ensured that no sliding window straddled the train–test boundary. For the training set, windows were formed exclusively from data points within the training period. For the test set, each window was constructed using only historical points available up to the current time step within the test sequence, excluding any future observations or training set data. This approach guarantees that the model’s inputs contain no information from beyond the prediction time, thereby preserving the integrity of the temporal split.
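A leakage-free window construction along these lines might look as follows. It is a sketch on synthetic data with a hypothetical helper `make_windows`; windows are built separately inside each partition, so the first three test steps, which lack in-partition history, produce no sample.

```python
import numpy as np

def make_windows(X, y, n_lags=3):
    """Stack each time step with its n_lags predecessors; the target is y at the current step."""
    n = len(X)
    rows = [X[t - n_lags : t + 1].ravel() for t in range(n_lags, n)]
    return np.asarray(rows), y[n_lags:]

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))   # six environmental variables, rows in time order
y = rng.normal(size=100)        # canopy temperature placeholder

cut = 80                        # 80/20 chronological split
Xw_train, yw_train = make_windows(X[:cut], y[:cut])
Xw_test, yw_test = make_windows(X[cut:], y[cut:])   # no window reaches back into training rows
```

Each row has 6 × 4 = 24 entries, ordered from the oldest lag (t−3) to the current step (t).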

2.3.2. LSTM-Based Temporal Feature Extractor

Figure 9 presents the LSTM network, which serves as the front-end temporal feature-extraction module to uncover deep dynamic patterns within sequentially windowed data. The multi-layer LSTM architecture utilizes input, forget, and output gates to mitigate gradient vanishing and explosion problems commonly associated with modeling long-term dependencies in conventional RNNs. The final hidden state of the top LSTM layer produces a high-level, abstract representation of temporal dynamics within each input segment, capturing complex nonlinear patterns, such as trends, periodicity, and delayed responses, that are not evident in raw data. This process compresses temporal information into a compact feature vector. The resulting deep temporal representations better capture intricate physiological–environmental interactions that affect canopy temperature evolution, providing robust and discriminative inputs for the subsequent Random Forest prediction stage, thereby enhancing model performance and generalization capacity.
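To make the gating mechanism concrete, the sketch below runs a single LSTM layer forward over one input window in plain NumPy and returns the final hidden state as the feature vector. It is an illustration only: the weights are random and untrained, a single layer stands in for the multi-layer network, and the gate ordering (input, forget, candidate, output) is a convention chosen here rather than taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_last_hidden(window, W, U, b):
    """Run one LSTM layer over window (T, d) and return the final hidden state (hidden,)."""
    hidden = b.shape[0] // 4
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x_t in window:
        z = W @ x_t + U @ h + b
        i, f, g, o = np.split(z, 4)            # input, forget, candidate, output pre-activations
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g                      # forget part of the old cell state, add new content
        h = o * np.tanh(c)                     # output gate decides what the hidden state exposes
    return h

rng = np.random.default_rng(0)
n_features, hidden = 6, 32                     # six variables, 32-dimensional feature vector
W = rng.normal(scale=0.1, size=(4 * hidden, n_features))   # untrained placeholder weights
U = rng.normal(scale=0.1, size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)

window = rng.normal(size=(4, n_features))      # three lagged steps plus the current one
lstm_features = lstm_last_hidden(window, W, U, b)
```

In practice the weights would be learned with a deep learning framework; the point here is only that a whole window is compressed into one fixed-length vector.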

2.3.3. Multi-Source Feature Fusion Strategy

This study employed a feature concatenation strategy to integrate raw environmental variables, lagged features, and LSTM-extracted deep features, thereby constructing a comprehensive multi-scale feature space that enhances model representation and identification of complex nonlinear interactions in greenhouse environments.
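Feature concatenation amounts to stacking the per-sample blocks column-wise. The sketch below uses random placeholders with the dimensions reported in Section 3.1 (24 windowed raw features and 32 LSTM features); the helper name `fuse_features` is hypothetical.

```python
import numpy as np

def fuse_features(window_features, lstm_features):
    """Concatenate per-sample feature blocks column-wise."""
    if window_features.shape[0] != lstm_features.shape[0]:
        raise ValueError("both blocks must describe the same samples")
    return np.hstack([window_features, lstm_features])

rng = np.random.default_rng(0)
window_features = rng.normal(size=(5, 24))   # raw and lagged environmental values
lstm_features = rng.normal(size=(5, 32))     # final hidden states from the LSTM
fused = fuse_features(window_features, lstm_features)   # shape (5, 56)
```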

2.3.4. Random Forest Predictor

The fused features were then processed by an RF model for final prediction. As an ensemble method, RF constructs multiple decision trees and aggregates their outputs to mitigate overfitting risks while maintaining strong nonlinear fitting capability and noise robustness. To optimize performance, grid search was applied to systematically tune key hyperparameters, including n_estimators, max_depth, and min_samples_split, with search ranges detailed in Table 3.
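A grid search of this kind can be sketched with scikit-learn as follows. The data are synthetic, the grid values are hypothetical placeholders (the actual ranges are those in Table 3), and the use of `TimeSeriesSplit` for the cross-validation folds is an assumption, since the text does not specify the fold scheme.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))    # stand-in for the fused feature matrix
y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=200)

param_grid = {                   # hypothetical values, not the ranges from Table 3
    "n_estimators": [50, 100],
    "max_depth": [None, 8],
    "min_samples_split": [2, 5],
}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=TimeSeriesSplit(n_splits=3),              # assumed: folds that respect time order
    scoring="neg_root_mean_squared_error",
)
search.fit(X, y)
best_rf = search.best_estimator_                 # refitted on all training data
```

Time-ordered folds keep each validation block later than its training block, which is consistent with the forward split used for the final evaluation.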

2.4. Model Evaluation and Interpretation

To comprehensively evaluate the predictive performance of the proposed models, the following assessment procedures were implemented:
(1)
Model predictive performance evaluation
The performance of each model was evaluated using standard regression metrics, including the root mean square error (RMSE), mean absolute error (MAE), and the coefficient of determination (R2) (Equations (4)–(6)). These metrics were adopted as evaluation functions during the training and validation phases. Lower RMSE and MAE values and higher R2 values indicate higher prediction accuracy.
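$$RMSE = \sqrt{\frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{n}} \tag{4}$$
$$MAE = \frac{1}{n}\sum_{i=1}^{n} \left|\hat{y}_i - y_i\right| \tag{5}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2} \tag{6}$$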
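As a worked example of Equations (4)–(6), the three metrics can be computed directly with NumPy. The three-point data set is purely illustrative and the helper name `regression_metrics` is hypothetical.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """RMSE, MAE, and R2 for paired observations and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    rmse = np.sqrt(np.mean(residual ** 2))
    mae = np.mean(np.abs(residual))
    r2 = 1.0 - np.sum(residual ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return {"RMSE": float(rmse), "MAE": float(mae), "R2": float(r2)}

# Observed 1, 2, 3 and predicted 1, 2, 4: a single error of 1 on the last point.
metrics = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(metrics)   # RMSE = sqrt(1/3) ≈ 0.577, MAE = 1/3 ≈ 0.333, R2 = 1 - 1/2 = 0.5
```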
(2)
Model Interpretation
The SHAP (SHapley Additive exPlanations) value method was used to interpret the best-performing prediction model [32]. The SHAP algorithm is based on Shapley values from cooperative game theory. Its core idea is to view the prediction model as a collaborative game in which each feature contributes to the model’s prediction. SHAP values provide a theoretical foundation for explaining model predictions, quantifying each feature’s specific contribution to the model output, and revealing the internal relationships between features and the target variable.
The SHAP value for a given feature represents its marginal contribution to the prediction relative to the average prediction across all possible feature combinations. This method ensures that the influence of each feature is fairly and consistently assessed, even in the presence of complex interactions or nonlinear relationships, as represented by Equation (7):
$$g(z) = \phi_0 + \sum_{i=1}^{M} \phi_i \tag{7}$$
where $g(z)$ is the explanatory model, $z$ represents a specific instance, $\phi_0$ is the constant of the explanatory model, $M$ is the dimensionality of input features, and $\phi_i$ represents the SHAP value of the $i$-th feature of the current instance.
The SHAP value is calculated using Equation (8):
$$\phi_i = \sum_{S \subseteq \{X_1, X_2, \ldots, X_p\} \setminus \{X_i\}} \frac{|S|!\,(p - |S| - 1)!}{p!}\left[f\left(S \cup \{X_i\}\right) - f(S)\right] \tag{8}$$
where $p$ is the number of input features, $S$ is a subset of the input features in the prediction model, $\{X_1, X_2, \ldots, X_p\}$ is the set of input features, $f(S)$ is the prediction with feature subset $S$, and $f(S \cup \{X_i\})$ is the average value of the sample recognition results obtained after fusing feature variable $X_i$ into the subset. The weight $\frac{|S|!\,(p - |S| - 1)!}{p!}$ indicates that, when the total number of features is $p$, there are $p!$ possible feature combinations considering the order, and when the $i$-th feature is fixed, there are $|S|!\,(p - |S| - 1)!$ possible combinations.
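For a handful of features, Equation (8) can be evaluated exactly by enumerating every subset, which makes the weighting easy to verify by hand. The sketch below does this for a toy value function with three hypothetical features; it is a brute-force illustration, not how SHAP libraries compute values for tree ensembles.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    """Exact Shapley values by enumerating all subsets S that exclude each feature."""
    p = len(features)
    phi = {}
    for xi in features:
        others = [x for x in features if x != xi]
        total = 0.0
        for k in range(p):                                 # k = |S| runs from 0 to p - 1
            weight = factorial(k) * factorial(p - k - 1) / factorial(p)
            for subset in combinations(others, k):
                S = frozenset(subset)
                total += weight * (value_fn(S | {xi}) - value_fn(S))
        phi[xi] = total
    return phi

def toy_prediction(S):
    """Hypothetical f(S): baseline 20, main effects, plus an air_temp x vpd interaction."""
    value = 20.0
    value += 3.0 if "air_temp" in S else 0.0
    value += 1.0 if "vpd" in S else 0.0
    value += 2.0 if {"air_temp", "vpd"} <= S else 0.0
    return value

features = ["air_temp", "vpd", "soil_moisture"]
phi = shapley_values(features, toy_prediction)
phi_0 = toy_prediction(frozenset())
print(phi)   # air_temp: 4.0, vpd: 2.0, soil_moisture: 0.0 (up to floating-point rounding)
```

The interaction of 2 is split equally between the two features involved, and the values satisfy Equation (7): 20 + 4 + 2 + 0 equals the full-feature prediction of 26.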

2.5. Ablation Experiment

A systematic ablation study based on SHAP-derived feature importance was performed to quantify individual feature contributions to the hybrid model’s predictive performance. The process involved identifying key features through SHAP value analysis while recording baseline performance with the full feature set. These key features were then sequentially removed in descending order of importance, with the model retrained and evaluated under consistent conditions after each removal. Performance was compared against baseline with emphasis on R2 changes; substantial R2 reduction indicated a feature’s critical role in capturing canopy temperature patterns, while minimal degradation suggested limited predictive value or information redundancy.
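The ablation loop can be sketched as below. Several simplifications are assumptions made for illustration: removal is treated as cumulative, the importance ranking is supplied by hand rather than derived from SHAP values, and an ordinary least-squares fit stands in for retraining the LSTM–RF model. The helper names are hypothetical.

```python
import numpy as np

def r2_score(y_true, y_pred):
    return 1.0 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true.mean()) ** 2)

def fit_predict_linear(X_train, y_train, X_test):
    """Stand-in model: least squares with an intercept (the study retrains LSTM-RF instead)."""
    A = np.column_stack([X_train, np.ones(len(X_train))])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([X_test, np.ones(len(X_test))]) @ coef

def ablation(X_train, y_train, X_test, y_test, names, ranked, fit_predict):
    """Drop features cumulatively in ranked order, refit each time, and record test R2."""
    kept = list(names)
    results = [("full", r2_score(y_test, fit_predict(X_train, y_train, X_test)))]
    for feature in ranked:
        kept.remove(feature)
        cols = [names.index(k) for k in kept]
        pred = fit_predict(X_train[:, cols], y_train, X_test[:, cols])
        results.append((f"-{feature}", r2_score(y_test, pred)))
    return results

rng = np.random.default_rng(0)
names = ["x0", "x1", "x2"]
X = rng.normal(size=(400, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=400)  # x2 is pure noise

results = ablation(X[:300], y[:300], X[300:], y[300:], names,
                   ranked=["x0", "x1"], fit_predict=fit_predict_linear)
```

In this toy setting, removing `x0` causes a large R2 drop, the pattern the text associates with a feature that is critical to the prediction.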

3. Results

3.1. Model Training Results and Analysis

To evaluate the performance of the proposed LSTM–RF hybrid model for predicting winter jujube canopy temperature, a comparative study was conducted using the same dataset across multiple state-of-the-art and baseline models. The selected benchmark algorithms included TimesNet [33], Transformer [34], RF [35], and the standard LSTM [36]. The hyperparameter settings for LSTM, Transformer, and TimesNet are provided in Tables S1–S3. All models were trained using the same training dataset and evaluated on an independent hold-out test set to ensure the fairness and reproducibility of the comparison. Predictive performance was quantified using three widely adopted regression metrics: R2, MAE and RMSE.
The proposed LSTM–RF model uses three historical time steps combined with current environmental observations as input, incorporating six environmental variables: air temperature, humidity, VPD, solar radiation, soil temperature, and soil moisture. After sliding window transformation, the raw feature vector dimension reaches 24 (6 variables × 4 time points). A multi-layer LSTM network extracts 32-dimensional deep temporal features capturing nonlinear dynamics, which are concatenated with the 24-dimensional raw features to form a 56-dimensional fused feature space for RF regression. This fusion strategy enables the model to integrate static environmental context with dynamic temporal evolution through complementary information synthesis. Operating in single-step prediction mode, the model forecasts current canopy temperature using historical inputs. Figure 10 presents predicted versus observed canopy temperature trajectories, illustrating representative high- and low-temperature intervals by comparing observed (blue) and predicted (red) curves.
The LSTM–RF model closely matched the observed canopy temperature trends during both high- and low-temperature periods. By leveraging relationships among multiple environmental variables and canopy temperature, the model effectively forecasts temperature dynamics and demonstrates strong performance in predicting extreme events, such as heatwaves and cold spells, providing reliable decision support for crop temperature monitoring and extreme weather warnings.
Table 4 summarizes the prediction errors for all compared models. The LSTM–RF model achieved superior performance, with an R2 of 0.974, an MAE of 0.844 °C, and an RMSE of 1.155 °C for canopy temperature prediction. The hybrid approach substantially outperformed the standalone RF and LSTM models, with the LSTM module playing a critical role in extracting deep temporal features to capture complex nonlinear dynamics and enhance overall predictive accuracy.
Figure 11 presents the predicted versus observed canopy temperature values for all evaluated models. In each plot, deviations of data points from the Y = X reference line (representing perfect agreement) directly reflect prediction errors, with larger deviations indicating lower predictive accuracy. Additionally, the pattern of points around the reference line provides an intuitive assessment of model performance: tighter clustering around the line indicates greater consistency and reliability.
The models exhibit varying performance levels. Among them, the LSTM–RF model demonstrates the tightest clustering of predicted values around the diagonal reference line, indicating the best fit and highest prediction accuracy. This proposed model innovatively combines the temporal feature-extraction capabilities of LSTM with the nonlinear fitting and generalization strength of RF, effectively reducing error accumulation during prediction. Consequently, the LSTM–RF model yielded the lowest prediction error among all benchmark models.

3.2. Model Interpretability Analysis

Given its superior accuracy and efficiency, the LSTM–RF model was selected for further interpretability analysis using SHAP. SHAP values provide a theoretical foundation for explaining model predictions by quantifying each feature’s specific contribution to the model output and revealing the internal relationships between features and the target variable.

3.2.1. SHAP Heatmap

The SHAP heatmap visualizes the distribution of SHAP values for each feature across all samples, revealing both the direction and magnitude of each feature’s contribution to model predictions. In the plot, color intensity represents the magnitude of the SHAP value, with red indicating positive contributions and blue indicating negative contributions.
As shown in Figure 12, air temperature and LSTM-derived high-level features significantly impact canopy temperature variations. Given the strong multicollinearity among air temperature, relative humidity, and VPD, the variables airtemp_t, airtemp_t-1, airtemp_t-2, VPD_t, and VPD_t-1 were collectively interpreted as a “thermodynamic state of air” driver group. Specifically, the “thermodynamic state of air” driver group, along with LSTM-derived latent features (LSTM_feature_19/22/26/29), shows high SHAP values, indicating strong effects on canopy temperature prediction.
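One way to treat the five collinear features as a single driver is to sum their SHAP values per sample, which is meaningful because SHAP contributions are additive, and then average the absolute totals. The sketch below assumes that convention; the source does not state how the group was aggregated, and all SHAP values shown are hypothetical.

```python
GROUP = ["airtemp_t", "airtemp_t-1", "airtemp_t-2", "VPD_t", "VPD_t-1"]

def group_contribution(shap_row, members):
    """Sum the SHAP values of the group members for one sample."""
    return sum(shap_row[name] for name in members)

def group_importance(shap_rows, members):
    """Mean absolute grouped contribution across samples."""
    totals = [abs(group_contribution(row, members)) for row in shap_rows]
    return sum(totals) / len(totals)

# Hypothetical SHAP values (°C) for two samples; not taken from the study.
shap_rows = [
    {"airtemp_t": 1.5, "airtemp_t-1": 0.5, "airtemp_t-2": 0.25,
     "VPD_t": 0.5, "VPD_t-1": 0.25, "solarrad_t": 0.75},
    {"airtemp_t": -1.0, "airtemp_t-1": -0.5, "airtemp_t-2": -0.25,
     "VPD_t": 0.25, "VPD_t-1": 0.0, "solarrad_t": -0.5},
]
importance = group_importance(shap_rows, GROUP)
```

Summing before taking the absolute value lets opposing contributions within the group cancel, so the result reflects the net effect of the group rather than the sum of its members' individual magnitudes.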

3.2.2. SHAP Bar Chart and Scatter Plot

Figure 13 ranks the key features of the optimal canopy temperature prediction model (LSTM–RF), using the optimal number of key features (N = 20). Figure 13a displays the specific contribution of each key feature to the model output, while Figure 13b ranks the features by their mean absolute SHAP value across all samples.
The “thermodynamic state of air” driver group was ranked highest overall, with red points predominantly clustered to the right of the zero line, indicating that high values of these features raise the predicted canopy temperature. This is consistent with the air’s thermal condition being the primary external driver of canopy temperature.
LSTM-derived features (LSTM_feature_19/22/26/29) also make substantial contributions, highlighting the importance of capturing complex temporal dynamics with deep learning techniques.
Solar radiation features (solarrad_t series) contribute relatively steadily to the model’s predictive performance, suggesting their consistent role in influencing canopy temperature throughout the day.
Features lagged by three or more time steps, such as airtemp_t-3, show a marked decrease in importance, indicating that the canopy system’s thermal memory diminishes beyond approximately one hour. This finding supports the earlier observation regarding the characteristic timescale of canopy temperature response.
The analysis indicates that, in the LSTM–RF hybrid model, accurate prediction of winter jujube canopy temperature critically depends on the “thermodynamic state of air” driver group, LSTM-derived features, and lagged features.
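The ranking by mean absolute SHAP value discussed in this subsection can be sketched as follows. The helper `rank_features` and the SHAP values are hypothetical; in practice the per-sample values would come from a SHAP explainer applied to the trained model.

```python
def rank_features(shap_rows, top_n):
    """Rank features by mean absolute SHAP value across samples."""
    names = list(shap_rows[0])
    mean_abs = {
        name: sum(abs(row[name]) for row in shap_rows) / len(shap_rows)
        for name in names
    }
    ranked = sorted(mean_abs.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]

# Hypothetical SHAP values for two samples; not taken from the study.
shap_rows = [
    {"airtemp_t": 2.0, "solarrad_t": -0.5, "LSTM_feature_19": 1.0, "airtemp_t-3": 0.25},
    {"airtemp_t": -1.0, "solarrad_t": 0.5, "LSTM_feature_19": -1.0, "airtemp_t-3": -0.25},
]
top = rank_features(shap_rows, top_n=3)
```

Taking the absolute value before averaging keeps features whose positive and negative contributions would otherwise cancel from being ranked as unimportant.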

3.3. Ablation Experiment Results

3.3.1. Feature Ablation

To examine the contribution of key predictors identified via SHAP analysis, a feature-ablation study was conducted on the LSTM–RF model. As shown in Figure 14, the full model, including all 56 fused features, achieved R2 = 0.974. Removing the “thermodynamic state of air” driver group reduced the feature set to 51 and lowered R2 to 0.913, confirming its role as the primary driver of canopy temperature.
Excluding LSTM-derived features (LSTM_feature_19/22/26/29) further reduced R2 to 0.907, emphasizing their importance for capturing nonlinear temporal dependencies. Finally, removing all remaining lagged variables reduced the feature set to 38 and caused a moderate performance decline, illustrating their auxiliary role in encoding the system’s short-term thermal memory.
These results further demonstrate that (1) the “thermodynamic state of air” driver group, (2) LSTM-derived latent features, and (3) short-term lagged inputs are all essential to achieving high predictive performance.
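The first ablation step (56 fused features reduced to 51 by removing the five-member driver group) amounts to simple bookkeeping over feature names, sketched below. The naming scheme for the soil and humidity variables and the zero-based indexing of the LSTM features are assumptions; the source only names a subset of the features.

```python
VARIABLES = ["airtemp", "humidity", "VPD", "solarrad", "soiltemp", "soilmoist"]
LAG_SUFFIXES = ["t", "t-1", "t-2", "t-3"]

raw_names = [f"{v}_{s}" for v in VARIABLES for s in LAG_SUFFIXES]
latent_names = [f"LSTM_feature_{i}" for i in range(32)]  # indexing is assumed
all_features = raw_names + latent_names

AIR_STATE_GROUP = {"airtemp_t", "airtemp_t-1", "airtemp_t-2", "VPD_t", "VPD_t-1"}

def ablate(features, removed):
    """Return the feature list with the removed names dropped, order preserved."""
    missing = set(removed) - set(features)
    if missing:
        raise KeyError(f"unknown features: {sorted(missing)}")
    return [name for name in features if name not in removed]

without_air_state = ablate(all_features, AIR_STATE_GROUP)
print(len(all_features), len(without_air_state))   # 56 51
```

Each ablated feature list would then be used to retrain and re-evaluate the regressor on the same split, so that any change in R2 can be attributed to the removed features.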

3.3.2. Model Ablation

To validate the effectiveness of the LSTM module in extracting temporal features within the fusion model, a comparison was conducted between the LSTM–RF hybrid model and the standalone RF model using identical inputs and training data. Figure 15 presents absolute prediction residuals for both models. The violin plot illustrates a distinctly narrower distribution range (vertical span) of prediction residuals for the LSTM–RF hybrid model compared to the standalone RF model, with residuals concentrated more closely around zero. The embedded boxplot further reveals that the LSTM–RF model exhibits a median error closer to zero and a smaller interquartile range (IQR), indicating superior prediction accuracy and robustness. This ablation confirms the need for a dedicated temporal feature extractor in the hybrid framework.
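The two summary statistics read from the embedded boxplot, the median and the interquartile range of the absolute residuals, can be computed as in this minimal sketch. The function name and both prediction series are hypothetical and serve only to show the calculation.

```python
import statistics

def residual_summary(observed, predicted):
    """Return (median, IQR) of the absolute residuals |observed - predicted|."""
    abs_res = [abs(o - p) for o, p in zip(observed, predicted)]
    q1, _, q3 = statistics.quantiles(abs_res, n=4, method="inclusive")
    return statistics.median(abs_res), q3 - q1

# Hypothetical values in °C; neither series comes from the study's data.
observed = [20.0, 22.0, 24.0, 26.0, 28.0]
pred_hybrid = [20.5, 21.0, 24.5, 27.0, 28.0]
pred_baseline = [22.0, 21.0, 26.0, 25.0, 31.0]

hybrid = residual_summary(observed, pred_hybrid)
baseline = residual_summary(observed, pred_baseline)
```

A smaller median indicates lower typical error, and a smaller IQR indicates that errors are more consistent from sample to sample.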

4. Discussion

This study demonstrates that serial integration of LSTM and RF models provides a robust framework for high-accuracy canopy-temperature nowcasting in greenhouse-cultivated winter jujube. The proposed LSTM–RF hybrid model achieved superior performance (R2 = 0.974, MAE = 0.844 °C, RMSE = 1.155 °C) compared to standalone LSTM, RF, and other advanced benchmarks like Transformer and TimesNet. The critical innovation lies not merely in the ensemble but in the synergistic, three-stage architecture designed to address the specific challenges of the greenhouse environment: capturing complex temporal dependencies via LSTM, handling heterogeneous multi-source features via RF, and explicitly modeling delayed physiological responses through engineered lagged features.
The proposed LSTM–RF model achieves the best prediction performance. This superiority over its individual components underscores the complementary strengths of deep learning and ensemble machine learning. The LSTM module is critical for extracting nonlinear temporal dynamics and latent patterns from the sequential environmental data [15,36], which pure RF models might overlook. Conversely, the inherent ensemble mechanism and feature importance evaluation of the RF component enhance generalization and mitigate overfitting, thereby addressing a common risk in deep learning models when applied to limited or noisy datasets [13,35]. This architecture effectively balances model expressiveness with robustness, making it particularly suitable for the data-scarce and dynamically coupled context of protected horticulture.
Interpretability analysis using SHAP provided agronomically meaningful insights beyond predictive accuracy. Canopy temperature is primarily determined by its energy balance, which is driven by physical properties of the air, including air temperature, air humidity, and VPD. In this study, we collectively interpret the tightly coupled air temperature and VPD variables (airtemp and VPD) as the “thermodynamic state of air”, treating them as a single driver group when analyzing their influence on canopy temperature. This group was consistently identified as the most influential, aligning with the fundamental principle that canopy temperature is primarily governed by energy and water vapor exchange at the leaf–atmosphere interface [3,8]. The significant contribution of LSTM-derived latent features further suggests that the model captures complex, nonlinear temporal interactions that are not explicitly defined in the raw inputs. Furthermore, the sharp decline in the importance of features lagged by three or more steps (e.g., airtemp_t-3) places the canopy’s short-term “thermal memory” at approximately one hour. This finding is consistent with known plant physiological response timescales and provides a data-driven rationale for selecting optimal temporal windows in irrigation scheduling models [37].
The explicit inclusion of lagged environmental features, informed by cross-correlation analysis, was a key design choice to model hysteresis effects. The ablation experiment confirmed their necessity, as their removal led to a measurable performance drop. This addresses a common gap in static or poorly tailored temporal models, which often fail to account for the plant system’s delayed integration of environmental drivers [12,14]. Our approach bridges this gap by formally incorporating these delayed responses into the feature space.
Some limitations and future directions should be noted. First, the model’s performance is contingent on continuous, high-quality data streams from calibrated sensors. Practical deployment must incorporate robust fault-tolerance mechanisms. Second, while the model was validated on potted plants, its generalizability to different cultivation scales (e.g., soil-based orchards) warrants further investigation. A promising future direction is the development of a multi-step forecasting version of the model, which would provide a longer planning horizon for irrigation decisions—a point raised during review. Extending the framework to directly recommend irrigation depth by coupling the canopy temperature forecast with an evapotranspiration or soil water balance model represents a logical next step towards fully autonomous decision-support systems [6,23].
In summary, this study presents a reliable, interpretable, and accurate modeling framework for canopy temperature prediction. By effectively fusing temporal deep learning with ensemble learning and physiologically informed feature engineering, it not only provides a high-performance forecasting tool but also yields actionable insights into the dominant drivers of plant water status. This establishes a solid data-driven foundation for advancing precision irrigation management in protected winter jujube production and similar controlled-environment agriculture systems.

5. Conclusions

This study developed a hybrid LSTM–RF model for accurate nowcasting of canopy temperature in greenhouse-cultivated winter jujube, a critical variable for precision irrigation management. The proposed three-stage framework integrates temporal feature extraction, multi-source feature fusion, and ensemble-based regression, and effectively addresses the challenges posed by strong temporal dependencies and nonlinear environmental couplings in protected cultivation. The model demonstrated superior performance (R2 = 0.974, MAE = 0.844 °C, RMSE = 1.155 °C) compared to several benchmark algorithms. SHAP-based interpretability analysis provided actionable insights, identifying the thermodynamic state of air as the primary driver and quantifying the canopy’s short-term thermal memory. This study establishes a reliable, interpretable, data-driven foundation for real-time water-stress monitoring and intelligent irrigation scheduling in controlled horticulture systems.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/horticulturae12010084/s1, Figure S1: Robustness of canopy temperature extraction with respect to the RPCA regularization parameter (λ); Table S1: Hyperparameter settings for Transformer; Table S2: Hyperparameter settings for LSTM; Table S3: Hyperparameter settings for TimesNet.

Author Contributions

Methodology: S.M., L.K., S.H., Y.F. and F.Z.; Formal Analysis: S.M.; Writing—Original Draft: S.M. and Y.Z.; Writing—Review and Editing: X.S.; Resources: X.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research is financially supported by the National Key R&D Program of China (2024YFD2001001-02), the Key Research and Development Plan of Shaanxi Province (S2024-YF-ZDCXL-ZDLNY-015), and the Key Research and Development Plan of Shaanxi Province (2024NC2GJHX12).

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Liu, J.; Jiang, K.; Cao, X.; Zhang, X.; Liu, M.; Han, S. Gibberellin and shikimic acid modulate cell wall metabolism to attenuate postharvest softening in jujube fruit. LWT 2025, 233, 118487.
  2. Kong, X.; Chen, Q.; Xu, M.; Liu, Y.; Li, X.; Han, L.; Zhang, Q.; Wan, H.; Liu, L.; Zhao, X.; et al. Geographical origin identification of winter jujube (Ziziphus jujuba ‘Dongzao’) by using multi-element fingerprinting with chemometrics. J. Integr. Agric. 2024, 23, 1749–1762.
  3. Singh, J.; Ge, Y.; Heeren, D.M.; Walter-Shea, E.; Neale, C.M.U.; Irmak, S.; Woldt, W.E.; Bai, G.; Bhatti, S.; Maguire, M.S. Inter-relationships between water depletion and temperature differential in row crop canopies in a sub-humid climate. Agric. Water Manag. 2021, 256, 107061.
  4. Hill, D.; Koryzis, A.; Nelson, D.; Hammond, J.; Bell, L. Investigating the utility of potato (Solanum tuberosum L.) canopy temperature and leaf greenness responses to water-restriction for the improvement of irrigation management. Agric. Water Manag. 2024, 303, 109063.
  5. Jian, H.; Gao, Z.; Guo, Y.; Xu, X.; Li, X.; Yu, M.; Liu, G.; Bian, D.; Cui, Y.; Du, X. Supplemental irrigation mitigates yield loss of maize through reducing canopy temperature under heat stress. Agric. Water Manag. 2024, 299, 108888.
  6. Siegfried, J.; Rajan, N.; Adams, C.B.; Neely, H.; Hague, S.; Hardin, R.; Schnell, R.; Han, X.; Thomasson, A. High-accuracy infrared thermography of cotton canopy temperature by unmanned aerial systems (UAS): Evaluating in-season prediction of yield. Smart Agric. Technol. 2024, 7, 100393.
  7. Smigaj, M.; Gaulton, R.; Suárez, J.C.; Barr, S.L. Canopy temperature from an Unmanned Aerial Vehicle as an indicator of tree stress associated with red band needle blight severity. For. Ecol. Manag. 2019, 433, 699–708.
  8. Ballester, C.; Castel, J.; Jiménez Bello, M.A.; Castel Sánchez, J.R.; Intrigliolo Molina, D.S. Thermographic measurement of canopy temperature is a useful tool for predicting water deficit effects on fruit weight in citrus trees. Agric. Water Manag. 2013, 122, 1–6.
  9. He, F.; Ma, C. Modeling greenhouse air humidity by means of artificial neural network and principal component analysis. Comput. Electron. Agric. 2010, 71, S19–S23.
  10. Liu, J.; Meng, X.; Ma, Y.; Liu, X. Introduce canopy temperature to evaluate actual evapotranspiration of green peppers using optimized ENN models. J. Hydrol. 2020, 590, 125437.
  11. Kondo, R.; Tanaka, Y.; Shiraiwa, T. Predicting rice (Oryza sativa L.) canopy temperature difference and estimating its environmental response in two rice cultivars, ‘Koshihikari’ and ‘Takanari’, based on a neural network. Plant Prod. Sci. 2022, 25, 394–406.
  12. Liu, Q.; Ta, N.; Jiao, W.; Kang, H.; Zhao, Z. Spatial Temporal Distribution and Prediction Model of Canopy Temperature and Humidity in Greenhouse. North. Hortic. 2019, 17, 56–65.
  13. Banerjee, S.; Singal, G.; Saha, S.; Mittal, H.; Srivastava, M.; Mukherjee, A.; Mahato, S.; Saikia, B.; Thakur, S.; Samanta, S.; et al. Machine Learning approach to Predict net radiation over crop surfaces from global solar radiation and canopy temperature data. Int. J. Biometeorol. 2022, 66, 2405–2415.
  14. Guo, J.; Dong, J.; Zhou, B.; Zhao, X.; Liu, S.; Han, Q.; Wu, H.; Xu, L.; Hassan, S.G. A hybrid model for the prediction of dissolved oxygen in seabass farming. Comput. Electron. Agric. 2022, 198, 106971.
  15. Haider, S.A.; Naqvi, S.R.; Akram, T.; Umar, G.A.; Shahzad, A.; Sial, M.R.; Khaliq, S.; Kamran, M. LSTM Neural Network Based Forecasting Model for Wheat Production in Pakistan. Agronomy 2019, 9, 72.
  16. Huang, L.; Liu, Y.; Qu, K.; Zhu, Y. Canopy area temperature prediction with fusion of LSTM and Informer. Trans. Chin. Soc. Agric. Eng. 2025, 41, 222–232.
  17. Kumar, K.V.; Ramesh, K.V.; Rakesh, V. Optimizing LSTM and Bi-LSTM models for crop yield prediction and comparison of their performance with traditional machine learning techniques. Appl. Intell. 2023, 53, 28291–28309.
  18. Yu, S.; Fan, J.; Lu, X.; Wen, W.; Shao, S.; Liang, D.; Yang, X.; Guo, X.; Zhao, C. Deep learning models based on hyperspectral data and time-series phenotypes for predicting quality attributes in lettuces under water stress. Comput. Electron. Agric. 2023, 211, 108034.
  19. Wu, M.; Li, R.; Lv, C.; Dong, A.; Mu, F.; Niu, W. Hourly photosynthetically active radiation prediction in solar greenhouses using Bayesian optimized machine learning and deep learning based on limited local weather data. Comput. Electron. Agric. 2025, 237, 110680.
  20. Hu, J.; Lei, W.; Lu, Y.; Wei, Z.; Liu, X.; Gao, M. Solar Greenhouse Temperature Prediction Model Based on 1D CNN-GRU. Trans. Chin. Soc. Agric. Mach. 2023, 54, 339–346.
  21. Guo, J.; Zhang, B.; Lin, L.; Xu, Y.; Zhou, P.; Luo, S.; Zhuo, Y.; Ji, J.; Luo, Z.; Gul Hassan, S. Multi-model fusion method for predicting CO2 concentration in greenhouse tomatoes. Comput. Electron. Agric. 2024, 227, 109623.
  22. Gutiérrez, S.; Diago, M.P.; Fernández-Novales, J.; Tardaguila, J. Vineyard water status assessment using on-the-go thermal imaging and machine learning. PLoS ONE 2018, 13, e0192037.
  23. Zhou, Z.; Majeed, Y.; Diverres Naranjo, G.; Gambacorta, E.M.T. Assessment for crop water stress with infrared thermal imagery in precision agriculture: A review and future prospects for deep learning applications. Comput. Electron. Agric. 2021, 182, 106019.
  24. Grant, O.M.; Tronina, L.; Jones, H.G.; Chaves, M.M. Exploring thermal imaging variables for the detection of stress responses in grapevine under different irrigation regimes. J. Exp. Bot. 2007, 58, 815–825.
  25. Bian, J.; Zhang, Z.; Chen, J.; Chen, H.; Cui, C.; Li, X.; Chen, S.; Fu, Q. Simplified Evaluation of Cotton Water Stress Using High Resolution Unmanned Aerial Vehicle Thermal Imagery. Remote Sens. 2019, 11, 267.
  26. Pou, A.; Diago, M.P.; Medrano, H.; Baluja, J.; Tardaguila, J. Validation of thermal indices for water status identification in grapevine. Agric. Water Manag. 2014, 134, 60–72.
  27. Maes, W.H.; Steppe, K. Estimating evapotranspiration and drought stress with ground-based thermal remote sensing in agriculture: A review. J. Exp. Bot. 2012, 63, 4671–4712.
  28. Candès, E.J.; Li, X.; Ma, Y.; Wright, J. Robust principal component analysis? J. ACM 2011, 58, 11:1–11:37.
  29. Lin, Z.; Chen, M.; Ma, Y. The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. arXiv 2013, arXiv:1009.5055.
  30. Liu, Y.; Ding, X.; Wang, H.; Li, K.; Zhang, G.; Yin, Y.; Pan, S. Prediction model for winter and summer lettuce root zone temperature based on dung beetle algorithm to optimize BP. Trans. Chin. Soc. Agric. Eng. 2024, 40, 231–238.
  31. Zong, C.; Wang, J.; Song, W.; Gen, R.; Liu, P.; Xu, D. Construction and validation of hourly air temperature prediction model in solar greenhouse at night. Trans. Chin. Soc. Agric. Eng. 2022, 38, 218–225.
  32. Yan, G.; Jia, H.; Lin, H.; Li, H.; Shi, Z.; Wang, Z. XGBoost-based Heat Stress Prediction of Dairy Cows and SHAP-based Model Interpretation. Trans. Chin. Soc. Agric. Mach. 2025, 56, 408–414.
  33. Wu, H.; Hu, T.; Liu, Y.; Zhou, H.; Wang, J.; Long, M. TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis. arXiv 2023, arXiv:2210.02186.
  34. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. arXiv 2023, arXiv:1706.03762.
  35. Probst, P.; Wright, M.; Boulesteix, A.L. Hyperparameters and Tuning Strategies for Random Forest. WIREs Data Min. Knowl. Discov. 2019, 9, e1301.
  36. Van Houdt, G.; Mosquera, C.; Nápoles, G. A review on the long short-term memory model. Artif. Intell. Rev. 2020, 53, 5929–5955.
  37. Wu, B.; Song, Y.; Wang, W.; Xu, W.; Li, J.; Sun, F.; Zhang, C.; Yang, S.; Ning, J.; Xi, Y. Hysteresis in flag leaf temperature based on meteorological factors during the reproductive growth stage of wheat and the design of a predictive model. Comput. Electron. Agric. 2025, 232, 110113.
Figure 1. Research methodology.
Figure 2. Greenhouse layout and crop cultivation.
Figure 3. Thermal image processing and temperature extraction workflow. (a) Original visible-light image of a potted winter jujube canopy. (b) Delineated region of interest (ROI) encompassing sun-exposed leaves. (c) Spatial distribution of 100 sampled temperature points within the ROI after RPCA denoising (applied to the matrix L).
Figure 4. Canopy temperature data.
Figure 5. Handling outliers in input parameters.
Figure 6. Heat map of the correlation between canopy temperature and environmental factors. Note: * p < 0.05, ** p < 0.01; negative values indicate negative correlations.
Figure 7. Flow chart of LSTM–RF prediction model.
Figure 8. Cross-correlation between environmental variables and canopy temperature across different time lags.
Figure 9. Structure of multi-layer LSTM.
Figure 10. The variation curve of the predicted value of canopy temperature by the model in this study.
Figure 11. Comparison of predicted versus observed canopy temperatures among the evaluated models.
Figure 12. SHAP heat map of the contribution of different features to canopy temperature prediction.
Figure 13. Importance ranking of key features. (a) Bar chart of key feature importance. (b) The scatter plot of key feature importance.
Figure 14. Comparison of model prediction performance after ablation experiments.
Figure 15. Absolute prediction residuals for canopy temperature from the standalone RF and LSTM–RF hybrid model.
Table 1. Environmental factor sensor parameters.
| Parameter | Measurement Range | Accuracy |
|---|---|---|
| Temperature (°C) | −40~120 | ±0.1 |
| Humidity (%) | 0~100% RH | ±1% RH |
| Soil Temperature (°C) | −40~80 | ±0.5 °C |
| Soil Humidity (%) | 0~100% RH | ±2% RH |
| Wind Speed (m·s⁻¹) | 0~30 | ±0.1 |
| Solar Radiation (W·m⁻²) | 0~1800 | ±3 |
Table 2. Correlation degree corresponding to the absolute values of correlation coefficient.
| Absolute Value of Correlation Coefficient | Degree of Correlation |
|---|---|
| (0.8, 1] | Highly strong relevance |
| (0.6, 0.8] | Strong relevance |
| (0.4, 0.6] | Moderate relevance |
| (0.2, 0.4] | Weak relevance |
| [0, 0.2] | Extremely weak relevance |
Table 3. Grid search optimization parameter range.
| Parameter | Description | Range of Values |
|---|---|---|
| n_estimators | Number of decision trees in the Random Forest | [100, 300] |
| max_depth | Maximum depth of each decision tree | [15, None] |
| min_samples_split | Minimum number of samples required to split an internal node | [2] |
| min_samples_leaf | Minimum number of samples required at a leaf node | [1] |
| max_features | Number of features to consider when searching for the best split | [‘sqrt’] |
| bootstrap | Whether to use bootstrap sampling when building each tree | [True] |
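As a rough illustration of the search space in Table 3, the sketch below enumerates the parameter combinations, assuming each bracketed entry is a list of discrete candidate values rather than a continuous interval. The parameter names coincide with those of scikit-learn's `RandomForestRegressor`, although the source does not name the library; the helper `expand_grid` is hypothetical.

```python
from itertools import product

# Candidate values as listed in Table 3 (read here as discrete options).
PARAM_GRID = {
    "n_estimators": [100, 300],
    "max_depth": [15, None],
    "min_samples_split": [2],
    "min_samples_leaf": [1],
    "max_features": ["sqrt"],
    "bootstrap": [True],
}

def expand_grid(grid):
    """Yield one parameter dict per combination of candidate values."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

candidates = list(expand_grid(PARAM_GRID))
print(len(candidates))   # 4
```

Under this reading only `n_estimators` and `max_depth` vary, so the grid search compares four configurations.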
Table 4. Comparison of prediction results of different models.
| Model | R2 | MAE (°C) | RMSE (°C) |
|---|---|---|---|
| Transformer | 0.916 | 1.396 | 1.806 |
| TimesNet | 0.849 | 1.573 | 2.135 |
| RF | 0.956 | 0.956 | 1.505 |
| LSTM | 0.941 | 0.985 | 1.747 |
| LSTM–RF | 0.974 | 0.844 | 1.155 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Ma, S.; Zhang, Y.; Kou, L.; Huang, S.; Fu, Y.; Zhang, F.; Sun, X. Prediction Method of Canopy Temperature for Potted Winter Jujube in Controlled Environments Based on a Fusion Model of LSTM–RF. Horticulturae 2026, 12, 84. https://doi.org/10.3390/horticulturae12010084

Chicago/Turabian Style

Ma, Shufan, Yingtao Zhang, Longlong Kou, Sheng Huang, Ying Fu, Fengmin Zhang, and Xianpeng Sun. 2026. "Prediction Method of Canopy Temperature for Potted Winter Jujube in Controlled Environments Based on a Fusion Model of LSTM–RF" Horticulturae 12, no. 1: 84. https://doi.org/10.3390/horticulturae12010084
