1. Introduction
Chinese solar greenhouses (CSGs) are regarded as a unique and cost-effective greenhouse type in China, featuring excellent thermal insulation and high energy efficiency, which enables winter vegetable production without auxiliary heating, even when the monthly average temperature during the coldest three months drops below −10 °C [
1,
2]. However, this superior thermal performance presents a contrasting challenge during warmer periods. During summer months, the internal temperatures of CSGs frequently exceed 50 °C due to intense solar radiation, leading to stunted growth or even crop mortality [
3]. High temperatures and humidity in greenhouses could severely limit production by reducing yields and compromising produce quality during extended periods. However, correct climate control using various cooling methods enables sustained crop cultivation in warm seasons, particularly crucial for heat-sensitive vegetables such as tomatoes, cucumbers, peppers, and lettuce, which exhibit growth declines when temperatures exceed their optimal 29–30 °C range, making cooling system implementation essential for successful summer cultivation [
4,
5]. Various greenhouse cooling methods have been developed, with shading and natural ventilation being the most common methods of temperature control, although their cooling effectiveness remains limited. Among existing greenhouse cooling technologies, evaporative cooling systems have proven to be the most effective solution [
6]. The pad–fan configuration represents the most widely implemented form of this technology. Pad–Fan Cooling (PFC) systems have emerged as an effective solution for summer overheating in greenhouses due to their efficient and stable cooling performance. When properly designed, these systems could reduce air temperature by 5–12 °C and create more favorable conditions for crop growth. PFC systems are widely used in greenhouse temperature control and have high cooling efficiency, but they have several obvious limitations: non-uniform temperature distribution within the greenhouse, and substantial installation and maintenance costs [
7]. These limitations not only impact the overall performance of the cooling systems, but also pose significant challenges for growers in terms of cost management and crop uniformity. Thus, the spatio−temporal distribution of temperature, humidity, and airflow under Pad–Fan Cooling operation is crucial for optimizing greenhouse cooling strategies. Traditional approaches to optimizing PFC systems relies heavily on computational fluid dynamics (CFD) simulations, which, although effective in modeling airflow and thermal distribution, suffer from high computational costs, complex boundary condition setups, and limited real-time applicability. Moreover, existing research mainly focuses on environmental simulations of empty greenhouses, leaving significant gaps in our understanding of crop−environment interaction mechanisms. Particularly for low-profile crops like strawberries, the mechanism by which canopy structures influence greenhouse microenvironments remains unclear, which hinders the optimal design of cooling systems. Meanwhile, existing studies analyzing greenhouse microclimates have shown insufficient attention to the role of soil. As a crucial component of greenhouse ecosystems, soil directly influences indoor temperature distribution through various heat exchange mechanisms such as thermal conduction and radiation within the air [
8,
9]. Additionally, its moisture evaporation process alters air humidity dynamics [
10]. Specifically, soil surface temperature rises by absorbing solar radiation during the day and releasing stored heat at night, buffering extreme temperature fluctuations in the greenhouse [
11]. However, the soil-mediated microclimate regulation and crop canopy structure further interact with each other to form a complex “soil−crop−microclimate” dynamic relationship. The mechanism of this process has not been fully revealed, which further hinders the accurate design of cooling systems. To address this gap, microclimate prediction has become a crucial component that bridges the interactions between soil, crops, and greenhouse micro-environments with practical greenhouse regulation. In solar greenhouses equipped with PFC systems, “microclimate prediction” refers to the quantitative forecasting of key environmental parameters such as temperature, humidity, and wind speed across spatial and temporal dimensions. In other words, its primary objective is to clarify how the micro-environment in greenhouses (including temperature, humidity, and wind speed) dynamically changes across different locations and time periods under various influencing factors such as PFC system operation, solar radiation, soil heat and moisture exchange, and crop–canopy interactions. This research provides theoretical foundations and data support for the precise regulation of cooling systems and scientific management of crops.
In recent years, the development of machine learning (ML) had provided new possibilities for predicting and optimizing greenhouse climates. Compared to traditional CFD methods, ML models could process real-time sensor data, capture complex relationships between environmental variables, and calculate higher computational efficiency. A multiple linear regression model was used to predict the evapotranspiration and transpiration of greenhouse tomatoes [
12,
13,
14]. Photosynthetically active radiation, total radiation, day–night temperature and humidity, and leaf fresh weight were regarded as input variables to achieve precise irrigation and efficient utilization of water resources. Neural networks and multiple-regression models were introduced to predict the roof temperature and air humidity inside a half-daylight-temperature room [
15,
16,
17]. The results show that neural networks have significantly improved prediction accuracy than multiple-regression models, demonstrating their potential for the intelligent prediction of greenhouse environments. A greenhouse environment model was constructed by using an online sparse least squares support vector machine regression model [
18,
19,
20]. The indoor and outdoor temperatures were regarded as input variables, achieved low mean square error, and had a concise model structure. Compared to methods such as recursive weighted least squares estimation and Elman neural networks, this model exhibited higher prediction accuracy. This work provides a nonlinear modeling method suitable for online learning, which has practical value for the intelligent control of greenhouse environments. The greenhouse tomato transpiration prediction model was established based on the random forest algorithm [
21,
22]. The input variables included air temperature, relative humidity, light intensity, and relative leaf area index. The determination coefficients of the model in the seedling stage and flowering stage reached 0.9472 and 0.9654, respectively, and the prediction error was much lower than that of the BP neural network and GA–BP neural network. This indicated that the model has practical significance in precision irrigation management. A tomato yield prediction model was developed by combining recurrent neural networks and Time Convolutional Networks [
23,
24,
25]. The model was trained by using greenhouse environmental parameters and historical yield data, with root mean square errors as low as 6.76–10.45 g/m
2 on multiple test sets. The study also found that historical production information had the most significant impact on the prediction results. This method helps growers to develop management strategies scientifically. The greenhouse climate prediction model was proposed by using long short-term memory networks [
26,
27,
28]. This model took into account six factors, including temperature, humidity, light, carbon dioxide concentration, air temperature, and soil moisture, and updated the data every 5 min. In practical applications in tomato, cucumber, and chili greenhouses, the model performed stably and maintains good predictive ability, even when encountering abnormal data, which is of great significance when there is a need to adjust the greenhouse environment in advance and ensure crop growth. A gated cyclic unit model was developed to predict the minimum temperature [
29,
30]. The model was trained by using monthly average minimum temperature data from over 30 years in the local area. The model had minimal errors during the testing phase, and was able to make reasonable predictions of the minimum temperature for the next year or more. This prediction method provides strong support for greenhouse temperature regulation. The ARMA model was used to predict the weekly harvest of sweet peppers [
31,
32,
33]. Production data from previous weeks were incorporated into the model, which resulted in improved predictive performance over traditional reference models. For some varieties, the addition of environmental variables reduced the prediction error by 20%, which indicated that the method had certain practical value in greenhouse production arrangements. The trend of carbon dioxide emissions in greenhouses was analyzed by using ARIMA models, which shows that the model performs well in short-term forecasting and is helpful in evaluating changes in greenhouse effects [
34,
35]. Research is increasingly focusing on the application of machine learning in greenhouse environment prediction, but most studies only verify the effectiveness of a certain type of model separately without conducting a systematic comparative analysis. The summary of the literature is presented in
Table 1.
In the current study, the experimental microclimate data in CSG with PFC systems of Shenyang Agricultural University were utilized by two machine learning algorithms and three deep learning algorithms for their ability to predict a PFC system’s cooling effects, including humidity, temperature, and wind speed, which were ARMA, ARIMA, LSTM, TCN, and GRU, respectively; it was found that deep learning models such as TCN, GRU, and LSTM perform better in greenhouse climate prediction than traditional ARMA and ARIMA models. TCN, with its multi-scale receptive field of dilated convolution, has greater potential for capturing high-frequency fluctuations and extreme nonlinear relationships in temperature and humidity, and could better adapt to the spatial attenuation of airflow. LSTM balanced long- and short-term dependencies by gating mechanisms and exhibiting robustness. GRU simplified the gating structure of LSTM, ensuring accuracy while improving efficiency. Introducing multiple machine learning methods for joint prediction in the future is expected to further improve the accuracy and practicality of predictions.
Core Contributions of This Study: This study compares the two traditional algorithms (ARMA and ARIMA) with three deep learning algorithms (LSTM, TCN, and GRU) to address the lack of systematic comparisons of models in existing studies. It was found that the deep learning model equipped with PFC significantly outperformed the traditional model in predicting temperature, humidity, and wind speed in the CSG system.
Compared to the traditional model (which finds it difficult to deal with nonlinear fluctuations) and other deep learning models (TCN has limitations, and LSTM has high cost), GRU achieves an optimal balance between accuracy and efficiency (convergence is 30% faster than LSTM).
2. Methods and Materials
Figure 1 shows the research steps for predicting the microclimates of CSG with PFC systems. This framework consists of multi-sensor monitoring and machine learning forecasting. There are five main parts: (1) Preparing for microclimate monitoring. (2) Collecting and preprocessing data. (3) Building machine learning models. (4) Training, evaluating, and comparing multiple models. (5) Finding the best prediction model. Firstly, in the preparatory stage of the research, several measuring points were configured in the CSG. These points were marked for temperature (T1–T7), humidity (T1, T2, T3, T6, T7), and wind speed (T2, T6, T7). All instruments were calibrated and prepared, such as the ADCON temperature sensor, Testo45 anemograph, and multi-function temperature–humidity meter. The next part was data acquisition and preprocessing. With the installed sensors, microclimate data could be obtained, including temperature, humidity, and wind speed, which were subsequently preprocessed. Missing values were address by employing a min–max scaler for normalization and performing correlation testing to select the features. In the model construction stage, traditional statistical models (ARMA, ARIMA) and deep learning models (LSTM, TCN, GRU) were introduced. These models underwent processes such as data preprocessing adaptation, orthogonality verification, and feature engineering implementation, which is beneficial for fitting the CSG microclimate data. For the purposes of model training, evaluation, and comparison, the preprocessed data was utilized in conjunction with performance metrics (e.g., loss functions) and visualization techniques (e.g., loss curves and comparisons of sequence predictions). ARMA, ARIMA, LSTM, TCN, and GRU models were compared using R2 and root mean square error (RMSE). This checked how well they predict greenhouse microclimate variables. Finally, for optimal model identification, we consider both prediction accuracy and efficiency. From the compared models, we picked the best one (GRU, which is 30% faster and still accurate). This provided a reliable tool for predicting and managing greenhouse microclimates.
2.1. Experimental Greenhouse
The CSG is located at Shenyang Agricultural University in Shenyang, Liaoning Province (41°49′ N, 123°34′ E), specifically within Greenhouse No. 46, as shown in
Figure 2 and
Figure 3. This single-span greenhouse is oriented in a north–south direction, with a slight deviation of 7° west of due south, designed to maximize solar radiation capture during winter. The CSG has a total length of 60 m, a span of 10 m, a rear-wall height of 3.2 m, and a ridge height of 5 m, covering a total ground area of 480 m
2. The greenhouse structure consists of four primary components: the front wall, side walls, rear slope, and foundation soil. The rear wall is consist of an inner brick wall, an expanded polystyrene (EPS) insulation layer, and an outer brick wall (240 mm thick solid clay bricks with 20 mm plaster on both surfaces). The rear slope is composed of wooden boards, an EPS insulation layer, cement, and a waterproof layer. The structure adopts an arched truss design that eliminates the need for supporting pillars, with its roof, gable walls, and front roof structure all covered with 0.1 mm thick polyethylene plastic film. A Pad–Fan Cooling system was installed for indoor temperature regulation, which consists of a galvanized aluminum alloy wet pad (dimensions: 3 m in length × 0.15 m in width × 1.5 m in height, with a total area of 4.5 m
2), two negative-pressure fans (each with a diameter of 1.06 m, width of 0.4 m, power of 0.55 kW, blade speed of 650 rpm, and airflow rate of 38,000 m
3/h), a horizontal centrifugal pump, a rectangular water tank (dimensions: 1 m in length × 0.5 m in width × 0.45 m in height, with a volume of 0.225 m
3 or 225 L), and an control unit. Strawberries are grown in the greenhouse at a planting density of 8500 plants per hectare, arranged in rows oriented in the north–south direction. A thermal insulation curtain is operated daily (opened at 08:30 and closed at 16:30) to enhance temperature regulation between day and night.
2.2. Experimental Methods
2.2.1. Experimental Apparatus
Indoor greenhouse temperatures at various positions were measured using ADCON instruments manufactured in the United States. These instruments have a measurement range from −40 °C to 120 °C and provide a high accuracy of ±0.2 °C. A multifunctional temperature and humidity meter was employed to measure indoor humidity within the CSG, offering a temperature measurement range from −30 °C to 50 °C (accuracy: ±1 °C) and a humidity measurement range of 0–100% RH. Wind speed at different indoor locations was assessed using a Testo 425 handheld thermal anemometer, which has a measurement range from 0 to 40 m/s, an accuracy of ±(0.2 m/s + 1.5% of the reading), and a resolution of 0.1 m/s. Outdoor temperature and humidity were monitored using the TRM−ZSF GPRS wireless remote control system developed by Jinzhou Sunshine Technology Co., Ltd. (based in Jinzhou City, China), capable of measuring humidity within the range of 0–90% RH and temperature from −25 °C to 70 °C. The external climate remained stable throughout the experimental period. Each measurement point was continuously monitored for 24 h, with data recorded at 30 min intervals, and all measurements were repeated three times to ensure reliability.
2.2.2. Experimental Preparation
Seven measurement points were arranged inside the experimental CSG, as shown in
Figure 4 (side view) and
Figure 5 (top view). Specifically, T1, T2, and T3 were located 15 m horizontally from the wet pad, with vertical heights of 0.5 m, 1.5 m, and 2.0 m, respectively. T4 and T5 are both positioned at a vertical height of 1.5 m, with T4 located 2 m horizontally from the rear wall and T5 situated 2 m horizontally from the film. Additionally, T6, T2, and T7 are placed 3 m, 6 m, and 9 m away from the wet pad along the horizontal axis, all at a vertical height of 1.5 m. This sampling strategy aimed to capture the most representative and dynamically significant microclimate changes in the solar greenhouse under the influence of the PFC system. Initially, a total of 10 measurement points were tested, and, through comparative analysis of preliminary data, 7 points were finally selected, following two core principles: prioritizing positions with the largest temperature fluctuations (to highlight the impact of the PFC system, as the microclimate changes at T1–T7 are the most significant) and ensuring that diverse spatial coverage (T1/T2/T3 reflects the vertical stratification characteristics from the top of the crop canopy to the ground near the wet pad. T6/T7 track the horizontal airflow attenuation along the wet-pad–fan path, such as temperature and humidity attenuation; T4/T5 represent the peripheral “side zone” environment, used to supplement central airflow data and reflect spatial non-uniformity, although their role is secondary to the high-fluctuation central measurement points). Measurements during the experiment were conducted as follows: Air temperature results were collected from seven locations (T1–T7) every 30 min over four days: August 23rd–26th. Specifically, indoor and outdoor air temperatures were recorded every 30 min over a continuous 24 h period. Air humidity at measurement points T1, T2, T3, T6, and T7 was monitored every 30 min throughout the 7 h operation of the wet-pad–fan system, from its startup to shutdown. Similarly, airflow velocity at T2, T6, and T7 was recorded at 30 min intervals during the same 7 h operational period. All experiments were conducted on clear days and nights, with the greenhouse empty of crops, to minimize the influence of human activity, external environmental factors, and plant growth on indoor temperature and airflow. The reasons for not measuring the areas above and below T4 are as follows: First, in terms of spatial relevance, T4 is close to the rear wall of the greenhouse and is not located in the main influence zone of the PFC system (the central airflow path), so the surrounding areas of T4 have minimal correlation, with the airflow effects induced by the fans. Second, from the perspective of redundancy, this study focuses on capturing the core impacts of the PFC system (which are most significant in the wet-pad–fan axis area). Including the areas above and below T4 would add redundant data and fail to deepen our understanding of key effects such as cooling efficiency and airflow attenuation. Therefore, priority was given to retaining the T1–T3/T6/T7 measurement points, which are crucial for PFC analysis, while T4/T5 serve as supplementary side-zone references. The selection of the seven measurement points was validated through preliminary data screening (after testing 10 initial points, T1–T7 were chosen due to their largest temperature fluctuations and unique spatial functions) and structural coverage (encompassing vertical stratification, horizontal airflow attenuation, and side-zone references). This design conforms to the norms of greenhouse microclimate research and is reproducible, with clear spatial coordinates and standards.
2.3. Machine Learning Algorithms, Deep Learning Models, and Training Process
Five prediction models were employed in the current paper to analyze greenhouse environmental data. Among these, TCN, GRU, and LSTM are deep learning models, whereas ARMA and ARIMA represent traditional statistical approaches. These models are primarily utilized for time series forecasting, allowing for the prediction of temporal changes in environmental variables such as temperature and humidity. To determine the most effective model, two evaluation metrics were adopted: root mean square error (RMSE) and coefficient of determination (R
2). A lower RMSE indicates predictions that are closer to the actual values, while a higher R
2 suggests that the model accounts for a greater proportion of variability in the data [
26,
27,
28,
29,
34,
35]. In addition to these metrics, the performance of each model on both the training and testing datasets was examined. By comparing the predicted values with the observed measurements, the most reliable model could be identified. The objective was to select the model that not only fits the training data well, but also maintains accuracy when applied to new, unseen data.
2.3.1. ARMA Method
ARMA is a statistical model used for analyzing and forecasting stationary time series. It combines Autoregressive (AR) and Moving Average (MA) components to capture temporal patterns. The mathematical representation of the ARMA(
p, q) model is as follows:
where the AR(
p) part uses
past values
to model long-term dependencies, while the MA(
q) part incorporates
past error terms
to account for short-term fluctuations. Orders
and
are typically determined by via autocorrelation (ACF) and partial autocorrelation (PACF) functions. ARMA demonstrates strengths in simplicity, interpretability, and computational efficiency when working with small datasets, which has led to its widespread adoption in short-term forecasting across fields such as finance (e.g., stock prices), economics (e.g., inflation), and business (e.g., demand) [
20]. However, the model requires stationary input data and performs poorly in capturing nonlinear patterns or complex trends, thereby restricting its applicability to relatively simple time series scenarios [
20,
21,
22,
23].
2.3.2. ARIMA Method
ARIMA is a statistical model designed for forecasting non-stationary time series. It extends the ARMA model by incorporating an “integration” component to address non-stationarity. The model is denoted as ARIMA, where represents the order of the autoregressive component, which uses past values to capture long-term dependencies. indicates the degree of differencing applied to transform non-stationary data into stationary form, thereby addressing trends or seasonality.
denotes the order of the moving average component, which incorporates past error terms to model short-term fluctuations. ARIMA is widely used in short-term forecasting tasks such as energy consumption prediction and meteorological data extrapolation [
24,
35]. Its advantages include adaptability to data with mild trends, a straightforward mathematical formulation that supports theoretical analysis, and relatively low computational requirements, making it suitable for real-time forecasting [
25,
26,
35].
2.3.3. LSTM Method
LSTM is a variant of recurrent neural networks (RNNs) specifically designed to overcome the vanishing gradient problem commonly encountered in traditional RNN architectures. By employing a set of gated units—including input, forget, and output gates—LSTM regulates the flow of information through its memory cells, thereby enabling the effective capture and retention of long-term temporal dependencies in sequential data. This architecture demonstrates superior performance in modeling time series data with complex long-range dependencies [
23,
24,
25,
26,
27]. Nevertheless, compared to simpler models, LSTM entails higher computational costs and may be prone to overfitting when trained on limited datasets [
23,
26].
2.3.4. TCN Method
A Temporal Convolutional Network (TCN) is a deep learning architecture specifically designed for modeling sequential and time series data. Built upon the foundation of Convolutional Neural Networks (CNNs), TCNs utilize dilated causal convolutions to effectively capture long-range temporal dependencies while preserving computational efficiency through parallel processing. In contrast to recurrent architectures such as long short-term memory (LSTM) networks, TCNs process input sequences in parallel using a fully convolutional framework, thereby avoiding issues like vanishing gradients and enabling more efficient training on long sequences [
20,
23]. The architecture introduces several key innovations: dilated convolutions that exponentially expand the receptive field, causal convolutions that preserve temporal order, and residual connections that facilitate stable training of deep networks. These characteristics allow for TCNs to effectively model complex temporal dynamics, accommodate variable-length sequences, and perform well in applications requiring precise temporal understanding [
23].
2.3.5. GRU Method
GRU is a simplified variant of RNN, derived from the LSTM architecture. It simplifies the gating mechanism by merging the input and forget gates into a single update gate, while preserving the reset gate [
29,
30,
31,
32,
33,
34]. This structural simplification reduces both the number of parameters and computational complexity compared to LSTM. The GRU effectively captures long-term dependencies in sequential data by regulating the flow of information through its two gating mechanisms: the reset gate controls how much past information to discard, while the update gate determines how much new information to retain [
31,
34]. This dynamic enables the model to efficiently manage temporal patterns and facilitates faster training due to its streamlined architecture [
30,
32,
33,
34].
2.4. Hyperparameter Tuning with K-Fold Cross-Validation
K-fold cross-validation is a robust data partitioning technique used to impartially evaluate the predictive performance of forecasting methods [
19,
20,
21,
22,
23,
24,
25,
26,
27,
28,
29,
30,
31,
32,
33,
34,
35]. In this approach, the dataset is divided into
k mutually exclusive subsets. During each iteration, one subset is used as the validation set, while the remaining k−1 subsets are used for training. The process is repeated k times, with each subset serving exactly once as the validation set. Finally, the performance metrics obtained from the k iterations are averaged to produce a single, comprehensive evaluation score. In this study, a 5-fold cross-validation setup was employed. Hyperparameter tuning was conducted automatically using grid search.
In this study, hyperparameter tuning was conducted automatically using grid search combined with a 5-fold cross-validation setup to ensure the reliability and generalizability of optimal parameters. Detailed tuning processes for each algorithm are as follows:
ARMA: Tuned hyperparameters include autoregressive order and moving average order , with value ranges and (determined via ACF/PACF analysis). The optimal combination was selected by maximizing the validation set R2.
ARIMA: Tuned hyperparameters include autoregressive order , differencing order , and moving average order , with value ranges and , (justified by unit root tests for non-stationarity), and . The optimal combination was chosen by minimizing the validation set RMSE.
LSTM: Tuned hyperparameters focus on the number of neurons per layer, with the following ranges: 1st layer [128, 256, 512], 2nd layer [64, 128, 256], and 3rd layer [32, 64, 128]. The optimal configuration (256, 128, 64) was determined via a grid search combined with an early-stopping strategy (patience = 20) monitoring validation loss. The learning rate uses the default value of the Adam optimizer (0.001).
TCN: Tuned hyperparameters include kernel size [3, 5, 7], dilation rate sequences ([(1, 2, 4, 8), (1, 2, 4, 8, 16)]), and dropout rate ([0.2, 0.3]). The optimal combination (kernel size = 3, dilation rates = (1, 2, 4, 8), dropout rate = 0.2) was selected by prioritizing validation set R2 and training efficiency. The learning rate uses the default value of the Adam optimizer (0.001).
GRU: Tuned hyperparameters focus on the number of neurons per layer, with the following ranges: 1st layer [128, 256, 512], 2nd layer [64, 128, 256], and 3rd layer [32, 64, 128]. The optimal configuration (256, 128, 64) was determined via grid search combined with an early-stopping strategy (patience = 20) monitoring validation loss. The learning rate uses the default value of the Adam optimizer (0.001).
2.5. Data Processing and Partitioning
The raw data, including temperature, humidity, and wind speed collected from the CSG equipped with a PFC system, were first preprocessed to ensure quality and consistency. Missing values were removed to avoid bias in model training, and all features and target variables were normalized to the range [0, 1] using ‘MinMaxScaler’ to unify data scales, facilitating stable model convergence. For time series construction, a sliding window approach with a time step of 40 was adopted to convert the raw data into input–output pairs. Specifically, each input sequence (X) consisted of 40 consecutive time steps of multi-feature data, and the corresponding output (y) was the target variable at the next time step, preserving the temporal dependencies inherent in the greenhouse microclimate data. The dataset was partitioned into training and testing sets using an 8:2 ratio, following temporal continuity. The first 80% of the samples were assigned to the training set and the remaining 20% to the testing set, with no random shuffling to maintain the integrity of the time series structure. Detailed sample sizes for each subset are as follows:
- –
Temperature data (720 measurements per point): 576 training samples and 144 testing samples.
- –
Humidity data (300 measurements): 240 training samples and 60 testing samples.
- –
Wind speed data (180 measurements): 144 training samples and 36 testing samples.
For model validation, an early-stopping strategy was implemented to prevent overfitting. During training, the testing set was temporarily used as the validation set to monitor the validation loss (‘val_loss’). Training was terminated if no improvement in validation loss was observed for 20 consecutive epochs, and the model weights corresponding to the minimum validation loss were restored to ensure optimal generalization performance. All data processing and partitioning steps were executed in the Jupyter Notebook 6.5.4. environment.
2.6. Model Construction and Training
2.6.1. Machine Learning Models
The implementation of the ARMA model involves loading temperature data from Excel files, screening out feature columns and target columns, removing missing values to ensure data integrity, and normalizing data to [0, 1] range using MinMaxScaler for consistent scaling. The model is constructed using the create_arma_dataset function using a univariate approach, utilizing only historical data from the target sequence as input (without integrating multiple features since ARMA is a univariate model). A time step length of 40 is set to extract target sequence data, generating sequences for training and prediction. The target sequence is divided into training and test sets in an 8:2 ratio while maintaining temporal continuity to meet time series prediction requirements. The model order () is determined using the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots (e.g., ). The ARIMA model in the statsmodels library is employed (with d = 0 to simulate ARMA) using training sequences as input. The model fits the training data, generates model summaries to verify parameter estimation and significance, then performs in-sample predictions on the training set and out-of-sample predictions on the test set (with prediction length matching the test set). Finally, the inverse scaling function is applied to normalize both the predicted and actual values, restoring the original temperature scale. Model performance is quantified using root mean square error (RMSE), coefficient of determination (R2), mean square error (MSE), and mean absolute error (MAE). Visualization methods include loss curves, prediction comparison plots, R2 scatter plots, and full-sequence overlay diagrams. While ARIMA and ARMA share core processes—including data loading, preprocessing, univariate series construction, 8:2 time series partitioning, prediction and normalization, and evaluation metrics—the key distinction lies in ARIMA’s introduction of the difference parameter (set to 1 in this study via order = ()). This enables ARIMA to transform non-stationary data into stationary sequences, addressing trend or seasonal issues, whereas ARMA requires stationary data () and relies solely on autoregressive (AR) terms () and moving average (MA) terms () to capture temporal patterns, failing to overcome these limitations.
2.6.2. Deep Learning Models
This study selects three common deep learning models—GRU, LSTM, and TCN—for time series prediction in greenhouse environments. In data processing, all three models follow the same workflow: reading temperature data from Excel files, filtering out feature columns and target columns, removing missing values, and then normalizing the data to [0, 1] range using MinMaxScaler. Training samples are constructed using 40 time steps of sliding windows, each containing multi-dimensional environmental feature data from consecutive time steps and the predicted target value for the next moment. After partitioning the time series at an 8:2 ratio, the data dimensions are uniformly adjusted to [samples, time steps, features] to meet model input requirements. In model construction, GRU adopts a three-layer structure (neural units: 256,128,64), with the first two layers retaining sequence information (return_sequences set to True) and outputting single-value predictions through a fully connected layer. Its distinctive characteristic is that it combines the forget gate and input gate into an update gate mechanism, which leads to a reduced parameter scale. Similarly, LSTM also has a three-layer structure (neural units: 256,128,64), with the first two layers similarly set to return_sequences True. It handles sequence information through independent forget gate, input gate, and output gate mechanisms, making it better suited for capturing complex temporal dependencies, although its parameter scale exceeds that of GRU. TCN constructs multi-scale receptive fields through causal convolution (padding = ‘causal’) with dilation rates (1, 2, 4, 8), employing a custom TCNLayer stack architecture that demonstrates superior performance in high-frequency feature extraction. During training, all three models utilize MSE loss with Adam optimization, set to a batch size of 32 and a maximum iteration count of 1500. The early stopping strategy for GRU and LSTM allows for the tolerance of up to 20 epochs of loss stagnation, while TCN maintains this threshold at 10 epochs. Predictions are normalized through inverse scaling and are evaluated using RMSE, R2 metrics, and visualizations. Practical applications reveal GRU’s distinct advantages in greenhouse environmental time series prediction: its compact parameter architecture reduces training time while maintaining comparable accuracy to LSTM, enhancing computational efficiency. Compared to TCN’s complex network structure, GRU’s simpler design requires fewer computing resources and is more deployable, making it better suited for precision control scenarios in greenhouse environments where practicality and efficiency are paramount.
5. Conclusions
The variation process of the microclimate environment within solar greenhouses under the influence of PFC systems was systematic investigated using experimental measurements. According to the spatial distribution perspective, the microclimate in greenhouses with PFC systems exhibit significant non-uniformity. The temperature and humidity gradients along the wet-pad–fan direction (T6 → T2 → T7) and vertically (T1 → T2 → T3) differ significantly, and the wind speed also decreases significantly with distance (from 2.31 m/s in T6 to 0.67 m/s in T7). This spatial feature indicates that the influence of airflow distribution should be fully considered when optimizing the model. ARMA, ARIMA, LSTM, TCN, and GRU were used in predicting microclimate parameters (temperature, humidity, and wind speed). The results indicate that deep learning models significantly outperform traditional linear models in capturing the complex nonlinear relationships and spatiotemporal dynamics within greenhouse environments. Furthermore, notable differences in applicability and predictive accuracy were observed among the various models. Specifically, the traditional time series models ARMA and ARIMA demonstrated relatively limited performance in forecasting greenhouse microclimatic conditions.
The ARMA model is based on linear and stationarity assumptions, which are reasonable under short-term stable conditions; however, it is incapable of handling nonlinear fluctuations caused by factors such as wet pad start–stop cycles or solar radiation. The R2 values for temperature, humidity, and wind speed in the test set have dropped significantly, even becoming negative, reflecting the model’s limitations. Although ARIMA addresses non-stationarity through differencing, its linear structure still hampers its ability to capture the highly nonlinear interactions between evaporation and wind speed.
In contrast, deep learning models exhibit excellent performance in handling complex dynamics. LSTM captures long-term dependencies through gating mechanisms, and it outperforms traditional models in tracking sudden temperature changes and spatial airflow attenuation, with R2 values in the test set being 0.941 for temperature, 0.855 for humidity, and 0.879 for wind speed, respectively. TCN shows outstanding performance in identifying high-frequency fluctuations and extreme nonlinear features, especially in wind speed prediction (with an R2 of 0.847). However, it has limitations in modeling certain temperature dynamics (e.g., the R2 of the T1 test set is −0.645) and humidity evaporation processes (e.g., the R2 of the T7 test set is 0.1). GRU achieves a good balance between accuracy and efficiency. It attains the highest prediction accuracy in temperature (test set R2 = 0.975) and humidity (test set R2 = 0.973) prediction, and is only slightly inferior to TCN in wind speed prediction. Notably, GRU converges approximately 30% faster than LSTM and is more efficient than TCN on large-scale datasets, which makes it highly suitable for greenhouse applications with limited computational resources.
This study clarifies the limitations of traditional models: compared to traditional machine learning models (such as multiple regression [
12] and support vector machine regression [
18]) limited to capturing single parameters or linear relationships, it further quantifies the failure degree (with R
2 being negative) of traditional time series models like ARMA and ARIMA in handling nonlinear microclimate fluctuations (e.g., from wet curtain start–stop and solar radiation) and verifies their shortcomings in complex dynamic scenarios through multi-parameter synchronous prediction. It deepens our scenario-based understanding of deep learning models: building on early neural networks [
16,
20] and recent models like LSTM [
26], GRU [
29], and RNN−TCN [
23], through systematic comparison, it clarifies GRU’s balanced advantages in accuracy (temperature R
2 = 0.975) and efficiency (30% faster than LSTM) in greenhouse multi-parameter prediction, reveals TCN’s limitations in specific scenarios, provides more specific model selection guidance, and fills gaps in existing research regarding deep learning models’ adaptability to complex microclimates. It also establishes a cross-model comparison framework: breaking through previous studies’ “single model for single task” limitation [
17,
19], by using parallel tests of five models, it reveals differences in models’ adaptability to greenhouse microclimates’ spatial non-uniformity, offering comprehensive selection criteria for practical scenarios with limited computing resources or real-time regulation needs, and also addressing early studies’ lack of diverse model application scenarios.
In summary, deep learning models, particularly GRU, offer more reliable methodological support for greenhouse microclimate prediction, thereby facilitating precise regulation of cooling systems and scientifically informed crop management. Future research should aim to expand the number of monitoring points, integrate crop canopy structure information, and explore multi-scale prediction approaches tailored to minute- and hour-level timeframes in order to further enhance the model’s adaptability and practical applicability to complex agricultural environments.