Evaluating Time-Series Prediction of Temperature, Relative Humidity, and CO2 in the Greenhouse with Transformer-Based and RNN-Based Models

Abstract: In greenhouses, plant growth is directly influenced by internal environmental conditions, and therefore requires continuous management and proper environmental control. Inadequate environmental conditions make plants vulnerable to pests and diseases, lower yields, and cause impaired growth and development. Previous studies have explored the combination of greenhouse actuator control history with internal and external environmental data to enhance prediction accuracy, using deep learning-based models such as RNNs and LSTMs. In recent years, transformer-based models and RNN-based models have shown good performance in various domains. However, their applications for time-series forecasting in a greenhouse environment remain unexplored. Therefore, the objective of this study was to evaluate the prediction performance of temperature, relative humidity (RH), and CO2 concentration in a greenhouse after 1 and 3 h, using a transformer-based model (Autoformer), variants of two RNN models (LSTM and SegRNN), and a simple linear model (DLinear). The performance of these four models was compared to assess whether the latest state-of-the-art (SOTA) models, Autoformer and SegRNN, are as effective as DLinear and LSTM in predicting greenhouse environments. The analysis was based on four external climate data samples, three internal data samples, and six actuator data samples. Overall, DLinear and SegRNN consistently outperformed Autoformer and LSTM. Both DLinear and SegRNN performed well in general, but were not as strong in predicting CO2 concentration. SegRNN outperformed DLinear in CO2 predictions, while showing similar performance in temperature and RH prediction. The results of this study do not provide a definitive conclusion that transformer-based models, such as Autoformer, are inferior to linear-based models like DLinear or certain RNN-based models like SegRNN in predicting time series for greenhouse environments.


Introduction
Plant growth in greenhouses is greatly influenced by internal environmental conditions, and therefore requires continuous management and proper environmental control. Inadequate environmental conditions make plants vulnerable to pests and diseases, result in decreased yields, and cause impaired growth and development [1][2][3]. To ensure optimal growing conditions, monitoring is critical to predict changes in the greenhouse environment. However, environmental factors in the greenhouse exhibit complex and nonlinear dynamics, making prediction a particularly challenging task. As a result, simple models and formulas are often insufficient to accurately predict environmental changes [4,5].
In previous studies, researchers have primarily used the internal and external environmental data of the greenhouse to predict the conditions inside, or they have combined the data with the control history of the actuators in the greenhouse. As an example of the use of only greenhouse environmental data to predict indoor conditions, Moon et al. [6] utilized nine environmental parameters that were recorded at 10 min intervals to predict changes in the carbon dioxide (CO2) concentration, using an artificial neural network (ANN) model. These authors reported high accuracy even in the absence of ventilation data, demonstrating the potential of a neural network-based model for predicting CO2 in greenhouses. Subsequently, Moon et al. [7] used an LSTM model to predict the CO2 concentration and achieved promising accuracy despite the rapid and inconsistent fluctuations in CO2 levels in the greenhouse. As an alternative to neural network-based approaches, Cao et al. [8] implemented a tree-based machine learning model combined with time-series features. Their proposed model demonstrated high predictive performance even though the model was simple and fast to train, in contrast with deep learning-based models such as RNNs and LSTMs.
Recent studies have attempted to predict greenhouse conditions by integrating internal and external environmental data, as well as data from actuator control history. Choi et al. [9] used a combination of environmental data and operating values of control devices to predict the temperature and relative humidity in a greenhouse 10-120 min in advance. Jung et al. [10] conducted a comparative analysis of RNN-LSTM-based models and NARX models and obtained satisfactory forecasts for temperature and CO2 concentration. However, the models did not perform as well in predicting relative humidity in the greenhouse, particularly under unusual outdoor weather conditions such as heavy rainfall and storms. Ullah et al. [11] attempted to predict three greenhouse environmental factors: temperature, CO2 concentration, and relative humidity. To improve the prediction accuracy, they proposed an ANN-based model using a refined Kalman filter and reported high accuracy even with noisy sensor readings in the greenhouse. Cai et al. [12] utilized a light gradient boosting machine learning (LightGBM) model. They reported that LightGBM was applicable not only to the prediction of the greenhouse environment but also to real-time predictive control applications. Jung et al. [13] used an LSTM model to predict two crucial factors related to moisture in greenhouses: relative humidity and evapotranspiration. Their study validated the feasibility of applying data-driven deep learning models and demonstrated that the model was able to predict greenhouse conditions in real-world applications, assuming a consistent and extensive collection of sufficient environmental data.
Overall, the above-mentioned studies have shown that incorporating the greenhouse actuator control history with indoor and outdoor environmental data tends to achieve enhanced prediction performance. Various approaches ranging from machine learning models to neural network-based deep learning models have been investigated, and most of them have demonstrated satisfactory performance in the context of the given experiment. However, to the best of our knowledge, there are no reports using transformer-based models for time-series prediction in greenhouse environments, despite their superior performance over conventional deep learning models in many applications across various domains.
In recent years, the transformer has demonstrated exceptional performance in many fields, including natural language processing [14], speech recognition [15], and computer vision [16]. Consequently, many transformer-based models have been proposed, and the performance of these models has been continuously enhanced [17,18]. Research on transformer-based models is also gaining popularity in the field of time-series forecasting [19]. Along with this trend, the comparative advantage of transformer-based models over non-transformer-based models has become a primary focus of investigation [20][21][22]. Nonetheless, transformer-based models have not yet been explored to address time-series forecasting in a greenhouse environment.
In addition, recent research indicated that the SegRNN model, an enhanced version of the conventional RNN model for time-series prediction, has achieved remarkable performance. The conventional RNN often suffers from performance degradation due to excessively long look-back windows and forecast horizons. The SegRNN model counters this by integrating segment-wise iterations and parallel multi-step forecasting, which greatly reduces the number of RNN iterations required. These changes have led to significant improvements in both prediction accuracy and inference speed. However, just like transformer-based models, the SegRNN model has not yet been tested for time-series prediction in greenhouse environments.
Therefore, the objective of this study was to evaluate the predictive performance of temperature, relative humidity (RH), and CO2 concentration in a greenhouse after 1 and 3 h, using a transformer-based model (Autoformer) [23], two RNN variants (LSTM [24] and SegRNN [25]), and a simple linear model (DLinear) [21]. The performance of these four models was compared to assess whether the transformer-based model (Autoformer) and RNN-based model (SegRNN) are as effective as the linear-based model in predicting greenhouse environments. The models were trained and tested using inputs from greenhouse climate conditions and the status of the greenhouse actuators, collected over the previous three days. Prediction performance was evaluated with four metrics: the mean absolute error (MAE), mean squared error (MSE), root-mean-squared error (RMSE), and coefficient of determination (R²).
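The input and target layout described above can be illustrated with a minimal sketch. It assumes hourly records, a 72-step look-back window (three days), and a single target value located 1 or 3 steps after the end of the window; the function name `make_windows` and the toy series are hypothetical, and the study may instead have predicted the whole 3 h sequence.

```python
def make_windows(series, lookback=72, horizon=1):
    """Pair each look-back window with the value `horizon` steps after it."""
    pairs = []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):
        window = series[start:start + lookback]
        # Index of the reading that lies `horizon` steps past the window end.
        target = series[start + lookback + horizon - 1]
        pairs.append((window, target))
    return pairs


# Toy hourly series: 100 readings numbered 0..99 (not greenhouse data).
hourly = list(range(100))
pairs_1h = make_windows(hourly, lookback=72, horizon=1)
pairs_3h = make_windows(hourly, lookback=72, horizon=3)
```

Each element of `series` is a single number here; with the 13 features used in this study, each element could be a tuple of 13 values without changing the slicing logic.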

Data Acquisition and Preprocessing
The dataset for this study was obtained from a Venlo-type tomato greenhouse operated by the Korea Institute of Science and Technology (KIST) in Gangneung, Gangwon-do, Republic of Korea, with coordinates 37.79868° N, 128.85617° E (Figure 1). An internal sensor module (SH-VT260, Soha Tech, Seoul, Republic of Korea) for measuring temperature, humidity, and CO2 inside the greenhouse was installed in height-adjustable positions according to the height of the tomato plants. In addition, an external weather station (Vantage Pro2, Davis Instruments, Hayward, CA, USA) was installed 2 m above the roof of the greenhouse. From 22 September 2020 to 29 June 2021, data were collected on an hourly basis from an internal sensor module as well as from an external weather station, and included the status of various greenhouse actuators. This resulted in a total of 6744 records being included in the dataset.
For model training, a set of 13 sensor and actuator values was used (Table 1): four measurements from an external weather station (temperature, relative humidity, wind direction, and wind speed), three measurements from an internal sensor module (temperature, relative humidity, and CO2 concentration), and the statuses of six actuators (circulating fan, fogging valve, CO2 injection valve, window opening ratio, shading curtain opening ratio, and heat retention curtain opening ratio). The range of each value is shown in Table 2.
Collected data were preprocessed to handle missing values and outliers. In the raw dataset, missing values accounted for 0.38% of the internal environmental variables and 0.69% of the external environmental variables and actuators. Missing values were interpolated from the previous and subsequent observations, under the assumption that there were no substantial changes in the values within an hour. Outliers were identified as values lying more than 1.5 times the interquartile range (IQR) below the 25th percentile or above the 75th percentile of the raw data distribution and were then removed. After preprocessing, the dataset was divided into train, validation, and test sets in a ratio of 7:1:2, respectively.
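The preprocessing steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: gaps are filled by linear interpolation between neighbours, outliers are found with the usual 1.5 × IQR fences, and the 7:1:2 split is taken in chronological order (the text does not say whether the split was chronological). All function names and the toy readings are hypothetical.

```python
import statistics


def interpolate_gaps(values):
    """Fill None entries linearly between the nearest known neighbours.

    Gaps at the very start or end have no neighbour on one side and stay None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences at k * IQR beyond the 25th and 75th percentiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def chronological_split(records, ratios=(7, 1, 2)):
    """Split records in time order into train, validation, and test parts."""
    total = sum(ratios)
    n = len(records)
    train_end = n * ratios[0] // total
    val_end = n * (ratios[0] + ratios[1]) // total
    return records[:train_end], records[train_end:val_end], records[val_end:]


# Toy hourly temperatures with one gap and one implausible spike (not study data).
temps = [20.0, None, 22.0, 21.0, 95.0, 20.5, 21.5, 22.5, 20.0, 21.0]
filled = interpolate_gaps(temps)
low, high = iqr_bounds(filled)
kept = [v for v in filled if low <= v <= high]

# Split sizes for a dataset with the same number of records as in this study.
train, val, test = chronological_split(list(range(6744)))
```

With 6744 records, this integer split yields 4720, 675, and 1349 records; the exact counts used in the study may differ slightly depending on rounding and on how many outliers were removed.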

Transformer-Based Model: Autoformer
Over the past few years, transformers have demonstrated outstanding performance in a wide range of applications, such as natural language processing (NLP) and computer vision (CV) [14][15][16]. In NLP, the GPT and BERT models are representative transformer-based models that have received considerable attention in recent years [17,26]. In CV, transformer-based models have been increasingly applied and shown to outperform conventional convolutional neural networks (CNNs) in classification, detection, and segmentation tasks [27][28][29].
Researchers have recently applied transformers to time-series forecasting. Unlike RNNs, which rely on recurrent mechanisms, or CNNs, which use convolutions, the transformer is based purely on an attention mechanism. This mechanism allows it to capture long-range dependencies more effectively than RNN-based models, and thus provides an advantage in predicting time series [30]. However, the transformer's self-attention makes it difficult to efficiently handle long sequences of input and output, which is a drawback for time-series prediction.
To address these issues, Autoformer was proposed [23]. Designed to improve prediction performance, Autoformer incorporates features specific to time-series data. It separates seasonal and trend components (series decomposition) before feeding the time-series data to the decoder (Figure 2a), allowing in-depth learning of complex temporal patterns during training. In addition, Autoformer uses an auto-correlation mechanism to overcome the inefficiencies of the self-attention in existing transformers (Figure 2b). As a result, Autoformer has demonstrated better prediction performance compared to current state-of-the-art (SOTA) models [23]. In particular, for weather forecasting such as outdoor temperature and humidity, which show a similar pattern to greenhouse time-series data, Autoformer outperformed the previous SOTA model, LSTM, by 21% in terms of MSE.
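The idea behind using auto-correlation on periodic data can be shown with a small sketch. This is not Autoformer's layer, which operates on learned projections of the input; it only computes the plain autocorrelation of a raw series and picks the lag with the highest score, assuming a synthetic signal with a 24-step daily cycle. The function name and the lag search range are hypothetical.

```python
import math


def autocorrelation(series, min_lag, max_lag):
    """Return {lag: autocorrelation score} for lags in [min_lag, max_lag]."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance = sum(c * c for c in centered)
    scores = {}
    for lag in range(min_lag, max_lag + 1):
        # Sum of products between the series and its copy delayed by `lag`.
        shifted = sum(centered[t] * centered[t - lag] for t in range(lag, n))
        scores[lag] = shifted / variance
    return scores


# Synthetic hourly signal with a 24-step cycle (ten days, not greenhouse data).
signal = [math.sin(2 * math.pi * t / 24) for t in range(240)]
scores = autocorrelation(signal, min_lag=12, max_lag=36)
best_lag = max(scores, key=scores.get)
```

Very short lags are excluded from the search because neighbouring hours are always strongly correlated; within the searched range the strongest score falls on the daily period.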

RNN-Based Model: (2) Segment RNN (SegRNN)
Despite their widespread use, RNN-based models have fallen behind transformer-based models in terms of their prediction performance for time-series data. However, recent research has revealed that SegRNN, an improved version of the conventional RNN, achieves remarkable performance in time-series prediction [25]. Conventional RNNs typically face performance issues due to excessively long look-back windows and forecast horizons. SegRNN addresses this by reducing the number of iterations through segment-wise iteration (Figure 4a) and parallel multistep forecasting (Figure 4b), which can enhance the prediction performance for time series.
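The reduction in recurrent iterations can be illustrated with a short sketch. It assumes a 72-step look-back window and a segment length of 24, which is a hypothetical choice for illustration and not a setting reported in this study; only the segmentation step is shown, not the recurrent network itself.

```python
def split_into_segments(window, segment_len):
    """Cut a look-back window into consecutive, non-overlapping segments."""
    if len(window) % segment_len != 0:
        raise ValueError("window length must be a multiple of segment_len")
    return [window[i:i + segment_len] for i in range(0, len(window), segment_len)]


# Toy 72-step window (three days of hourly readings, numbered 0..71).
lookback = list(range(72))
segments = split_into_segments(lookback, segment_len=24)

pointwise_steps = len(lookback)    # one recurrent step per time point
segmentwise_steps = len(segments)  # one recurrent step per segment
```

Under these assumptions a point-wise RNN would iterate 72 times over the window, whereas a segment-wise RNN would iterate 3 times, each step consuming one 24-value segment.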

Linear-Based Model: DLinear
DLinear was proposed to preserve the fundamental properties of time-series data while avoiding the complexity associated with the transformer. Similar to Autoformer, DLinear uses a time-series decomposition approach, and its structure is straightforward: (1) it splits the input time series into trend and remainder components (Figure 5a), and (2) it applies a single-layer linear network (Figure 5b). The formula for this procedure is outlined in Equations (1)-(3).
Internally, DLinear operates as a linear model where the weights corresponding to the seasonality and trend of the time series are multiplied by its decomposition inputs (Equations (2) and (3)). This approach allows an intuitive interpretation by analyzing the weights. In addition, the single-layer linear network reduces computational time, memory usage, and the number of parameters compared to transformer-based models, and thus efficiently performs without the need for hyperparameter tuning [21]. Recent studies have suggested that this approach can predict time series better than transformer-based models [45,46].
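\[ \hat{X} = H_s + H_t \tag{1} \]
\[ H_s = W_s X_s \tag{2} \]
\[ H_t = W_t X_t \tag{3} \]
where $\hat{X}$ is the prediction values, $X_s$ and $X_t$ are the seasonal (remainder) and trend components of the input, $W_s \in \mathbb{R}^{T \times L}$ and $W_t \in \mathbb{R}^{T \times L}$ are two linear layers, $T$ is the future timesteps, and $L$ is the history timesteps, as shown in Figure 3.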
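The decomposition followed by two linear maps can be sketched in a few lines. This is a minimal illustration, not the trained model: the trend is estimated with a centred moving average whose edges are padded by repeating the first and last values, the remainder is the input minus the trend, and the weight matrices `w_seasonal` and `w_trend` are hand-written placeholders rather than learned values. The kernel size of 3 and all names are assumptions chosen for readability.

```python
def moving_average(series, kernel=3):
    """Trend estimate: centred moving average with repeated edge values as padding.

    `kernel` is assumed to be odd so that the output has the input's length.
    """
    half = (kernel - 1) // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    return [sum(padded[i:i + kernel]) / kernel for i in range(len(series))]


def decompose(series, kernel=3):
    """Split a series into a trend and a remainder component."""
    trend = moving_average(series, kernel)
    remainder = [x - t for x, t in zip(series, trend)]
    return trend, remainder


def linear(weights, inputs):
    """Multiply a T x L weight matrix by a length-L input vector."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def dlinear_forecast(series, w_seasonal, w_trend, kernel=3):
    """Sum of the two linear maps applied to the remainder and the trend."""
    trend, remainder = decompose(series, kernel)
    h_s = linear(w_seasonal, remainder)
    h_t = linear(w_trend, trend)
    return [s + t for s, t in zip(h_s, h_t)]


# Toy history of L = 6 steps and a horizon of T = 2 steps.
history = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
trend, remainder = decompose(history)

# Placeholder weights: ignore the remainder and repeat the last trend value.
w_seasonal = [[0.0] * 6 for _ in range(2)]
w_trend = [[0.0, 0.0, 0.0, 0.0, 0.0, 1.0] for _ in range(2)]
forecast = dlinear_forecast(history, w_seasonal, w_trend)
```

Because each output step is a weighted sum of the decomposed inputs, inspecting a row of either weight matrix shows directly which past time steps drive that forecast step.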

Implementation Details and Model Evaluation
The models were trained using the Adam optimizer with an initial learning rate of 5 10 . Model training was performed for 100 epochs with a batch size of 16. The Gaussian Error Linear Unit (GELU), with its default settings, was used as the activation function.
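Model performance was evaluated using four metrics, MAE (mean absolute error), MSE (mean squared error), RMSE (root-mean-squared error), and R² (R-squared coefficient of determination), as shown in Equations (4)-(7):
\[ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{4} \]
\[ \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 \tag{5} \]
\[ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} \tag{6} \]
\[ R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2} \tag{7} \]
where $n$ is the number of values, $y_i$ are the observed values, $\hat{y}_i$ are the predicted values, and $\bar{y}$ is the mean value of the observed outputs. An overview of the experiments conducted in this study is presented in Figure 6.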
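The four metrics follow their standard definitions and can be computed as in the sketch below. The function name `regression_metrics` and the toy observed and predicted values are hypothetical and are not results from this study.

```python
import math


def regression_metrics(observed, predicted):
    """Compute MAE, MSE, RMSE, and R2 for paired observed/predicted values."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_obs = sum(observed) / n
    # Total variation of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Toy temperatures in degrees Celsius (not study data).
observed = [20.0, 22.0, 24.0, 26.0]
predicted = [21.0, 22.0, 23.0, 27.0]
scores = regression_metrics(observed, predicted)
```

MAE, MSE, and RMSE are in the units of the predicted variable (or its square), so they are not comparable across temperature, RH, and CO2, whereas R² is unitless.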

Results
Four models (Autoformer, DLinear, LSTM, and SegRNN) were compared to evaluate prediction performance for greenhouse conditions using three days of time-series data as input. Overall, a simple linear model, DLinear, consistently outperformed the others in most of the metrics for the prediction after 1 h and 3 h. The RNN-based model, SegRNN, performed similarly to, but slightly below, DLinear.
Table 3 shows the predictions of the temperature, RH, and CO2 concentration inside the greenhouse after 1 h. DLinear achieved high R² values of 0.938, 0.857, and 0.783 for temperature, RH, and CO2, respectively. SegRNN showed a similar, but slightly lower, performance for temperature and RH. In terms of CO2 concentration, however, SegRNN had the highest R² value of 0.875, which was 11.7% better than that of DLinear and 203% better than that of LSTM.
In contrast, both the transformer-based model, Autoformer, and another RNN-based model, LSTM, showed poor performance. The R² values for Autoformer were 0.744, 0.636, and 0.590 for temperature, RH, and CO2, respectively. For LSTM, these values were 0.645, 0.404, and 0.289, respectively. In each case, these values were considerably lower than those achieved by DLinear and SegRNN.
Table 4 shows the predictions of the temperature, RH, and CO2 concentration inside the greenhouse after 3 h. Again, the R² values of DLinear were relatively high with values of 0.833, 0.680, and 0.580 for temperature, RH, and CO2, respectively. SegRNN also showed a similar performance, but slightly lower compared to DLinear. In terms of CO2 concentration, SegRNN had the highest R² value of 0.711, which was 22% better than that of DLinear and 299% better than that of LSTM.
In contrast, both the transformer-based model, Autoformer, and another RNN-based model, LSTM, showed worse performance. The R² values for Autoformer were 0.554, 0.411, and 0.488 for temperature, RH, and CO2, respectively. For LSTM, these values were 0.447, 0.253, and 0.178, respectively. In each case, these values were considerably lower than those achieved by DLinear and SegRNN.
This strong performance of DLinear and SegRNN supports their effectiveness in processing complex greenhouse time-series data and illustrates their suitability for predicting greenhouse conditions. However, there was a large decrease in the accuracy of both models for the 3 h predictions compared to 1 h predictions. The largest decrease in performance, as indicated by the R² values, was observed for the DLinear CO2 prediction, where the 3 h prediction decreased by 26% compared to the 1 h. Despite an 18% decrease, from 0.875 to 0.711 in R² values, SegRNN's CO2 prediction maintained the highest performance of all models for both 1 h and 3 h.
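The percentage comparisons quoted for the CO2 predictions can be reproduced from the reported R² values. The sketch below assumes each percentage is the relative change with respect to the reference value; the helper name `relative_change_pct` is hypothetical, and the R² values are those given in the text.

```python
def relative_change_pct(new, reference):
    """Relative change of `new` with respect to `reference`, in percent."""
    return (new - reference) / reference * 100.0


# R2 values for CO2 prediction as reported in the text.
r2_co2 = {
    "1h": {"DLinear": 0.783, "SegRNN": 0.875, "LSTM": 0.289},
    "3h": {"DLinear": 0.580, "SegRNN": 0.711, "LSTM": 0.178},
}

# SegRNN compared with the other models at each horizon.
seg_vs_dlinear_1h = relative_change_pct(r2_co2["1h"]["SegRNN"], r2_co2["1h"]["DLinear"])
seg_vs_lstm_1h = relative_change_pct(r2_co2["1h"]["SegRNN"], r2_co2["1h"]["LSTM"])
seg_vs_dlinear_3h = relative_change_pct(r2_co2["3h"]["SegRNN"], r2_co2["3h"]["DLinear"])
seg_vs_lstm_3h = relative_change_pct(r2_co2["3h"]["SegRNN"], r2_co2["3h"]["LSTM"])

# Change from the 1 h to the 3 h horizon for each model (negative = decrease).
dlinear_drop = relative_change_pct(r2_co2["3h"]["DLinear"], r2_co2["1h"]["DLinear"])
segrnn_drop = relative_change_pct(r2_co2["3h"]["SegRNN"], r2_co2["1h"]["SegRNN"])
```

Under this definition the values come out at roughly 11.7%, 203%, 22.6%, and 299% for the four comparisons and roughly -25.9% and -18.7% for the two decreases, so the 22% and 18% quoted in the text appear to be truncated rather than rounded.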
Figures 7 and 8 show a detailed comparison of actual and predicted values by Autoformer, DLinear, LSTM, and SegRNN. A visual analysis of these plots suggests that DLinear and SegRNN generally displayed better prediction performance compared to the other models. For temperature, both DLinear and SegRNN showed reasonable prediction. For RH, LSTM showed a prediction curve that was considerably different from the other three models and had the worst performance of all. In terms of CO2 concentration, SegRNN appeared to display the best performance, closely matching the actual values.

Discussion
This study evaluates the performance of four models (Autoformer, DLinear, LSTM, and SegRNN) in predicting time-series data for greenhouse environments. Although transformer-based models are known to perform very well in various domains, our results indicate that a simpler model, DLinear, and an RNN-based model, SegRNN, perform better in time-series prediction for greenhouses, compared to LSTM and Autoformer. This finding is in line with other recent studies that have suggested that transformer-based models may not be as effective in capturing the sequential characteristics of time-series data [21,44]. Indeed, DLinear has demonstrated strong predictive performance, especially when the time-series data have a clear trend and periodicity [20]. Moreover, DLinear's capability to capture short- and long-range temporal relationships in time-series data, combined with its lower computational costs due to reduced memory and parameter requirements compared to transformer-based models, could potentially make it a viable baseline model for greenhouse environment prediction. In addition, SegRNN, which is designed to overcome the limitations of conventional RNNs for time-series prediction, demonstrated superior performance to LSTM, which is known for its robust performance in numerous studies. Since only very few studies have evaluated the effectiveness of SegRNNs in predicting greenhouse environmental time series, the result of this study is expected to have important implications in this field. Nevertheless, further studies are needed to validate these findings.
The results of this study do not provide a definitive conclusion that transformer-based models and RNN-based models, LSTM in particular, are inferior to linear-based models in predicting time series for greenhouse environments. In general, transformer-based models require extensive data for training [20]. However, the dataset used in this study was collected from only a single growing season. With a more comprehensive set of data collected over multiple growing seasons, the transformer-based model may perform better. The observed poor predictive performance of the transformer-based model in this study is likely due to incorrect trend prediction and over-fitting to sudden changes in the training data, which may have led to the performance degradation [47]. Therefore, there is potential to improve the performance of transformer-based models with larger and more diverse greenhouse time-series datasets. In support of this, a recent study showed that an improved variant of the transformer model outperformed DLinear in time-series prediction on larger datasets [30]. Research is still underway to improve the prediction of time series by developing a specialized structure of transformer variants that are focused on application scenarios and data types.
The results of this study are similar to or slightly worse than those of previous studies regarding the prediction of changes in a greenhouse environment. This may be due to the shorter data collection period in this study compared to those of other studies [9]. Nevertheless, our results with DLinear and SegRNN models were satisfactory in predicting temperature and RH changes after 1 and 3 h. SegRNN was particularly effective in predicting CO2 concentrations. CO2 concentrations in greenhouses are generally difficult to predict as they fluctuate rapidly and are also influenced by the complex interactions of various environmental factors within the greenhouse [7]. Factors such as photosynthetic activity, ventilation, and external weather conditions can all affect CO2 concentration, which makes it more challenging to predict than other environmental variables. Indeed, during the experimental greenhouse cultivation in this study, CO2 control had to be manually adjusted for a period of time due to external market factors. Temporary CO2 supply shortages and sudden price increases occasionally led to restrictions on its use in the real world. Such issues pose a significant challenge to predictive models and suggest the need for a more sophisticated approach. Despite these conditions, SegRNN performed relatively well at predicting CO2 concentrations. Future research could focus on making these prediction models more robust to the factors mentioned above. This could be achieved by including additional external variables that have a direct or indirect impact on CO2 consumption. The investigation of advanced machine learning approaches for dealing with high-frequency data fluctuations in time series could also possibly improve the accuracy of predicting CO2 concentrations in greenhouse environments [48].
The implementation of time-series prediction models in greenhouse management can potentially provide significant environmental and economic benefits [49]. First, integrating these models with real-time control data allows for dynamic greenhouse management that enables a rapid response to changing environmental conditions, thereby improving crop yields. More accurate predictions of environmental conditions also allow for the more efficient use of resources such as water and energy, thereby reducing waste and minimizing environmental impact. Economically, such an approach can help reduce greenhouse operating costs and increase profitability.
Finally, the applicability of our results to other types of greenhouses remains uncertain because this research focused on a Venlo-type greenhouse. Therefore, future studies need to include environmental datasets acquired from different types of greenhouses. Expanding the dataset could potentially help develop more accurate models.

Figure 1. External view of the Venlo-type experimental greenhouse located at KIST, Gangneung, Republic of Korea.

Table 1. Data description for model training. Thirteen features were obtained, comprising four external climate data samples, three internal climate data samples, and six actuator data samples. Each feature was collected every hour.

| Category | Input variable (unit) | Description |
| --- | --- | --- |
| Environmental values | Outside temperature (°C) | Temperature acquired from an external weather station |
| Environmental values | Outside relative humidity (%) | Relative humidity acquired from an external weather station |
| Environmental values | Outside wind direction (°) | Wind direction acquired from an external weather station |
| Environmental values | Outside wind velocity (m·s⁻¹) | Wind speed acquired from an external weather station |
| Environmental values | Temperature (°C) | Air temperature acquired from an internal sensor module |
| Environmental values | Relative humidity (%) | Relative humidity acquired from an internal sensor module |
| Environmental values | CO2 concentration (ppm) | Carbon dioxide concentration acquired from an internal sensor module |
| Actuator values | Fan (on/off) | Circulating fan status |
| Actuator values | Fogging (on/off) | Fogging valve status |
| Actuator values | CO2 injection (on/off) | CO2 injection valve status |
| Actuator values | Window openness (%) | Lee-side window opening ratio |
| Actuator values | Shade curtain (%) | Shading curtain opening ratio |
| Actuator values | Heat retention curtain (%) | Heat retention curtain opening ratio |

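Figure 2. Autoformer architecture: (a) separation of seasonal and trend components before feeding the data to the decoder, allowing in-depth learning of complex temporal patterns (red block); (b) auto-correlation mechanism to overcome the inefficiencies of the self-attention (green blocks) (adapted from [23]).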

Figure 6. Schematic representation of this study. To predict temperature, relative humidity, and CO2 concentrations inside the greenhouse after 1 and 3 h, seven environmental parameters and six actuator data samples were used. The Autoformer, DLinear, LSTM, and SegRNN models were trained and tested.

Table 2. Range of environmental values and actuator values.

Table 3. A comparison of the 1 h prediction performance of temperature, relative humidity, and CO2 concentration using the Autoformer, DLinear, LSTM, and SegRNN. The best results are shown in bold.

Table 4. A comparison of the 3 h prediction performance of temperature, relative humidity, and CO2 concentration using the Autoformer, DLinear, LSTM, and SegRNN. The best results are shown in bold.