Forecasting Agricultural Commodity Prices Using Dual Input Attention LSTM

Abstract: Fluctuations in agricultural commodity prices affect the supply and demand of agricultural commodities and have a significant impact on consumers. Accurate prediction of agricultural commodity prices would facilitate the reduction of risk caused by price fluctuations. This paper proposes a model called the dual input attention long short-term memory (DIA-LSTM) for the efficient prediction of agricultural commodity prices. DIA-LSTM is trained using various variables that affect the price of agricultural commodities, such as meteorological data and trading volume data, and can identify the feature correlation and temporal relationships of multivariate time series input data. Further, whereas conventional models predominantly focus on the static main production area (which is selected for each agricultural commodity beforehand based on statistical data), DIA-LSTM utilizes the dynamic main production area (which is selected based on the production of agricultural commodities in each region). To evaluate DIA-LSTM, it was applied to the monthly price prediction of cabbage and radish in the South Korean market. Using meteorological information for the dynamic main production area, it achieved 2.8% to 5.5% lower mean absolute percentage error (MAPE) than that of the conventional model that uses meteorological information for the static main production area. Furthermore, it achieved 1.41% to 4.26% lower MAPE than that of benchmark models. Thus, it provides a new idea for agricultural commodity price forecasting and has the potential to stabilize the supply and demand of agricultural products.


Introduction
Agricultural commodities play a significant role in the daily lives of people. Fluctuations in agricultural commodity prices can burden consumers and cause instability in farm household income. The abnormal climate in recent years has further aggravated fluctuations in agricultural commodity prices, making it difficult for governments to develop policies and make decisions to stabilize supply and demand [1]. The Ministry of Agriculture, Food and Rural Affairs (MAFRA), in South Korea, has been endeavoring to manage supply and demand to ensure the stability of price and farm household income by designating cabbage, radish, onion, garlic, and hot peppers grown in the field as "five vegetables sensitive to supply and demand". Stabilizing the supply and demand of agricultural commodities is difficult. However, by providing more accurate price forecasts for agricultural commodities, it is possible to reduce the risk caused by price fluctuations and ultimately achieve this goal [2].
Meteorological factors have a direct impact on agricultural production and, hence, meteorological information is essential for the prediction of agricultural commodity prices [3]. Agricultural commodities grown in the open field are more affected by meteorological agricultural prices. Wei et al. [7] performed price forecasting for various agricultural commodities using a backpropagation neural network (BPNN). They proved that the improved BPNN model is an efficient method to predict agricultural commodity prices by comparing the proposed model to the statistical model. Weng et al. [25] conducted a study to predict the price of horticultural products. In a monthly, weekly, and daily average price forecasting, the neural network recorded higher performance than the ARIMA model. Li et al. [26] conducted a study to predict the weekly retail price of eggs in the Chinese market by proposing a chaotic neural network. The performance of the chaotic neural network and ARIMA was compared, and the chaotic neural network recorded better nonlinear fitting ability and higher prediction precision in the weekly retail price of eggs. Hemageetha and Nasira [8] predicted tomato prices using a BPNN and a radial basis function neural network (RBF), which proved the superiority of the RBF model through experiments. Another type of ANN called the extreme learning machine (ELM) [27] has been applied to predict the price of agricultural commodities using various techniques. Wang et al. [28] predicted the price of corn using a hybrid model that combined the singular spectrum analysis (SSA) method and ELM. This shows that the proposed method can improve the accuracy of forecasting by better understanding the overall trend of price changes.
ANNs have limitations in modeling sequential data because they handle input data points independently without considering the correlation among input data. Because the agricultural commodity prices to be predicted in this study are time series data, it is important to model the time series characteristics of the price data. RNN specializing in learning sequential data can be trained with time series information of data through its internal cyclic structure. Long short-term memory (LSTM), a type of RNN, is considered one of the most popular methods for dealing with time series prediction problems. Shin et al. [9] used LSTM to predict the price of green onion, onion, zucchini, rice, and spinach. In their study, a predictive model was trained using various variables that affect the price of agricultural commodities, such as weather data, the rate of price increase of agricultural commodities, the previous year's yield of agricultural commodities, and the area cultivated in the previous year. They reported that their method exhibited better performance than previous time series prediction models. Jin et al. [12] predicted the price of cabbage and radish by using an STL-LSTM model that combines the STL technique and the LSTM model. The STL technique was used to solve the high seasonality of agricultural commodity price data and the "lag" phenomenon that appears in the prediction results. They reported MAPEs of approximately 7.95% and 11.25% for the price predictions of cabbage and radish, respectively.
The attention mechanism introduced in neural machine translation has the advantage of overcoming the long-term dependency and information loss of RNNs, which enables better characterization of the input data by assigning different importance to each element of the input sequence and paying attention to the more relevant input [29]. Currently, the attention mechanism is widely used in fields, such as natural language processing and computer vision, with recent attempts being made to apply it to time series prediction in various ways. In previous studies [30][31][32], different attention-based LSTM models were proposed and applied to travel time, financial time series, and temperature prediction. Yin et al. [11] applied the STL method and the attention-based LSTM model to predict the price of five agricultural commodities: cabbage, radish, onion, pepper, and garlic. However, previous studies have a common limitation of ignoring the dependency between time series data and the time series of data. To compensate for this problem, Qin et al. [33] proposed a dual-stage attention-based recurrent neural network (DA-RNN). Their proposed model could adaptively select the most relevant input features through the input and temporal attention mechanism, as well as learn the long-term temporal dependencies of the time series well. Liu et al. [34] proposed a dual-stage two-phase-based recurrent neural network (DSTP-RNN) to identify spatial correlations and temporal relationships.
Both the DA-RNN and DSTP-RNN models, which use the dual attention mechanism, have an encoder-decoder structure; instead of applying feature and temporal attention to the input at the same time, they apply attention to the input and to the context vector. No published study has reported application of the dual attention mechanism to agricultural commodity price prediction. This study incorporates a feature attention layer and a temporal attention layer, which were designed to identify feature correlations and temporal relationships of the input data. Three previous studies used price and meteorological data as input variables [9,11,12] because meteorological conditions directly affect the growth of agricultural commodities. However, those studies only considered the meteorological information of the main production area to a limited extent. In this study, a dynamic main production area selection method was developed to determine the effects of meteorological conditions more accurately.
The contributions of this study are as follows.
(1) Price data, trading volume data, and meteorological data, which were rarely used in previous studies because they are difficult to handle, are used as input variables.
(2) A dual input attention LSTM (DIA-LSTM) that applies feature and temporal attention concurrently, an upgraded version of the existing sequentially applied dual attention mechanism, is proposed. The proposed model is shown to achieve 1.41% to 4.26% lower MAPE than benchmark models.
(3) Considering the real situation, the meteorological information for the dynamic main production area is used. The model using the meteorological information for the dynamic main production area is shown to achieve approximately 2.8% to 5.5% lower MAPE than the conventional model using the meteorological information for the static main production area.
The remainder of this paper is organized as follows. Section 2 explains the data used in this study and then introduces the proposed dual input attention LSTM (DIA-LSTM). Section 3 describes the performance metrics used in the experiment and discusses the experimental design and results. Finally, Section 4 presents the conclusions of this study and future research directions.

Wholesale Price of Agricultural Products
The price data for cabbage and radish were downloaded from the Outlook & Agricultural Statistics Information System (OASIS) [35], provided by the Korea Rural Economic Institute (KREI), and from the Korea Agricultural Marketing Information Service (KAMIS), provided by the Korea Agro-Fisheries & Food Trade Corporation (aT) [36]. OASIS and KAMIS provide daily prices for cabbage and radish. To predict the monthly average price, this study grouped the daily prices by month and used the average of each group. The changes in the average monthly price of cabbage and radish are shown in Figure 1, indicating unstable fluctuations in the price. The average price of the previous month, exponential moving average (EMA), relative strength index (RSI), Williams %R, and median price, which are commonly used as investment indicators, were used as derived variables.
Among them, the average price means the average value of the remaining three prices after removing the highest and lowest prices from the month's price for the last five years. The open and close values used to calculate Williams %R were the prices on the first and last days of each month.
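The trimmed average described above, and an exponential moving average of the kind listed among the derived variables, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the sample prices, and the EMA smoothing factor 2 / (span + 1) are assumptions, since the paper does not state the EMA span that was used.

```python
def trimmed_five_year_average(prices):
    """Average of the middle three of five same-month prices."""
    if len(prices) != 5:
        raise ValueError("expected exactly five yearly prices")
    middle = sorted(prices)[1:-1]  # drop the lowest and the highest value
    return sum(middle) / len(middle)


def exponential_moving_average(series, span):
    """EMA with assumed smoothing factor alpha = 2 / (span + 1), seeded with the first value."""
    alpha = 2.0 / (span + 1)
    ema = [series[0]]
    for value in series[1:]:
        ema.append(alpha * value + (1 - alpha) * ema[-1])
    return ema


# Hypothetical prices for the same calendar month in five consecutive years.
same_month_prices = [5200, 7400, 6100, 9800, 6600]
print(trimmed_five_year_average(same_month_prices))  # 6700.0
```

Removing the two extreme years keeps a single abnormal season (for example, a typhoon year) from dominating the reference price.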

Trading Volume of Agricultural Products
The price of agricultural commodities is affected by the yield. Although it is desirable to use the yield of agricultural commodities as an input variable for the prediction model, it is difficult to apply the production data to the monthly price prediction because the statistics on production are published annually. Therefore, in this study, the trading volume of agricultural commodities was used instead of the yield. The agricultural products distribution system (NongNet) of aT provides daily trading volumes from each wholesale market.
In this study, the trading volume data provided by NongNet were divided into the national wholesale market, Garak market, and top five local market trading volumes. The national wholesale market trading volume is the sum of all wholesale markets. The Garak and top five local market trading volumes refer to the quantity brought in from a specific wholesale market. Among the numerous wholesale markets nationwide, the Garak and top five local markets play an important role in the daily lives of ordinary people and the distribution of agricultural and fishery products. Therefore, in this study, the Garak and the top five local market trading volumes were separately extracted and used as input variables for the model. Garak market is a wholesale market for agricultural and marine commodities located in Garak-dong (Songpa-gu, Seoul, South Korea), and the Garak market trading volume refers to the quantity of agricultural and marine commodities brought into Garak market. The top five local markets refer to the wholesale markets for agricultural and marine commodities in Eomgung-dong, Busan; Gakhwa-dong, Gwangju; Guwol-dong, Incheon; Buk-gu, Daegu; and Ojeong-dong, Daejeon. The trading volume from the top five local markets is equal to the sum of the trading volumes from the aforementioned five wholesale markets. The daily data on the trading volumes of cabbage and radish were grouped monthly, and the sum was calculated and used.
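The monthly aggregation used for prices (monthly mean) and trading volumes (monthly sum) can be sketched as follows. This is only an illustration under assumed data layouts: the `(date, value)` record format, the function names, and the sample values are hypothetical and do not reflect the actual OASIS, KAMIS, or NongNet export formats.

```python
from collections import defaultdict


def monthly_mean(daily):
    """Average daily values per month; applied here to daily prices."""
    groups = defaultdict(list)
    for date, value in daily:
        groups[date[:7]].append(value)  # "YYYY-MM-DD" -> "YYYY-MM"
    return {month: sum(v) / len(v) for month, v in sorted(groups.items())}


def monthly_sum(daily):
    """Total daily values per month; applied here to daily trading volumes."""
    totals = defaultdict(float)
    for date, value in daily:
        totals[date[:7]] += value
    return dict(sorted(totals.items()))


# Hypothetical daily records (ISO date string, value).
daily_price = [("2021-01-04", 1000.0), ("2021-01-05", 1200.0), ("2021-02-01", 900.0)]
daily_volume = [("2021-01-04", 120.0), ("2021-01-05", 80.0), ("2021-02-01", 50.0)]

print(monthly_mean(daily_price))   # {'2021-01': 1100.0, '2021-02': 900.0}
print(monthly_sum(daily_volume))   # {'2021-01': 200.0, '2021-02': 50.0}
```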

Meteorological Data
Because cabbages and radishes are mainly grown in open fields, their yields are greatly affected by meteorological conditions [2]. Changes in the yield also affect changes in the price. Therefore, in this study, meteorological data provided by the Korean Meteorological Administration (KMA) were used as input variables for the model. The meteorological indicators used include average temperature and humidity, accumulated precipitation, and typhoon advisories and warnings in the main production area. Typhoon advisories and warnings have binary values indicating whether or not they were issued. Each day on which a typhoon advisory or warning was issued was marked as 1, and the number of such days per month was used.
Meteorological data are generally provided by region; however, it is difficult to use all of these data in practice, and not all meteorological conditions in all regions affect the cultivation of a specific agricultural commodity. In this study, the meteorological conditions of the main production areas, where cabbages and radishes were grown, were examined. To use the meteorological data of the main production areas, it is necessary to know where the main production areas for each agricultural commodity are located. Although the Korean Statistical Information Service (KOSIS) [37] provides information on the main production areas of agricultural commodities every year, it is difficult to account for main production areas that change with the seasons because the data are provided annually. In the previous study [11], the main production area of agricultural commodities by harvest time provided by aT was used. In this method, different main production areas were used for each cropping season, but the same main production area was applied to the same cropping season in different years. A main production area selected in this way can be defined as a static main production area. However, the main production area may change slightly over time owing to climate change or urban development. To solve this problem, this study proposes a method for selecting a dynamic main production area. An area with a high monthly yield of the agricultural commodity is selected as the dynamic main production area. In this study, the three regions with the highest yields were selected as the main production areas based on the yield a year back. Table 1 shows the main production areas of radish in July-September 2015 when the static and dynamic main production area selection methods were used.
Agriculture 2022, 12, 256
Data were collected from September 2013 to May 2021 for the price, trading volume, and weather. Among them, data from September 2014 to January 2021 based on price were used as training data, and the remaining data from February 2021 to May 2021 were used as test data. Fixed data sizes were used for model training. Specifically, data from September 2014 to December 2020 were used as input data, and data up to January 2021 were used as target data. For testing, the model forecasted one month ahead using actual observed past data. For example, to predict the price in February 2021, real data from previous months, such as January 2021 and December 2020, were used as inputs to the model. To predict the price in March 2021, the observed real data up until February 2021 were used as inputs to the model. The number of months of past data used to predict future prices is determined by the time-step hyperparameter.
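The one-month-ahead setup described above amounts to a sliding window over the monthly series. The following is a minimal sketch, assuming each month is represented by a feature vector plus the observed price; the function name `make_windows` and the toy values are hypothetical.

```python
def make_windows(features, prices, time_step):
    """Pair each block of `time_step` past months with the next month's price.

    features: list of per-month feature vectors (oldest first)
    prices:   list of per-month prices, aligned with `features`
    """
    samples = []
    for end in range(time_step, len(prices)):
        # Each window row holds the month's input variables and its observed price.
        window = [features[m] + [prices[m]] for m in range(end - time_step, end)]
        samples.append((window, prices[end]))
    return samples


# Toy series of five months with one input variable per month.
features = [[0.1], [0.2], [0.3], [0.4], [0.5]]
prices = [10, 11, 12, 13, 14]
samples = make_windows(features, prices, time_step=2)
print(len(samples))  # 3
print(samples[0])    # ([[0.1, 10], [0.2, 11]], 12)
```

Because every target is paired only with months that precede it, the same routine can serve the test months: the window for February 2021 ends at January 2021, the window for March 2021 ends at February 2021, and so on.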

Proposed Dual Input Attention LSTM (DIA-LSTM)
The dual input attention LSTM (DIA-LSTM) model proposed in this study predicts the price of the next month using various variables that affect agricultural commodity prices. The $n$ variables that affect the price of agricultural commodities can be expressed as $I = (x^1, x^2, \cdots, x^n)$, where $x^k$ denotes the time series of the $k$-th variable that affects the price. That is, $X = (x_1, x_2, \cdots, x_T) \in \mathbb{R}^{n \times T}$, where $T$ is the length of the time step (or window size); the price of the next month is predicted using data from the past $T$ months. The $k$-th input variable whose time step is $T$ is expressed as $x^k = (x^k_1, x^k_2, \cdots, x^k_T)^{\top} \in \mathbb{R}^{T}$, and the values of the $n$ input variables at time $t$ are expressed as $x_t = (x^1_t, x^2_t, \cdots, x^n_t)^{\top} \in \mathbb{R}^{n}$. The DIA-LSTM model uses the past prices of agricultural commodities, $y = (y_1, y_2, \cdots, y_T)$ with $y_t \in \mathbb{R}$, and the past values of the $n$ input variables, $(x_1, x_2, \cdots, x_T)$ with $x_t \in \mathbb{R}^{n}$, to predict the price at the next time step, $y_{T+1} \in \mathbb{R}$. This is expressed in Equation (1), where $\mathcal{M}$ denotes the proposed DIA-LSTM.

$$\hat{y}_{T+1} = \mathcal{M}(x_1, x_2, \cdots, x_T, y_1, y_2, \cdots, y_T) \quad (1)$$

DIA-LSTM consists of a feature attention layer, a temporal attention layer, and a recurrent prediction layer. The structure is shown in Figure 2. The feature attention layer learns feature correlation in the input data $X = (x_1, x_2, \cdots, x_T)$, and the temporal attention layer models the temporal relationship based on the transposed input data $X^{\top} = (f_1, f_2, \cdots, f_n)$. The output of the feature and temporal attention layers is generated by an element-wise multiplication (denoted by $*$) of the attention weights with the input data. The recurrent prediction layer predicts the final result value using the combined value of the output from the feature and temporal attention layers.
In previous studies, the dual attention mechanism was applied to the deep learning model of the encoder-decoder structure. Specifically, attention mechanisms were applied to the input and temporal axes in the encoder and decoder, respectively, and attention weights were calculated using LSTM and softmax. In this study, instead of using the encoder-decoder structure, the results of applying the attention mechanism to the input and temporal axes of the input data are concatenated and fed into the LSTM model. In addition, a simple attention weight calculation method using a single linear layer and softmax was used. The difference between the structure of the existing dual attention mechanism and the proposed structure is shown in Figure 3.
Here, $(x_1, x_2, \cdots, x_n)$ denotes the inputs of the attention mechanism, and $(a_1, a_2, \cdots, a_n)$ denotes the attention weights obtained by the attention mechanism. The output is generated by an element-wise multiplication (denoted by $\times$) of the attention weights with the inputs.
The feature attention layer (Appendix A.3) and temporal attention layer (Appendix A.4) were implemented by simplifying the single-layer perceptron. This was inspired by the self-attention mechanism (Appendix A.2) that can construct attention using only input values. Attention weights were applied to each input variable in the feature attention layer, as well as to each time step in the temporal attention layer.
The recurrent prediction layer consists of a single-layer stateful LSTM (Appendix A.1) and two fully connected layers (denoted by FC in Figure 2). The stateful LSTM model means that the hidden state $h_t$ learned in the current time step is transferred to the initial state during the next learning. The LSTM model receives the concatenation of the output $X_f$ of the feature attention layer and the output $X_t$ of the temporal attention layer. Subsequently, dropout is applied to the output of the LSTM, and, after flattening (denoted by Flatten in Figure 2), it is input to the fully connected layer (FC). Table 2 shows the hyperparameter settings used in each layer. To predict the final single real value, the number of neurons in the last fully connected layer is set to 1.
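The data flow through the two attention layers can be sketched in a few lines. This is a hedged illustration and not the layers defined in Appendices A.3 and A.4: it assumes that the weights are computed as softmax over a single linear layer, that the input is stored with one row per time step, and that the two outputs are concatenated along the feature axis. All function and parameter names (`attend`, `dual_input_attention`, `Wf`, `bf`, `Wt`, `bt`) are hypothetical.

```python
import math


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(vector, weights, bias):
    """Single linear layer: one output score per row of `weights`."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def attend(rows, weights, bias):
    """Scale each entry of a row by its attention weight softmax(linear(row))."""
    out = []
    for row in rows:
        a = softmax(linear(row, weights, bias))
        out.append([ai * xi for ai, xi in zip(a, row)])  # element-wise product
    return out


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def dual_input_attention(X, Wf, bf, Wt, bt):
    """X is stored as T rows of n variables; returns T rows of 2n values."""
    X_f = attend(X, Wf, bf)                        # weights over the n variables at each step
    X_t = transpose(attend(transpose(X), Wt, bt))  # weights over the T steps of each variable
    return [f + t for f, t in zip(X_f, X_t)]       # assumed: concatenate along the feature axis


# Toy input with T = 3 time steps and n = 2 variables; zero parameters give uniform weights.
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
Wf, bf = [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]
Wt, bt = [[0.0] * 3 for _ in range(3)], [0.0] * 3
combined = dual_input_attention(X, Wf, bf, Wt, bt)
print(len(combined), len(combined[0]))  # 3 4
```

In this sketch both attention branches read the same input independently, which mirrors the concurrent (rather than encoder-then-decoder) application of feature and temporal attention described above; the combined matrix is what a recurrent prediction layer would then consume.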

Training Procedure
An Adam optimizer with a learning rate of 0.001 was used to train the model. To train the stateful LSTM, the minibatch size was set to 1, which was the highest common factor of the numbers of training and test samples. Because DIA-LSTM is end-to-end differentiable, the parameters of the model can be learned through the backpropagation algorithm with the mean squared error as the objective function, as shown in Equation (2), where $\mathcal{O}$ denotes the objective function.

$$\mathcal{O}(\hat{y}_{T+1}, y_{T+1}) = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}^{i}_{T+1} - y^{i}_{T+1}\right)^{2} \quad (2)$$

In Equation (2), $N$ denotes the number of training samples, $\hat{y}^{i}_{T+1}$ is the value predicted by DIA-LSTM, and $y^{i}_{T+1}$ denotes the actual observed price value.
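The batch-size choice and the mean squared error objective can be illustrated with a short calculation. The sample counts below are an assumption derived from the stated date ranges (77 monthly targets from September 2014 to January 2021 and 4 from February 2021 to May 2021); the paper itself does not list them, and the function name `mse` is hypothetical.

```python
import math


def mse(predicted, observed):
    """Mean squared error over paired predictions and observations."""
    n = len(observed)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n


n_train = 77  # assumed count of monthly training targets
n_test = 4    # assumed count of monthly test targets
batch_size = math.gcd(n_train, n_test)
print(batch_size)  # 1
```

A batch size that divides both counts means that neither the training nor the test pass ends with a partial batch, which is convenient when the LSTM state is carried from one batch to the next.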

Results
This section describes the performance evaluation metrics used in the experiment and the experimental method to measure the performance of the proposed model. In this study, three experiments were conducted. The first experiment was to find the most suitable time step for the proposed DIA-LSTM model. The second experiment compares the performance of the model using meteorological data for the dynamic main production area and the model using meteorological data for the static main production area. The third experiment compares the performance of the DIA-LSTM model proposed in this study with the models proposed in other studies.

Evaluation Metrics
In this study, two evaluation metrics were used to evaluate the performance of the model: root mean square error (RMSE) and mean absolute percentage error (MAPE). RMSE measures the difference between the real value and the predicted value. The definition is as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^{2}} \quad (3)$$

MAPE is a widely used metric in time series prediction and expresses the error between the real value and the predicted value as a percentage. The definition of MAPE is as follows:

$$\mathrm{MAPE} = \frac{100\%}{N}\sum_{i=1}^{N}\left|\frac{\hat{y}_i - y_i}{y_i}\right| \quad (4)$$

In Equations (3) and (4), $N$ is the number of data samples, and $\hat{y}_i$ and $y_i$ are the predicted and real values of the $i$-th sample, respectively. RMSE is obtained by subtracting the real value from the predicted value of each data sample, averaging the squared differences, and taking the square root. MAPE is obtained by dividing the difference between the predicted and real values of each sample by the real value, taking the absolute value, and averaging the result over all samples as a percentage. MAPE is more intuitive because it expresses the error as a percentage, regardless of the scale of the values being predicted.
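The two metrics can be computed directly from paired predictions and observations. The following sketch uses the standard definitions of RMSE and MAPE; the function names and the sample values are illustrative only.

```python
import math


def rmse(predicted, observed):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def mape(predicted, observed):
    """Mean absolute percentage error, in percent; observed values must be nonzero."""
    n = len(observed)
    return 100.0 / n * sum(abs((p - o) / o) for p, o in zip(predicted, observed))


# Hypothetical monthly prices: each forecast is off by 10% of the observed value.
observed = [100.0, 200.0]
predicted = [110.0, 180.0]
print(rmse(predicted, observed))
print(mape(predicted, observed))
```

In this example the absolute errors differ (10 and 20), so RMSE is dominated by the larger price, whereas MAPE treats both months equally because each error is 10% of its observed value.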

Optimal Time-Step Search
In time series prediction, the time step is a hyperparameter that determines how many past data samples are used to predict future data; the optimal time step may differ depending on the task to be solved. In previous studies [11,32], several candidate values were set and a grid search was conducted to find the optimal time step. Different optimal time-step values were used for these studies. In this study, an experiment was conducted to find the most suitable time step for the data of two agricultural commodities (cabbage and radish).
In this experiment, the model was trained and performance was measured while changing the time step of the proposed DIA-LSTM. To find the optimal time step, a grid search was performed by setting the time step T ∈ {1, 2, 4, 6, 8, 12}. Table 3 shows the performance measurement results of the proposed model when the grid search was performed for time step T.
Consequently, both agricultural commodities recorded the lowest MAPE and RMSE when $T = 6$. The graph in Figure 4 plots the time step on the x-axis and the MAPE value on the y-axis. In the graph, both cabbage and radish had the lowest MAPE with $T = 6$, and the error rate gradually increased when the time step became smaller or larger than $T = 6$. This indicates that too small or too large a time step in time series prediction can negatively affect model performance. If the time step is too small, it is difficult to learn sufficient information from past data. As the prediction is performed using only one data sample when $T = 1$, the characteristics of the time series cannot be ascertained. Conversely, an increase in the MAPE with an increase in the time-step value may have been due to a decrease in the number of training data as the time-step value increased. This means that there were insufficient data to proceed with learning. In particular, because the size of the dataset used in the experiment was small, a large time step resulted in insufficient training of the model.
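The time-step search itself is a plain grid search: each candidate is evaluated and the one with the lowest error is kept. The sketch below shows only this selection logic. The evaluator is a stand-in (a window-mean forecaster on a synthetic series), not DIA-LSTM, and all names and data are hypothetical.

```python
def grid_search_time_step(candidates, evaluate):
    """Return the candidate with the lowest error together with all scores."""
    scores = {T: evaluate(T) for T in candidates}
    best = min(scores, key=scores.get)
    return best, scores


def window_mean_mape(series, time_step, n_test):
    """Stand-in evaluator: forecast each test month as the mean of the previous `time_step` months."""
    errors = []
    for i in range(len(series) - n_test, len(series)):
        forecast = sum(series[i - time_step:i]) / time_step
        errors.append(abs((forecast - series[i]) / series[i]))
    return 100.0 * sum(errors) / len(errors)


# Synthetic monthly series with a 12-month cycle (illustrative only, not the paper's data).
series = [100 + 20 * ((m % 12) - 6) / 6 for m in range(48)]
best_T, scores = grid_search_time_step(
    [1, 2, 4, 6, 8, 12],
    lambda T: window_mean_mape(series, T, n_test=4),
)
print(best_T, scores[best_T])
```

In practice, `evaluate` would train the model with the given time step and return its MAPE on held-out data, so that the selected value reflects forecasting accuracy rather than training fit.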

Dynamic Main Production Area
To evaluate the proposed dynamic main production area selection method, an experiment was conducted to compare two models: one using the meteorological data of the dynamic main production area, which is selected dynamically based on the yield, and one using the meteorological data of the predefined (static) main production area. The predefined main production area was adopted from a previous study [11], and all other data and parameter settings, except the main production area, were kept consistent. Table 4 shows the performance of the proposed model when the meteorological data of the dynamic and static main production areas are used.
For both cabbages and radishes, the model performed better with the meteorological data of the dynamically selected main production area than with those of the predefined main production area. For cabbages, the MAPE with the dynamic main production area was 4.39%, approximately 5.51% lower than with the static main production area. For radishes, the MAPE with the dynamic main production area was 2.13%, approximately 2.8% lower than with the static main production area.
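The selection step can be sketched as follows. This is only an illustration under stated assumptions: regions are ranked by production volume and the top three are kept, as summarized in the Conclusions, while the yearly granularity, the function name `select_main_production_areas`, the region labels, and the production figures are all hypothetical.

```python
def select_main_production_areas(production_by_region, top_k=3):
    """Return the top_k regions ranked by production volume (largest first)."""
    ranked = sorted(
        production_by_region.items(),
        key=lambda item: item[1],
        reverse=True,
    )
    return [region for region, _ in ranked[:top_k]]


# Hypothetical production volumes per region and year (not real statistics).
production = {
    2019: {"region_a": 120, "region_b": 95, "region_c": 80, "region_d": 60},
    2020: {"region_a": 70, "region_b": 110, "region_c": 85, "region_d": 90},
}

# Dynamic selection: the set of main production areas can change every year.
dynamic_areas = {
    year: select_main_production_areas(volumes)
    for year, volumes in production.items()
}
print(dynamic_areas)
```

In contrast, a static main production area would use the same fixed list of regions for every year, regardless of how production shifts between regions.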

Comparison with Benchmark Models
To verify the performance of the proposed model, various time series prediction models proposed in previous studies were selected as benchmark models, and an experiment was conducted to compare their performance. The benchmark models used for performance comparison are as follows.
Simple LSTM: The LSTM model proposed by Hochreiter and Schmidhuber [38] is often used for time series prediction owing to its excellent long-term dependency learning ability. In this study, the recurrent prediction layer of the DIA-LSTM model, with the feature and temporal attention layers removed, was used as the simple LSTM model.
GCN-LSTM: The GCN-LSTM model is based on the T-GCN model structure proposed by Zhao et al. [39]. T-GCN combines the graph convolutional layer [40] and the GRU model and has been applied to traffic prediction. In the GCN-LSTM used in this study, the GRU model of T-GCN was replaced with LSTM, and dropout and dense layers were added.
STL-ATTLSTM: The STL-ATTLSTM proposed by Yin et al. [11] is a model that combines the STL preprocessing technique and the attention mechanism-based LSTM model. The study predicts the prices of five crops, including cabbage and radish, using a variety of input variables.
DA-RNN: The DA-RNN model proposed by Qin et al. [33] has an encoder-decoder structure, consisting of an encoder with an input attention mechanism and a decoder with a temporal attention mechanism. DA-RNN has shown strong performance in indoor temperature prediction and stock price prediction.
Each model was tested using the same training and test datasets. Table 5 compares the RMSE and MAPE of the proposed DIA-LSTM model with those of the benchmark models. As shown in Table 5, the proposed DIA-LSTM model recorded the lowest RMSE and MAPE for both cabbage and radish. Among the tested models, simple LSTM had the highest average MAPE, at 7.55%. This may have been because the relatively simple LSTM model has weaker learning ability than the other benchmark models. GCN-LSTM recorded the second-highest error rate, with an average MAPE of 6.71%. Compared with time series prediction methods that use an attention mechanism, the graph convolutional layer, which is optimized for learning spatial information, appears to be limited in learning agricultural commodity prices, which have strong time series characteristics. STL-ATTLSTM and DA-RNN, both of which use an attention mechanism, recorded average MAPEs of 4.67% and 4.90%, respectively, which are lower than those of simple LSTM and GCN-LSTM. Thus, both STL-ATTLSTM, which combines the STL preprocessing technique with an attention mechanism-based LSTM, and the dual-stage attention-based DA-RNN, which pairs an encoder with an input attention mechanism and a decoder with a temporal attention mechanism, performed well. A likely reason is that the attention mechanisms used in the two studies can learn both the time series characteristics of the input data and the characteristics of the input variables. The DIA-LSTM model proposed in this study recorded the lowest average error rate, at 3.26%.
The model that combines the temporal and feature attention layers with LSTM thus performed best among the compared models on the agricultural commodity price prediction problem.
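For reference, the two evaluation metrics used in the comparison can be written out as a short sketch. It uses the standard definitions of MAPE and RMSE; the function names and the toy price values are hypothetical and are not results from Table 5.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Assumes no actual value is 0."""
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the prices."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy monthly prices (hypothetical values, not data from the study).
actual_prices = [100.0, 200.0]
predicted_prices = [110.0, 180.0]

print(f"MAPE: {mape(actual_prices, predicted_prices):.2f}%")
print(f"RMSE: {rmse(actual_prices, predicted_prices):.2f}")
```

Because MAPE is scale-free, it allows the errors for cabbage and radish to be averaged even when their price levels differ, whereas RMSE remains tied to the price unit of each commodity.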

Conclusions
This study introduced feature and temporal attention layers, based on the attention mechanism, that capture feature correlations and temporal relationships in the input variables, respectively. Furthermore, a DIA-LSTM model combining the two attention layers with an LSTM was proposed to predict the monthly prices of cabbage and radish. The proposed model utilizes not only vegetable prices but also trading volumes from various markets, such as the national wholesale and top five local market trading volumes, and meteorological information for the main production areas. For the selection of the main production areas, the top three regions by production volume were dynamically selected, rather than using predefined static main production areas as in previous studies. Consequently, the model using the meteorological information of the dynamic main production area recorded approximately 5.51% and 2.8% lower error rates for cabbage and radish, respectively, than the model using the meteorological information of the predefined static main production area. The proposed DIA-LSTM model achieved an average MAPE of approximately 3.26%, approximately 1.41% to 4.26% lower than that of the benchmark models.

Fluctuations in agricultural commodity prices affect the supply and demand of agricultural commodities and have a significant impact on consumers and farmers. They lead to uncertainty in consumers' daily consumption budgets and to income instability for farmers. As a result of abnormal climate patterns, price fluctuations of agricultural products have intensified, making it difficult for the government to establish policies and stabilize supply and demand. The agricultural commodity price prediction model proposed in this study will help stabilize the supply and demand of agricultural products through more accurate predictions, thereby reducing the risk of price fluctuations.
The empirical results in this study are constrained by the lack of data and the fluctuation of prediction results. Monthly predictions of agricultural commodity prices were made using data from September 2013 to May 2021, which is insufficient to train a deep learning model. Sufficient price data can be collected from the 2000s onward, but the problem lies with the meteorological data, which are only available from 2012. Additional meteorological data should be obtained from the Meteorological Society or other agencies. The volume of data can also be increased by using weekly price forecasts instead of monthly forecasts. Although the prediction accuracy of the proposed model is relatively high, there are still large fluctuations between individual predictions. The stability of the model's predictions can be improved by increasing the number of training variables that have an impact on prices, such as the export and import volumes of agricultural commodities.