Deep Learning Predictor for Sustainable Precision Agriculture Based on Internet of Things System

Abstract: Based on the weather data collected by the agricultural Internet of Things (IoT) system, changes in the weather can be anticipated, which is an effective way to plan and control sustainable agricultural production. However, it is not easy to predict the future trend accurately because the data always contain complex nonlinear relationships among multiple components. To improve the prediction of weather data in the precision agriculture IoT system, this study used a deep learning predictor with a sequential two-level decomposition structure, in which the weather data were decomposed serially into four components, and gated recurrent unit (GRU) networks were then trained as the sub-predictors for each component. Finally, the results from the GRUs were combined to obtain the medium- and long-term prediction results. The proposed model was verified in experiments on weather data from the IoT system for wolfberry planting in Ningxia, China; the results showed that the proposed predictor can predict temperature and humidity accurately and meet the needs of precision agricultural production.


Introduction
The process of adopting innovation, especially with regard to precision farming (PF), is inherently complex and social, and is influenced by producers, change agents, social norms, and organizational pressure. Vecchio et al. [1] conducted an empirical analysis of preliminary results from Italian farmers and found that increasing awareness of PF tools is very meaningful, and that future research should focus on innovations and solutions that offer environmental sustainability. In this process, the application of the Internet of Things system is a very important aspect. It reduces the amount of information that farmers need as decision makers and thus improves the overall agricultural level. In addition, severe weather has a great impact on agricultural development. Forecasting the weather over the medium and long term makes it possible to plan in advance and effectively reduce losses. It also provides guidance for farm management and agricultural insurance [2,3].
Internet of Things (IoT) technology enables sensors to collect data and has provided important technologies for a variety of intelligent systems. The precision agriculture system is one of the most …
(1) Trend component: This refers to the main trend direction of the temperature data. This part often includes linear growth and decline. The trend component reflects the changes in temperature over a long period of time.
(2) Period component per day: The temperature data have an obvious periodic characteristic within 1 day; that is, the value is higher during the day and lower at night.
The third sub-figure, Figure 1c, shows the period component per day. We can see that the temperature has an obvious periodicity of 24 hours. To show the daily period clearly, we show data for 10 days, about 240 hours. The figure shows that the temperature rises from the early hours of the morning and begins to drop after noon.
On the other hand, the bottom sub-figure, Figure 1d, shows the period per year, where the obvious changes across the four seasons can also be seen. The average temperature in winter is lower than that in summer. Therefore, during the year, the temperature changes have two periods: the rotation of the four seasons and the change from day to night. Other weather data, such as relative humidity, follow the same pattern of change.
Because a single network still cannot effectively extract the complex nonlinearity of such multicomponent data, researchers have shown that decomposition is an effective way to develop predictions; that is, the data are decomposed into multiple components to reduce their complexity, and multiple sub-models are then used to improve the prediction performance. For example, García et al. [25] applied a decomposition procedure to decompose data sequentially into smaller seasonal component patterns. A trend was compared to recorded changes in land use at varying distances from a city to determine their possible influence on pollen-count variations, and the decomposition proved highly effective for extracting trend components from time series data. Jesús et al. [26] divided pollen concentration data sequentially into seasonal and residual parts by the decomposition method, used partial least squares regression to fit the residuals, and established an airborne pollen time series model to predict the daily pollen concentration. Ming et al. [27] extracted accurate seasonal signals and used maximum likelihood estimation to estimate the trend of the seasonally adjusted time series data, which improved the prediction accuracy. Qin et al. [28] combined seasonal-trend decomposition procedures based on LOESS (STL) with an echo state network (ESN) for passenger flow prediction, and verified the effectiveness and scalability of the proposed approaches in two passenger flow forecasting applications based on air data and railway data, respectively.
We continued in this decomposition manner, and our innovative contributions are as follows: (1) Based on the characteristics of weather data, we decomposed the data with a sequential two-level structure to extract its daily and yearly periodicity. Compared with [26–28], the proposed two-level decomposition structure can extract the periodic features of weather data more effectively, reduce the complexity of the decomposed components, and improve the performance of the final prediction results; (2) We present a general prediction framework for the IoT system that obtains accurate predictions of weather information, in which four GRU sub-predictors are designed for the decomposed trend, period, and residual components. Using the pick-up-data method, the input and output dimensions were reduced to obtain the long-term prediction for the following 30 days, and by expanding the prediction for the next day, the sub-prediction results were combined to obtain an accurate hourly prediction of temperature and humidity for the next day.
The rest of the paper is organized as follows. Section 2 discusses the research objectives and introduces the data used in the experiments. Section 3 proposes the predictor, in particular the two-level decomposition and the structure of the prediction model. Section 4 gives the experimental results for the temperature and humidity data from the wolfberry plantation in Ningxia, China, which highlight the practical value of the proposed model. Finally, Section 5 summarizes and concludes the paper.

Research Objectives and Data Description
The IoT system was used for a wolfberry plantation in Ningxia, China. Ningxia is one of the largest wolfberry planting areas in China, and wolfberry is a major crop for local farmers. At present, the planting area of Ningxia is 1 million mu, accounting for 33% of the total area of the whole country. The survival and growth of wolfberry plants are closely related to environmental factors such as temperature and humidity. Understanding and predicting these weather factors is therefore extremely important for the precision cultivation of wolfberry.
According to future weather information, the planters can adjust the planting and picking plan, make full use of the advantages of local natural resources, and maintain the sustainable development of the planting industry. Figure 2 shows the IoT system for precision agriculture. The IoT system is mainly composed of five parts: sensors, a display board, a computer, a controller, and an irrigation actuator. Because the planting is outdoors, we constructed the IoT system with a battery-powered wireless temperature and humidity sensor to collect the temperature and humidity data, which are transmitted to the data center (a computer) for storage. The large quantity of stored data was then used to train the deep learning model to give an accurate prediction of future temperature and humidity. The display board was mainly used to display the current weather conditions. The controller was used to control the irrigation actuator.
Based on the actual application needs, prediction results are required for two horizons:
(1) Medium-term prediction: providing accurate predictions of temperature and humidity for the next 24 hours;
(2) Long-term prediction: providing the average daily temperature and humidity for the next 30 days.
The former is used to guide the next day's irrigation plan to ensure the effective use of water resources. Based on the accurate predictions of temperature and humidity for the next 24 hours, the irrigation time and irrigation volume are determined dynamically, and automatic irrigation is then realized through the irrigation controller.
The latter is used to plan fertilization, harvesting, picking, etc. Based on accurate predictions of weather changes, these plans can improve agricultural sustainability.
To verify the prediction model, the temperature and humidity data were used with a total of 28,320 records from January 2016 to December 2017.

Model Framework
The model has three parts, i.e., decomposition, prediction, and combination. The prediction framework is shown in Figure 3. We used a two-level decomposition in which the original data were decomposed into four components. Then, each component was treated separately to obtain different GRU sub-predictors in the network training stage. In the prediction stage, different GRUs were used to predict the different components, respectively. Lastly, all the predictions were combined to get the final predicted results in the output node.

Sequential Two-Level Decomposition
A sequential two-level decomposition was used to decompose the raw time series data. The period of the first-level decomposition was 24 hours, and the trend, period per day, and residual were obtained. Because the residual obtained by the first-level decomposition still had periodicity, we used the second-level decomposition to decompose the residual into three further components. Figure 4 shows the details of the decomposition node in Figure 3. By the first-level decomposition, the weather data, such as the temperature and humidity, were decomposed into the trend $TD_t$, the period per day $PD_t$, and the residual $RD_t$.

First-Level Decomposition
Assume that the time series data $D_t$ contain $N$ samples, i.e., $t = 1, 2, \ldots, N$. The relation between $D_t$ and its three independent components, i.e., trend, period per day, and residual, is shown in (1).
Set the period for the first-level decomposition as 1 day, i.e., 24 hours. For the hourly data used in this study, that means 24 samples. Calculate the number of periods by …
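As a sketch of the first-level relation, assuming an additive model (consistent with the later step in which the sub-predictions are summed), and with $T_1$ and $M$ introduced here as hypothetical symbols for the period length and the number of complete periods, one plausible reading is:

$$D_t = TD_t + PD_t + RD_t, \qquad t = 1, 2, \ldots, N$$

$$M = \left\lfloor \frac{N}{T_1} \right\rfloor, \qquad T_1 = 24$$

Under this reading, $TD_t$ carries the slow drift, $PD_t$ repeats every $T_1$ samples, and $RD_t$ is whatever remains after both are subtracted.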

Second-Level Decomposition
We used a similar method for the second-level decomposition, applied to the first-level residual $RD_t$. Lastly, we obtained the mean value for each day. The relation between $RD_t$ and its three independent components, i.e., trend, period per year, and residual, is the following.
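The two-level idea can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: it assumes an additive model and a simple moving-average trend, and the function names `decompose` and `two_level_decompose` are hypothetical.

```python
from statistics import mean


def decompose(series, period):
    """Additive split of a series into trend + seasonal + residual.

    Assumption: the trend is a simple moving average over about one period
    (window shrunk at the edges); the paper's exact filter is not specified here.
    """
    n = len(series)
    if n < period:
        raise ValueError("series must cover at least one full period")
    half = period // 2
    trend = [mean(series[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]
    detrended = [x - t for x, t in zip(series, trend)]
    # Seasonal part: average of the detrended values at each phase of the period,
    # shifted so that the seasonal pattern has zero mean.
    phase_means = [mean(detrended[p::period]) for p in range(period)]
    offset = mean(phase_means)
    seasonal = [phase_means[i % period] - offset for i in range(n)]
    residual = [x - t - s for x, t, s in zip(series, trend, seasonal)]
    return trend, seasonal, residual


def two_level_decompose(hourly, day=24, year=365 * 24):
    """First level uses a daily period; second level is applied to its residual."""
    trend_day, period_day, residual_day = decompose(hourly, day)
    trend_year, period_year, residual_year = decompose(residual_day, year)
    return trend_day, period_day, trend_year, period_year, residual_year
```

By construction the five returned series sum back to the input. How the two trends are merged into a single trend component is left open in this sketch.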

Deep Learning Predictor
Two trends, i.e., … Using the known input and output data, four GRU networks were trained by supervised learning.

Sub-predictor GRU
The GRU network consists of multiple GRU cells. We set the number of layers to 2. As shown in Figure 5, $S_t$, $t = 1, 2, \ldots, n$, is the input of the GRU network, and $S_{t+n}$, $t = 1, 2, \ldots, n$, is the output. The GRU uses the update gate to control the degree to which the state information of the previous moment is brought into the current state. The update gate and the reset gate were used to model the relation between the input and output data. The forward propagation formulas in each GRU cell are as follows [29]: …
… used the data from the historical 24 hours to predict the data of the future 24 hours. The method proposed in this paper can be combined with other system identification methods [30–32] to study the modeling and prediction of other dynamic time series and random systems [33,34], and can be applied to other fields [35–37] and other signal modeling and control systems [38–41].
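For orientation, the forward pass of a single GRU cell can be sketched as below. This follows the widely used textbook formulation, which may differ in notation or gate convention from the formulas in [29]; the parameter names (`Wz`, `Uz`, `bz`, and so on) are hypothetical. The convention is chosen so that the update gate weights the previous state, matching the description above.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, x, U, h, b):
    """Return W @ x + U @ h + b using plain Python lists."""
    return [
        sum(w * xi for w, xi in zip(W[k], x))
        + sum(u * hi for u, hi in zip(U[k], h))
        + b[k]
        for k in range(len(b))
    ]


def gru_cell(x, h_prev, p):
    """One GRU step: returns the new hidden state (assumed standard formulation)."""
    z = [sigmoid(a) for a in affine(p["Wz"], x, p["Uz"], h_prev, p["bz"])]  # update gate
    r = [sigmoid(a) for a in affine(p["Wr"], x, p["Ur"], h_prev, p["br"])]  # reset gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)]  # reset gate scales the old state
    cand = [math.tanh(a) for a in affine(p["Wh"], x, p["Uh"], gated, p["bh"])]  # candidate state
    # The update gate z decides how much of the previous state is carried over.
    return [zi * hi + (1.0 - zi) * ci for zi, hi, ci in zip(z, h_prev, cand)]


def gru_forward(inputs, h0, p):
    """Run the cell over a sequence of input vectors and return the final state."""
    h = h0
    for x in inputs:
        h = gru_cell(x, h, p)
    return h
```

With all weights set to zero, both gates equal 0.5 and the candidate state is zero, so each step simply halves the previous hidden state, which is a convenient sanity check.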

Long-term Prediction
As mentioned in Section 2, long-term prediction provides the average daily temperature and humidity for the next 30 days. The period-per-year component of temperature and humidity over 30 days was used as the input and output data for training the GRU. The trend and residual components were used as compensation for the prediction.
We know from Section 3.2.2 that the period component per year $PY_t$ is obtained from hourly data. Using the picked-up period per year $PPY_t$, the input and output dimensions were each reduced to 30. We organized the training dataset by pushing back one step at a time. The pushing process is shown in Figure 7, in which the blue bars are input data and the orange bars are output data. Figure 7a shows the input data from the 1st to the 30th point and the output data from the 31st to the 60th point of $PPY_t$, and Figure 7b shows the input data from the 2nd to the 31st point and the output data from the 32nd to the 61st point of $PPY_t$. Table 1 gives the 600 sets of input and output data used for training the sub-predictor GRU. The advantage of this overlapping input and output is that whenever new data are acquired, the window can be rolled forward based on the new data.
[Figure: period per year and picked-up period per year (PPY)]
Once the parameters of the sub-predictor GRU were obtained, the period per year could be predicted from the test input data. By adding the mean of the trend and residual components, we obtained the long-term prediction of the average daily temperature and humidity for the next 30 days.
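The overlapping input/output pairs used for training the long-term sub-predictor can be built with a sliding window. The sketch below is illustrative only: `make_windows` is a hypothetical helper, and the integer list stands in for the picked-up daily values.

```python
def make_windows(series, n_in, n_out, step):
    """Return (input, output) pairs taken from a sliding window over the series.

    Each pair uses n_in consecutive points as input and the following
    n_out points as output; the window start advances by `step` points.
    """
    pairs = []
    last_start = len(series) - n_in - n_out
    for start in range(0, last_start + 1, step):
        x = series[start:start + n_in]
        y = series[start + n_in:start + n_in + n_out]
        pairs.append((x, y))
    return pairs


# Stand-in for the picked-up period-per-year series (point numbers 1..100).
ppy = list(range(1, 101))
pairs = make_windows(ppy, n_in=30, n_out=30, step=1)
first_x, first_y = pairs[0]    # points 1-30 as input, 31-60 as output
second_x, second_y = pairs[1]  # points 2-31 as input, 32-61 as output
```

Calling the same helper with `n_in=24`, `n_out=24`, and `step=24` would give non-overlapping daily windows, which is one way to read the "pushing back 24 steps" layout used for the medium-term sub-predictors.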

Medium-term Prediction
Medium-term prediction provides accurate predictions of temperature and humidity for the next 24 hours. The trend component $T_t$, the period per day $PD_t$, and the residual $RY_t$ were used to train the other three GRUs as the sub-predictors. The input and output data were each set to 24 points. The training dataset was organized by pushing back 24 steps at a time. Tables 2-4 give the 600 sets of input and output data used for training.

With the training data shown in Tables 2-4, we obtained three GRUs to predict the trend component $T_t$, the period per day $PD_t$, and the residual $RY_t$ for the next 24 hours. Based on the long-term prediction of the average daily temperature for the next 30 days, we expanded the next day's value to 24 hourly values equal to the average daily temperature (shown with a yellow star in the top sub-figure of Figure 8). Then, the predictions for the next 24 hours, including the trend component, the daily mean value from the period per year, the period per day, and the residual, were added to obtain the combined prediction.
[Figure 8: Predictions by the sub-predictors (trend, daily mean, period per day, residual) and their sum, the combined prediction]
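The combination step can be illustrated as follows. This is a sketch under assumptions: the function name `combine_next_day` is hypothetical, and the first entry of `daily_values` is taken to be the next day's daily mean from the period-per-year prediction, repeated once for every hour.

```python
def combine_next_day(trend, period_day, residual, daily_values):
    """Sum the hourly sub-predictions with the expanded next-day daily value.

    trend, period_day, residual: hourly predictions for the next day.
    daily_values: daily predictions from the long-term sub-predictor;
    only the first entry (the next day) is used here.
    """
    if not (len(trend) == len(period_day) == len(residual)):
        raise ValueError("hourly components must have the same length")
    # Expand the single daily value to one value per hour.
    expanded = [daily_values[0]] * len(trend)
    return [t + m + p + r for t, m, p, r in zip(trend, expanded, period_day, residual)]
```

For a 24-hour horizon, each of the three hourly lists would hold 24 values, and the result is the combined hourly prediction.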

Experimental Setup
The data used for training the model were the collected hourly temperature and humidity data, with about 35,040 records from January 2016 to December 2017. In the experiments, the ratio of the training set to the test set was 80:20.
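An 80:20 split of time series data is commonly made in chronological order, so that the test set lies entirely after the training set. Whether the split here was chronological is an assumption, and `chronological_split` is a hypothetical helper.

```python
def chronological_split(samples, train_percent=80):
    """Split samples in time order: the earliest part trains, the rest tests.

    Integer arithmetic is used so the cut point is exact.
    """
    cut = len(samples) * train_percent // 100
    return samples[:cut], samples[cut:]


records = list(range(10))  # stand-in for time-ordered samples
train, test = chronological_split(records)
```

Keeping the order intact avoids training on samples that come later than the ones used for testing.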
The experimental hardware and software environments were set up to run the proposed prediction model. The open-source deep learning library Keras, based on TensorFlow, was used to build all learning models. All experiments were performed on a PC with an Intel® Core™ i5-4200U CPU (1.60 GHz) and 4 GB of memory. To model the deep neural network effectively, a large number of hyperparameters need to be set. In the experiments, the default parameters in Keras were used for deep neural network initialization (e.g., weight initialization). We also used tanh as the activation function and ReLU as the activation function of the GRU model.
Usually, when neural networks are used to build models, the size of the network layers and the number of neurons are not strictly defined. Instead, the complexity of the model structure is determined based on the data. We determined the parameters of each layer of the model through repeated experimental adjustment. Specifically, we used the ReLU function. There were two GRU layers: for predicting the picked-up period per year, the first layer had 30 neurons and the second layer had 30; the other GRUs had 24 neurons in each of the two layers (the number of neurons in the layers is determined by the output dimension of the model). In addition, all models were trained in a supervised manner with the Adam algorithm, which optimizes a predetermined objective function to obtain the model parameters.
Figure 9 shows the long-term prediction of the temperature, in which the blue dotted line is the temperature data, the orange line is the average daily temperature, and the gray line is the prediction for the next 30 days. For the first several days, the predictions are very close to the average daily temperature, while on the 10th day the prediction is lower than the average daily temperature, and on the 15th and 23rd days it is higher. We can conclude that the long-term prediction captures the rough changes of the future; however, because future weather is greatly affected by uncertain factors, the effects of warm and cold air currents cannot be predicted accurately. The prediction of humidity behaves similarly (shown in Figure 10).
Because long-term forecasts are made every day, planting plans can be changed at any time if the weather changes suddenly. The long-term prediction of weather information is used to draw up rough plans for fertilization, harvesting, etc. for the next month, which is very beneficial for sustainable high-quality agricultural production.
[Figure 10: The result of the long-term prediction of relative humidity. Axes: Time (day) and relative humidity (%). Legend: relative humidity, average daily relative humidity, predictions.]

The blue line is the temperature data and the orange line is the prediction result. The two lines are very close to each other, which indicates that the obtained predictions are very accurate. Section 4.3 gives the numerical evaluation.

Comparing with Other Predictors
In this experiment, the proposed model was compared with eight other models: RNN [42], LSTM [43], BiLSTM [44], GRU [45], and decomposition models that use seasonal-trend decomposition procedures based on LOESS (STL) [19] with RNN, LSTM, BiLSTM, and GRU as the sub-predictors, respectively. To evaluate the performance of the proposed predictor, the root mean square error (RMSE), shown in Equation (5), was used as the evaluation metric to measure the difference between the model predictions and the collected data.
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_{obs,i} - x_{pre,i}\right)^{2}} \qquad (5)$$
where $N$ is the number of samples in the prediction dataset, $x_{obs}$ represents the collected data, and $x_{pre}$ is the predicted value. The temperature and humidity data are used to show the prediction results. Table 5 compares the predicted results of the RNN, LSTM, BiLSTM, GRU, STL_RNN (STL-based RNN), STL_LSTM (STL-based LSTM), STL_BiLSTM, STL_GRU, and two-level decomposition based on GRU (proposed in Section 3) models in terms of RMSE, and Figure 13 shows the histogram of RMSE. It is apparent from the comparison that the decomposition models significantly outperform the undecomposed ones, and that the proposed model predicts more accurately than the other models. For example, the prediction RMSE of the proposed model for temperature is approximately 20.34% and 2.04% lower, respectively, than that of the GRU and STL_LSTM models; for humidity, the RMSEs are about 8.61% and 2.19% lower, respectively. The results show that the developed two-level decomposition is effective because the RMSEs can be significantly reduced, and that the GRU is the best choice as the sub-predictor.
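The metric and the percentage comparisons can be reproduced with a short sketch. The helper names `rmse` and `relative_reduction` are hypothetical, and computing "x% lower" as the reduction relative to the baseline RMSE is an assumption about how the reported percentages were obtained.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between collected and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def relative_reduction(baseline_rmse, proposed_rmse):
    """Percentage by which the proposed RMSE is lower than the baseline RMSE."""
    return 100.0 * (baseline_rmse - proposed_rmse) / baseline_rmse
```

For example, a baseline RMSE of 2.0 and a proposed RMSE of 1.5 give a 25% reduction under this definition.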

Conclusions
In the precision agricultural IoT system, accurate prediction of weather data is a key way to improve the performance of the IoT system. The deep learning approach features self-learning capabilities and exhibits excellent performance in complex sensor data.
In this study, the two-level sequential decomposition structure was used to decompose the weather data according to different periods, thus reducing the complex nonlinear relationships in the raw sensor data. Multiple GRUs were designed as sub-predictors, and their prediction results were combined to obtain long-term and medium-term predictions of weather data. Verification with real data showed that the proposed model has higher prediction accuracy and can meet the needs of precision agriculture. In precision agriculture, the use of the Internet of Things system can effectively reduce the workload of farmers and increase farmers' awareness of precision agricultural tools. According to our results, long-term weather prediction can provide important guidance for planning a reasonable growth cycle of crops. In addition, it can help farmers manage farms. For example, severe weather can be forecast and estimated in advance, so as to reduce risks and increase income.