Artificial-Intelligence-Based Time-Series Intervention Models to Assess the Impact of the COVID-19 Pandemic on Tomato Supply and Prices in Hyderabad, India

Abstract: The objective of this study was to assess the impact of the COVID-19 pandemic on tomato supply and prices in the Gudimalkapur market in Hyderabad, India. The lockdown imposed by the Government of India from 25 March 2020 to 30 June 2020 particularly affected the supply chain of perishable agricultural products, including tomatoes, one of the major vegetable crops in the study area. Classical time series models, namely autoregressive integrated moving average (ARIMA) and ARIMA intervention models, and artificial intelligence (AI)-based time series models, namely support vector regression (SVR) intervention and artificial neural network (ANN) intervention models, were used to predict tomato supplies and prices in the studied market. The modelling results show that the pandemic had a negative impact on tomato supply and a positive impact on tomato prices. Moreover, the ANN intervention model outperformed the other models in both the training and test data sets. The superior performance of the ANN intervention model could be due to its ability to account for the nonlinear and complex nature of the data together with an exogenous intervention variable.


Introduction
Marketing is critical to moving agricultural products from producer to consumer and to maintaining price stability. Changes in the demand for and supply of agricultural products, and in their marketing, must be coordinated with projected increases in agricultural production. The COVID-19 crisis has caused significant damage to the national and global economy because of the lockdown measures initiated in March 2020 in many countries, including India. Under the imposed lockdown, agricultural supply chains were notably disrupted and deliveries of agricultural foodstuffs were delayed. Tomatoes are the second most productive crop in the world after potatoes, with a productivity of 32.8 t/ha. The major tomato-producing states in India are Andhra Pradesh, Madhya Pradesh, Karnataka, Gujarat, Odisha, and West Bengal. With a production of 2744.32 thousand tons, Andhra Pradesh has the highest tomato production in India, followed by Madhya Pradesh with 2419.28 thousand tons. Tomato is grown throughout the year in Telangana, which ranks seventh in India's tomato production with a total area of over 41,000 hectares and production of 1171.5 tons [1]. Tomato is one of the major vegetable crops whose supply and price are affected by unexpected changes in government policies or by other interventions.
Agronomy 2021, 11, 1878
for an unknown time series. Most classical time series models, such as ARIMA and ARIMA intervention models, have been used to analyze the impact of policies or sudden changes in agriculture. The classical time series models fail to capture the nonlinear pattern in time series intervention data sets. To overcome this problem, we have developed ANN- and SVR-based intervention models by incorporating the intervention as an exogenous variable in the input layer. The objective of this study was to investigate the impact of the national lockdown in India on tomato supply and prices in the Gudimalkapur market in Hyderabad, Telangana, by applying the intervention-based ANN and SVR models and comparing them with some widely used approaches.

Data Description
The secondary data on daily tomato prices and arrivals from 1 January 2020 to 30 June 2020 were collected from the website http://tsmarketing.in/DailyArrivalsnPricesCommBetweenDates.aspx, accessed on 12 September 2020. The Government of India announced a nationwide lockdown from 25 March 2020 to 30 June 2020; thus, 25 March 2020 was taken as the intervention date. The data from 1 January 2020 to 24 March 2020 were treated as the pre-intervention period, and the data from 25 March 2020 to 30 June 2020 as the post-intervention period. For modeling and forecasting, the data from 1 January 2020 to 23 June 2020 were used for model building, and the data from 24 June 2020 to 30 June 2020 were used for model testing and validation. The map of the study area is shown in Figure 1.
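The period definitions above can be sketched in a few lines of Python. This is only an illustration: it assumes one observation per calendar day, whereas the market data may have missing days, and the variable names are hypothetical.

```python
from datetime import date, timedelta

# Dates taken from the text; one observation per calendar day is an assumption.
START = date(2020, 1, 1)
END = date(2020, 6, 30)
INTERVENTION = date(2020, 3, 25)  # first day of the lockdown
TEST_START = date(2020, 6, 24)    # first day of the hold-out week


def daily_range(start, end):
    """Return every calendar date from start to end, inclusive."""
    n_days = (end - start).days + 1
    return [start + timedelta(days=i) for i in range(n_days)]


days = daily_range(START, END)

# Pre- and post-intervention periods.
pre_intervention = [d for d in days if d < INTERVENTION]
post_intervention = [d for d in days if d >= INTERVENTION]

# Model-building and testing periods.
train_days = [d for d in days if d < TEST_START]
test_days = [d for d in days if d >= TEST_START]

print(len(days), len(pre_intervention), len(post_intervention))
print(len(train_days), len(test_days))
```

Under the calendar-day assumption, the hold-out set contains only the last seven days, so test-set error measures rest on very few observations.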

ARIMA Model
The Box-Jenkins [2] ARIMA model is the most commonly used model for forecasting time series data. It extends the ARMA process to a time series Y_t that is non-stationary, or integrated. In that case, the series must first be differenced to make it stationary, and the ARMA model is then fitted to the differenced, stationary series. The resulting model is denoted ARIMA (p,d,q), where p and q are the numbers of AR and MA terms, respectively, and d is the order of differencing required to make the series stationary. An ARIMA model is expressed as follows:

$$\phi(B)(1 - B)^{d} Y_t = \theta(B)\,\varepsilon_t \qquad (1)$$

where $\phi(B) = 1 - \phi_1 B - \dots - \phi_p B^{p}$ (autoregressive parameter), $\theta(B) = 1 - \theta_1 B - \dots - \theta_q B^{q}$ (moving average parameter), d = differencing term, B = backshift operator, i.e., $B^{a} Y_t = Y_{t-a}$, and $\varepsilon_t$ = white noise or error term. The ARIMA methodology is carried out in three steps: identification, estimation, and diagnostic checking. For diagnostic checking, the Box-Pierce non-correlation test is most commonly used.
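Two ingredients of the procedure above, differencing with (1 − B)^d and the Box-Pierce statistic used in diagnostic checking, can be illustrated with a minimal standard-library sketch. The function names are hypothetical, and the code is not the software used in the study.

```python
def difference(series, d=1):
    """Apply the operator (1 - B)^d, i.e. difference the series d times."""
    out = list(series)
    for _ in range(d):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


def autocorrelation(residuals, lag):
    """Sample autocorrelation of the residuals at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centred = [e - mean for e in residuals]
    denom = sum(c * c for c in centred)
    num = sum(centred[t] * centred[t - lag] for t in range(lag, n))
    return num / denom


def box_pierce_q(residuals, max_lag):
    """Box-Pierce statistic Q = n * (sum of squared autocorrelations)."""
    n = len(residuals)
    return n * sum(
        autocorrelation(residuals, k) ** 2 for k in range(1, max_lag + 1)
    )


# A quadratic trend becomes constant after differencing twice.
print(difference([1, 4, 9, 16], d=1))
print(difference([1, 4, 9, 16], d=2))
```

A large Q relative to a chi-square reference distribution indicates autocorrelated residuals. The p-value is not computed here because the standard library has no chi-square distribution function.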

ARIMA Intervention Model
ARIMA intervention analysis [10] is a time series analysis technique that uses modelling approaches to incorporate the effect of exogenous forces or interventions. Government policies, strikes, earthquakes, price shifts, floods, pandemics, and other unforeseen catastrophes are all examples of interventions. They produce unexpected shifts in time series. Simply put, intervention analysis in time series refers to the study of how the mean level of a series changes as a result of an intervention.
The ARIMA intervention model is expressed as follows:

$$Y_t = \frac{\omega(B)}{\delta(B)} B^{b} I_t + \frac{\theta(B)}{\phi(B)}\,\varepsilon_t \qquad (2)$$

Here, $Y_t$ is the dependent variable and $I_t$ is the indicator variable, coded according to the type of intervention (step, pulse/point, and impulse). $\delta(B) = 1 + \delta_1 B + \dots + \delta_r B^{r}$ is the slope parameter, $\omega(B) = \omega_0 + \omega_1 B + \dots + \omega_s B^{s}$ is the impact parameter, $\theta(B)$ and $\phi(B)$ are the moving average and autoregressive parameters of the noise term, b is the delay parameter, B is the backshift operator, i.e., $B^{a} Y_t = Y_{t-a}$, and $\varepsilon_t$ is the white noise or error term. A step intervention occurs at a specific point in time and persists over successive time periods. The impact of a step intervention may be stable over time, or it may rise or diminish.
Since the occurrence of COVID-19 is a step-type intervention, the indicator variable is coded as follows:

$$I_t = \begin{cases} 0, & t < T \\ 1, & t \geq T \end{cases}$$

The COVID-19 intervention, which took the form of a lockdown, began on 25 March 2020 and was later extended in multiple phases. As a result, the indicator variable $I_t$ was given a value of 0 before the intervention period and 1 during the intervention period.
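The step coding can be made concrete with a small Python sketch. It deliberately strips the model down to Y_t = μ + ω_0 I_t + noise, with no delay and no ARMA structure, which is a simplifying assumption and not the model fitted in this study. In that special case the least-squares estimate of ω_0 is the difference between the post- and pre-intervention means. The toy numbers are made up.

```python
def step_indicator(n_obs, start_index):
    """I_t = 0 for t < T and 1 for t >= T, with T given as a 0-based index."""
    return [0 if t < start_index else 1 for t in range(n_obs)]


def level_shift_estimate(series, indicator):
    """Least-squares estimate of omega_0 in Y_t = mu + omega_0 * I_t + noise.

    With a pure step and no ARMA noise structure (an assumption of this
    sketch), the estimate is mean(post) - mean(pre).
    """
    pre = [y for y, i in zip(series, indicator) if i == 0]
    post = [y for y, i in zip(series, indicator) if i == 1]
    return sum(post) / len(post) - sum(pre) / len(pre)


# Made-up toy series: level 10 before the intervention, level 4 after it.
toy_series = [10, 11, 9, 10, 10, 4, 5, 3, 4, 4]
indicator = step_indicator(len(toy_series), start_index=5)
omega_hat = level_shift_estimate(toy_series, indicator)
print(indicator, omega_hat)
```

A negative estimate corresponds to a drop in the mean level after the intervention, and a positive one to a rise.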

Support Vector Regression (SVR) Model
Support vector machines were originally developed for classification problems and were later adapted to regression problems by adding the ε-insensitive loss function [32]. The main concept behind SVR is to solve a nonlinear regression problem in a linear manner by mapping the nonlinear input data from a lower dimensional input space to a higher dimensional feature space. The SVR estimation function is written as follows:

$$f(x) = w^{T}\Phi(x) + b \qquad (3)$$

where $\Phi(\cdot)$ stands for a nonlinear space transformation, w is the weight vector, and b stands for the bias. By further introducing a kernel function, $\Phi$ no longer needs to be given explicitly in the SVR estimation function, which becomes:

$$f(x) = \sum_{i=1}^{N} (\alpha_i - \alpha_i^{*})\, k(x, x_i) + b \qquad (4)$$

where $k(x, x_i)$ is the kernel function and $\alpha_i$, $\alpha_i^{*}$ are the Lagrange multipliers of the dual problem. The radial basis function (RBF) kernel is the most commonly used kernel function in SVR and is represented as follows:

$$k(x, x_i) = \exp\left(-\gamma \lVert x - x_i \rVert^{2}\right) \qquad (5)$$

The coefficients w and b are estimated from the data by minimizing the following regularized risk function:

$$R(C) = \frac{1}{2}\lVert w \rVert^{2} + C\,\frac{1}{N}\sum_{i=1}^{N} L_{\varepsilon}\left(y_i, f(x_i)\right) \qquad (6)$$

This regularized risk function minimizes the empirical error and the regularized term simultaneously, which helps to avoid both under- and overfitting of the model. In Equation (6), the first term $\frac{1}{2}\lVert w \rVert^{2}$ is called the 'regularized term' and measures the flatness of the function; minimizing it makes the function as flat as possible. The second term $\frac{1}{N}\sum_{i=1}^{N} L_{\varepsilon}(y_i, f(x_i))$ is called the 'empirical error' and is estimated by the Vapnik ε-insensitive loss function as follows:

$$L_{\varepsilon}\left(y_i, f(x_i)\right) = \begin{cases} \lvert y_i - f(x_i) \rvert - \varepsilon, & \lvert y_i - f(x_i) \rvert \geq \varepsilon \\ 0, & \text{otherwise} \end{cases} \qquad (7)$$

The performance of SVR with the RBF kernel depends on the optimization of two hyperparameters: the regularization parameter C, which balances the complexity and the approximation accuracy of the model, and the kernel bandwidth parameter γ, which represents the variance of the RBF kernel function [33].
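The building blocks of SVR described above (RBF kernel, ε-insensitive loss, kernel expansion, and regularized risk) can be written out as a short standard-library sketch. The function names and argument layouts are hypothetical. In practice the dual coefficients are obtained by solving a quadratic programme, which this sketch does not do.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel k(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def eps_insensitive_loss(y, y_hat, eps):
    """Vapnik loss: zero inside the eps-tube, linear outside it."""
    return max(0.0, abs(y - y_hat) - eps)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Kernel expansion f(x) = sum_i coef_i * k(x, x_i) + b.

    Each dual coefficient stands for (alpha_i - alpha_i*); the values are
    assumed to come from an already trained model.
    """
    return bias + sum(
        c * rbf_kernel(x, sv, gamma)
        for c, sv in zip(dual_coefs, support_vectors)
    )


def regularized_risk(weights, targets, predictions, C, eps):
    """0.5 * ||w||^2 + C * (mean eps-insensitive loss)."""
    flatness = 0.5 * sum(w * w for w in weights)
    empirical = sum(
        eps_insensitive_loss(y, f, eps) for y, f in zip(targets, predictions)
    ) / len(targets)
    return flatness + C * empirical
```

A larger C puts more weight on the empirical error relative to flatness, and a larger γ makes the kernel more local. This is why the two hyperparameters are usually tuned jointly.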

Artificial Neural Network (ANN) Model
The ANN has been the most widely used AI technique in time series modelling and prediction over the last three decades. In the field of time series modelling, the ANN is commonly referred to as an autoregressive neural network because it takes time lags as inputs. The time series framework for the ANN can be modelled mathematically using a neural network with an implicit functional representation of time. The final output $Y_t$ of a multilayer feedforward autoregressive neural network is expressed as follows:

$$Y_t = \alpha_0 + \sum_{j=1}^{q} \alpha_j\, g\left(\beta_{0j} + \sum_{i=1}^{p} \beta_{ij} Y_{t-i}\right) + \varepsilon_t \qquad (8)$$

where $\alpha_j$ (j = 0, 1, 2, ..., q) and $\beta_{ij}$ (i = 0, 1, 2, ..., p; j = 1, 2, ..., q) are the model parameters, also called the synaptic weights; p is the number of input nodes; q is the number of hidden nodes; and g is the activation function. Training the ANN minimizes the error function between the actual and predicted values. The error function of the autoregressive ANN is expressed as follows:

$$E = \frac{1}{N}\sum_{t=1}^{N} e_t^{2} = \frac{1}{N}\sum_{t=1}^{N} \left(Y_t - \hat{Y}_t\right)^{2} \qquad (9)$$

where N is the total number of error terms. The parameters of the neural network $w_{ij}$ are updated by an amount $\Delta w_{ij}$, where $\Delta w_{ij} = -\eta\, \frac{\partial E}{\partial w_{ij}}$ and η is the learning rate [34].
The schematic representation of the neural network structure is depicted in Figure 2.
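The network output, error function, and weight update described above can be illustrated with a minimal sketch. It assumes a single hidden layer with a logistic activation and a linear output node, and the weight layout (one row per hidden node, bias first) is a hypothetical choice made for the example.

```python
import math


def logistic(z):
    """Sigmoid activation g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def arnn_forecast(lags, alpha, beta):
    """One-step output of a single-hidden-layer autoregressive network.

    lags  : [Y_{t-1}, ..., Y_{t-p}]
    alpha : [alpha_0, alpha_1, ..., alpha_q] (output bias and weights)
    beta  : q rows, each [beta_0j, beta_1j, ..., beta_pj]
            (hidden-node bias followed by its input weights)
    """
    hidden = [
        logistic(row[0] + sum(w * y for w, y in zip(row[1:], lags)))
        for row in beta
    ]
    return alpha[0] + sum(a * h for a, h in zip(alpha[1:], hidden))


def mean_squared_error(actual, predicted):
    """Error function E = (1/N) * sum of squared errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def gradient_step(weight, gradient, learning_rate):
    """Apply the update w + delta_w, with delta_w = -eta * dE/dw."""
    return weight - learning_rate * gradient


# One hidden node with all-zero input weights: g(0) = 0.5, so the
# forecast is alpha_0 + alpha_1 * 0.5 regardless of the lag values.
print(arnn_forecast([1.0, 2.0], alpha=[0.5, 2.0], beta=[[0.0, 0.0, 0.0]]))
```

In a full training loop the gradient would come from backpropagation. Here it is passed in directly to keep the sketch short.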

Artificial Intelligence (AI)-Based Intervention Models
The traditional artificial intelligence approach allows for forecasting based solely on the past values of the forecast variable. The intervention model instead assumes that the future values of a variable are determined by its previous values as well as by the past values of an exogenous variable. The artificial intelligence intervention model is therefore a modified version of the artificial intelligence model that adds an additional independent variable, called the intervention variable; this model is also referred to as the vector artificial intelligence model. For a given univariate time series $\{x_t, t = 1, 2, \dots, n\}$, where $x_t \in \mathbb{R}$, artificial intelligence forecasting models typically assume that each observed value is an unknown nonlinear function F of c lags $t_1, t_2, \dots, t_c$:

$$x_t = F\left(x_{t-t_1}, x_{t-t_2}, \dots, x_{t-t_c}\right) + \varepsilon_t \qquad (10)$$

where the error $\varepsilon_t$ has zero mean. Next, we assume that m interventions have been observed at time periods $r_1, r_2, \dots, r_m$. Depending on the nature of the interventions, we define m auxiliary variables $\delta_t^{1}, \delta_t^{2}, \dots, \delta_t^{m}$. As a result, we obtain a nonlinear forecasting model with c lags $t_1, t_2, \dots, t_c$ and m interventions:

$$x_t = F\left(x_{t-t_1}, x_{t-t_2}, \dots, x_{t-t_c}, \delta_t^{1}, \delta_t^{2}, \dots, \delta_t^{m}\right) + \varepsilon_t \qquad (11)$$

In this article, two AI-based intervention models, namely the SVR and ANN intervention models, were developed using the intervention concept explained in this section. In addition to the models explained above, the BDS (Brock-Dechert-Scheinkman) test for nonlinearity and the Diebold-Mariano test for comparing the significance of differences in model performance are used in this study; details of these tests are available in the literature [35,36]. Finally, the mean absolute percentage error (MAPE), the most commonly used measure of forecasting error, is computed as follows:

$$\mathrm{MAPE} = \frac{1}{N}\sum_{t=1}^{N} \left\lvert \frac{A_t - F_t}{A_t} \right\rvert \times 100 \qquad (12)$$

where $A_t$ is the actual value, $F_t$ is the forecast or predicted value, and N is the number of observations.

Table 1 shows the descriptive statistics for all four series of the Gudimalkapur market. The arrival and price series are highly skewed and leptokurtic. The coefficients of variation are 31%, 78%, 80%, and 76% for arrivals, maximum price, minimum price, and modal price, respectively, indicating that the data are inherently heterogeneous. The BDS (Brock-Dechert-Scheinkman) test indicates that the data under consideration are nonlinear, as the probability value is p < 0.0001 for all four series of the Gudimalkapur market (Table 2).
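The methods described above feed the lagged values and the intervention dummy to the learner as one input vector and score forecasts with MAPE. The sketch below shows both steps for a single intervention. The helper names are hypothetical, and MAPE is undefined when an actual value is zero, which the sketch does not guard against.

```python
def build_inputs(series, interventions, lags):
    """Pair each x_t with its lagged values and the intervention dummy at t.

    Returns (rows, targets): each row holds the lagged values in the order
    given by `lags`, followed by the dummy; the target is x_t.
    """
    max_lag = max(lags)
    rows, targets = [], []
    for t in range(max_lag, len(series)):
        lagged = [series[t - k] for k in lags]
        rows.append(lagged + [interventions[t]])
        targets.append(series[t])
    return rows, targets


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Toy example with lags 1 and 2 and a step starting at index 3.
rows, targets = build_inputs([1, 2, 3, 4, 5], [0, 0, 0, 1, 1], lags=(1, 2))
print(rows, targets)
print(mape([100.0, 200.0], [110.0, 180.0]))
```

The rows produced this way could be passed to any regression learner, including an SVR or a feedforward network. The only change relative to the non-intervention model is the extra dummy column.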

Results of ARIMA Models
For the supply series of the Gudimalkapur market, ARIMA (1,0,0) was found to be suitable; for the price series, ARIMA (0,1,1) was found to be suitable for each of the maximum price, minimum price, and modal price (Table 3). The parameters of the models were estimated using the maximum likelihood method, and the estimated values are given in Table 3. Diagnostic testing was performed using the Box-Pierce non-correlation test, and the results show that the residuals are random and not autocorrelated, as the probability values are 0.61, 0.81, 0.97, and 0.98 for the arrivals, maximum price, minimum price, and modal price, respectively.

Results of ARIMA Intervention Models
As with the ARIMA model, ARIMA intervention models were built for all four time series. The ARIMA intervention model (1,0,0) was found to be appropriate for the supply series, and the ARIMA intervention model (0,0,1) was found to be appropriate for each of the maximum price, minimum price, and modal price series. The parameters estimated using the maximum likelihood technique are shown in Table 4. The intervention parameters (impact (ω)) are estimated as −5.73 (p = 0.023), 171.29 (p = 0.029), 179.82 (p = 0.017), and 124.88 (p = 0.039) for supplies, maximum price, minimum price, and modal price, respectively. The results show that the lockdown had a negative impact on tomato supplies, amounting to about 600 kg less tomato supplied per day, and a positive impact on the price series: the maximum price increased by Rs.17.1/t/day, the minimum price by Rs.18/t/day, and the modal price by Rs.12.5/t/day for the Gudimalkapur market. The fitted models were appropriate, as the Box-Pierce non-correlation test shows that the residuals are random and not autocorrelated, with probability values of 0.71, 0.83, 0.93, and 0.98 for arrivals, maximum price, minimum price, and modal price, respectively. Similar results were obtained in the study conducted by Ray and others [12] on cotton yield prediction, in which the performance of ARIMA intervention models was superior to that of conventional ARIMA models for all three locations.

Results of SVR and SVR Intervention Models
The SVR and SVR intervention models (Table 5) were constructed on the basis of the required user-defined parameters. The radial basis function (RBF) was used as the kernel function. For the SVR model, the number of support vectors obtained at the optimal level was 117 for the arrival series and 118, 123, and 116 for the maximum price, minimum price, and modal price, respectively. For the SVR intervention model, the number of support vectors obtained at the optimal level was 111 for the arrival series and 115, 133, and 112 for the maximum price, minimum price, and modal price, respectively. The Box-Pierce non-autocorrelation test shows that the residuals are random and not autocorrelated (Table 5).

Results of ANN and ANN Intervention Model
On the basis of the lowest training MAPE, a model with three tapped delays and ten hidden nodes (3:10S:1L) was selected for both the arrival and maximum price series, and a model with two tapped delays and nine hidden nodes (2:9S:1L) was selected for both the minimum price and modal price series. Diagnostic testing of the residuals by the Box-Pierce test was performed after model fitting. The residuals were random and not autocorrelated, as the probability values are 0.79, 0.69, 0.63, and 0.73 for the arrivals, maximum price, minimum price, and modal price, respectively. For the arrival and maximum price series of the Gudimalkapur market, the intervention model with three tapped delays, one exogenous intervention variable, and ten hidden nodes (4:10S:1L) was selected; for the minimum price and modal price, the model (3:9S:1L) was selected in each case because of its low MAPE values. The sigmoidal activation function was used from the input to the hidden layer and the linear activation function from the hidden to the output layer. The residual diagnostics for all four series of the Gudimalkapur market are shown in Table 6, indicating that the residuals are random and not autocorrelated.
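To give a sense of the size of the selected networks, the sketch below counts the weights implied by an architecture written as inputs:hidden:outputs. It assumes one bias term per hidden node and per output node, in line with the general single-hidden-layer form; the paper does not report parameter counts, so these figures are illustrative only.

```python
def count_weights(n_inputs, n_hidden, n_outputs=1):
    """Number of weights in a single-hidden-layer feedforward network.

    Assumes one bias per hidden node and one bias per output node.
    """
    hidden_layer = n_hidden * (n_inputs + 1)
    output_layer = n_outputs * (n_hidden + 1)
    return hidden_layer + output_layer


# Architectures named in the text, written as inputs:hidden:outputs.
for label, shape in [
    ("3:10:1", (3, 10, 1)),
    ("4:10:1", (4, 10, 1)),
    ("2:9:1", (2, 9, 1)),
    ("3:9:1", (3, 9, 1)),
]:
    print(label, count_weights(*shape))
```

Adding the intervention dummy as a fourth input to a ten-node hidden layer adds ten weights, one per hidden node.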

Discussion
The results show that the lockdown had a negative effect on tomato supply and a positive effect on the tomato price series. About 600 kg fewer tomatoes were supplied per day because of the lockdown. The results also show that the maximum price increased by Rs.17.1/t/day, the minimum price by Rs.18/t/day, and the modal price by Rs.12.5/t/day. The negative impact of the lockdown on deliveries is likely to be related to the decline in production during this period and could also be due to the problems faced by farmers and suppliers in transporting tomatoes to the market. The low volume of deliveries might have contributed to the higher prices, which is evident from the positive impact of the lockdown on the price series. The forecasting performance of the six selected models (ARIMA, ARIMA intervention, SVR, SVR intervention, ANN, and ANN intervention) was evaluated on the basis of model errors in both the training and test data sets. Based on the MAPE values obtained (Table 7), the ANN intervention model outperformed the ANN and the other models in both the model building and validation data sets for the supply and price time series.
The ARIMA and ARIMA intervention models, as well as the SVR models, produced the same predicted value for all days of the forecast period, implying that, unlike the ANN intervention model, these models lack the generalization ability to provide different predicted values.
In this study, the developed AI models outperformed the classical ARIMA and ARIMA intervention models. Among the AI models, the ANN intervention model performed better than all other models in both the training and test data sets. The performance hierarchy for the Gudimalkapur market in the training and testing data sets is ANN intervention > ANN > SVR intervention > SVR > ARIMA intervention > ARIMA in all four series. The MAPE values only show the observed difference between the actual and predicted values and do not indicate whether differences between models are statistically significant. We therefore used the Diebold-Mariano (DM) test statistic to identify significant differences in performance between the models. The DM test shows that the ANN intervention model was significantly superior to the other models in both the training and test data sets; the pairwise comparisons are shown in Table 8. The actual vs. fitted plots of the different models for all four series are shown in Figure 4, and the actual vs. fitted values are given in supplementary Tables S1 and S2. The empirical results show the superiority of the ANN intervention model over the other models examined in this study. This superiority may be due to its ability to mimic the nonlinear and detailed nature of the data while allowing for an external intervention variable, making it very useful in explaining the dynamics of the impact of the COVID-19 pandemic on tomato supply and price fluctuations in the market.
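The idea behind the DM comparison can be sketched as follows. This is a simplified version that assumes one-step-ahead forecasts and squared-error loss, so no autocovariance correction is applied; the exact variant used in the study is not stated. The function name and the error values in the example are made up.

```python
import math


def diebold_mariano(errors_a, errors_b):
    """DM statistic and two-sided p-value for two sets of forecast errors.

    Uses the loss differential d_t = e_a,t^2 - e_b,t^2 and the standard
    normal approximation. Fails if d_t is constant (zero variance).
    """
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    d_bar = sum(d) / n
    variance = sum((x - d_bar) ** 2 for x in d) / n
    stat = d_bar / math.sqrt(variance / n)
    # Two-sided p-value under N(0, 1): 2 * (1 - Phi(|stat|)).
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


# Made-up forecast errors: model A has the smaller errors throughout.
stat, p_value = diebold_mariano([1.0, 2.0, 1.0, 2.0], [2.0, 3.0, 2.0, 3.0])
print(stat, p_value)
```

A negative statistic means the first model has the smaller average loss. With a test set of only seven days, the normal approximation is rough, and small-sample corrections would normally be considered.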
Similar results were obtained in the study conducted by Ray and others [12], who found a positive impact of Bacillus thuringiensis technology on cotton yield. Another study reported that the decline in Chinese manufacturing during the reported period had a negative impact on stock market prices, as revealed by the ARIMA intervention model results [14]. A study conducted by Ismail and others [37] also reported similar results; they found a negative impact of bombing on the tourism industry using the ARIMA intervention model. Several studies [38-41] revealed that artificial-intelligence-based models, namely ANN and SVR, outperformed the classical ARIMA models in predicting time series data in agriculture and other fields. Harini and others [24] also showed that ANN-based intervention models performed better than classical models in clinical intervention prediction.

Conclusions
The present study was undertaken to investigate the impact of the COVID-19 lockdown on the supplies and prices of tomato in the Hyderabad market in Telangana state, India. The ARIMA intervention model confirmed a significant negative impact on the supply and a significant positive impact on the price of tomato in the examined market. The spillover effects from the pandemic to tomato supplies have been significant, with the increase in transportation costs resulting in higher tomato prices, which might also have contributed to low consumption levels. Additionally, the high prices of tomato combined with low supplies might have contributed to the increased vulnerability of producers and the food insecurity of consumers. Classical time series models such as ARIMA and ARIMA intervention models have been used to analyze the impact of policies or sudden changes in agricultural impact studies. These models fail to capture the nonlinear structure present in data sets. To overcome this problem, we have developed ANN- and SVR-based intervention models by incorporating the intervention variable as an exogenous variable in the input layer. Because the data considered were nonlinear in nature, the classical linear time series models, namely the ARIMA and ARIMA intervention models, were not able to capture the nonlinear and chaotic patterns in the data identified by the BDS test. Therefore, AI-based models were applied in this study to capture the nonlinear and complex nature of the data. The ANN intervention model outperformed all other models for modeling and predicting the supply and price series of tomato; thus, it can be used to study the nonlinear, complex nature of data under intervention in other time series. The AI-based intervention models developed in this study can be used to evaluate the potential effects of government policies and programmes. The model has a wider scope to study the impact of interventions in agriculture.
For example, it can be applied to study the impact of government subsidies on inputs in agriculture, price support to the producers, and the impact of pest and disease outbreak, to forecast the supply and price of agricultural products.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/agronomy11091878/s1, Table S1: actual vs. fitted values of supply and price time series by different models are given in the Excel file.