Unemployment and COVID-19 Impact in Greece: A Vector Autoregression (VAR) Data Analysis †

Abstract: This paper studies whether and how the COVID-19 pandemic affected the unemployment rate in Greece. To this end, a vector autoregression (VAR) model is employed and a data analysis is carried out. A further question is whether the pandemic affected female and youth (under 25 years old) unemployment more heavily than overall unemployment. To predict the future impact of COVID-19 on these variables, we use the impulse response function. Furthermore, the impact of the pandemic in Greece is compared with that in the other European countries for overall, female, and youth unemployment rates. Finally, the forecasting ability of the VAR model is compared with that of univariate ARIMA and ANN models.


Introduction
This paper examines the impact of COVID-19 on the unemployment rate in Greece by means of an econometric analysis. Whether, and to what extent, the pandemic will cause unemployment problems is a widely debated question, and this study is an attempt to answer it. Two further questions are whether female unemployment will be affected more than male unemployment and whether the pandemic affects the youth (under the age of 25) more heavily in terms of unemployment. The answers reveal whether these specific groups (i.e., women and young people) are more vulnerable in a pandemic, which is particularly important for the design of economic policies.
The country of interest in this study is Greece. It is therefore important to investigate the impact of COVID-19 on Greece in comparison with other countries, and for this purpose we consider the European Union of 27 countries (EU27). The analysis indicates how long the impact of COVID-19 on the country's unemployment rate should be expected to last, and the comparison with the EU27 allows a direct comparison of the impact over time. If the impact is similar, then the same policies are expected to be effective for both Greece and the EU27. If the impact of the pandemic on female and youth unemployment reveals increased vulnerability of these groups, economic measures should be directed more towards them in order to be more efficient, i.e., to reduce the effect of COVID-19 on the unemployment rate. A final question under examination is the forecasting ability of this approach (the VAR model) compared with other approaches: is it suitable both for measuring the impact (and perhaps for deciding the forecasting horizon) and for forecasting, or should some other approach be used for forecasting purposes? The core of this econometric analysis is the Vector Autoregression (VAR) model. The unemployment rate is available as monthly data and the COVID-19 cases as daily data. To create time series of equal length, we use interpolation for the unemployment series, while for the COVID-19 series we consider the number of new cases per five days. An essential feature of this model is the impulse response function, which allows us to observe the future impact of the pandemic per time unit (in this analysis, per five days).
There have already been some attempts to describe and explain the impact of COVID-19 on the dynamics of macroeconomic variables. Examples are ref. [1], in which the author studies the social and economic responses to the COVID-19 pandemic in a large sample of countries; ref. [2], in which the authors study the influence of the COVID-19 pandemic on unemployment in five selected European economies; and ref. [3], in which the author investigates the impact of globalization on the speed of initial transmission and on the scale of initial infections in a country. Further relevant studies share the use of VAR models. These include ref. [4], in which the author studies the impact of fear sentiment caused by the coronavirus pandemic on Bitcoin price dynamics using Google search queries; ref. [5], in which the author investigates the impact of COVID-19 on the stock market (specifically on Dow Jones and S&P 500 returns); ref. [6], in which the authors consider several indicators of economic uncertainty for the US and the UK before and during the COVID-19 situation and study the impact of the pandemic on these indicators; ref. [7], in which the authors study the assumptions needed for forecasting the evolution of the U.S. economy following the outbreak of COVID-19; and ref. [8], in which the author studies the effect of the virus outbreak on the economic output of New York state. There are also several papers on the impact of macroeconomic variables on unemployment using VAR models: ref. [9], in which the authors study the influence of Foreign Direct Investment on unemployment; ref. [10], in which the author analyzes the dynamic effects of different macroeconomic shocks on unemployment in Germany; ref. [11], in which the authors use Bayesian SVAR models to analyze the role of oil price movements in the evolution of unemployment in the UK; ref. [12], in which the author uses a Structural VAR (SVAR) approach to study the effects of shocks to Austrian unemployment; ref. [13], in which the authors review the main causes of Spanish unemployment using the structural VAR methodology; and ref. [14], in which the author uses a bivariate VAR model to describe output-unemployment dynamics. Studies related to forecasting are ref. [15], in which the authors use three time series methods to forecast the Swedish unemployment rate, and ref. [16], a recent attempt to forecast youth unemployment in Italy in the aftermath of COVID-19 using an artificial neural network (ANN) model.
The scope of this paper is to explore the impact of COVID-19 on unemployment in Greece and to compare it with that in the rest of the EU countries, overall, for females and for young people. This is done by fitting Vector Autoregression (VAR) models. Finally, we study the forecasting ability of such a model for the unemployment rate. Specifically, we attempt to answer the following question: is such a model a suitable approach for forecasting purposes, or is some other approach preferable? Important conclusions can be derived from such an analysis. The rest of the manuscript is organized as follows: in Section 2 we describe the VAR model, in Section 3 we discuss the impulse response function, in Section 4 we discuss the forecasting ability of the VAR model and the detection of a suitable forecasting approach, in Section 5 we perform the data analysis, and Section 6 contains the conclusions of the paper.

VAR Model
The Vector Autoregression model is a statistical model that describes the evolution of a multivariate linear time series with k endogenous variables. Each endogenous variable in the system is treated as a function not only of its own history but also of the lagged values of all the other endogenous variables. In essence, this model is a generalization of the univariate autoregressive (AR) model to multivariate time series. It is one of the simplest and most widely used models for multivariate time series forecasting.
The VAR model was introduced in [17], where the author explains the usefulness of VAR models and shows their use through applications. All variables in this approach are endogenous and are functions of the lagged values of all the considered variables. A brief description of the model follows.
The order of the model, i.e., the number of previous periods that the model uses, plays a crucial role in its characterization. For example, a VAR(3) model is a model in which each variable is a linear combination of the last three periods (lags) of all the variables of the system.
The general form of a VAR(p) model with k variables and p lags, in matrix terms, is
$$Y_t = A_1 Y_{t-1} + A_2 Y_{t-2} + \cdots + A_p Y_{t-p} + e_t,$$
where each $Y_i$ represents a vector of length k and each $A_i$ is a k × k matrix. The vector of residuals ($e$) has an expected value of zero and the error terms ($e_{i,t}$) are not autocorrelated. When these properties hold, the method of least squares (LS) yields consistent and efficient estimators. The interpretation of VAR models requires care. These models do not allow any inference about causality between the variables. Only Granger causality can be examined, i.e., whether one time series contributes to the prediction of another, a property that is much weaker than causality in the usual sense. On the other hand, VAR models do allow interpretations about the dynamic relationships between the variables of the system. Detailed information about VAR models can be found in [18].
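As an illustration of the VAR(p) form described in this section, the following minimal sketch computes a one-step-ahead forecast from given coefficient matrices. It assumes a model without intercept; the function names and all coefficient values are hypothetical and are not estimates from this study.

```python
# Illustrative sketch: one-step forecast from a VAR(p) with known coefficients.
# All numbers below are made up for demonstration; they are not estimates.

def mat_vec(a, v):
    """Multiply a k x k matrix (list of rows) by a length-k vector."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def var_one_step(coefs, history):
    """Return A_1 Y_{t-1} + ... + A_p Y_{t-p} for a VAR(p) without intercept.

    coefs   -- [A_1, ..., A_p], each a k x k matrix
    history -- observed vectors in time order; the last entry is Y_{t-1}
    """
    p = len(coefs)
    if len(history) < p:
        raise ValueError("need at least p past observations")
    k = len(history[-1])
    forecast = [0.0] * k
    for lag, a in enumerate(coefs, start=1):
        contribution = mat_vec(a, history[-lag])  # A_lag * Y_{t-lag}
        forecast = [f + c for f, c in zip(forecast, contribution)]
    return forecast

# Hypothetical VAR(2) with k = 2 variables,
# e.g. (log unemployment, log(new cases + 1)).
A1 = [[0.5, 0.1], [0.0, 0.6]]
A2 = [[0.2, 0.0], [0.1, 0.1]]
past = [[1.0, 2.0], [1.5, 2.5]]  # Y_{t-2}, Y_{t-1}
print(var_one_step([A1, A2], past))
```

Each variable's forecast mixes the lagged values of both variables, which is the feature that distinguishes the VAR from two separate univariate autoregressions.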

Impulse Response Function
Impulse response functions are frequently used in macroeconomic modeling to describe how the economy reacts over time to economic shocks, which are treated as exogenous impulses, and they are often used in the context of a VAR model. They describe the reaction of the endogenous macroeconomic variables to a shock (of one or more standard deviations) both at the time of the shock and at future points in time. In other words, their main purpose is to describe how a variable in the model evolves in reaction to a shock in another variable of the system, which makes them a very useful tool for policy makers assessing alternative economic policies.
The idea of the impulse response is to follow the adjustment of the endogenous variables over time after a hypothetical shock at time t and to compare this adjustment with the time series process without the shock, i.e., the actual process. The impulse response sequences plot this difference. The impulse response function is obtained from the moving average (MA) representation of a linear VAR model. The discrepancy between the expected value of the variable with and without the considered shock is the forecast error impulse response (FEIR) function. The FEIR function for the ith period after the shock is expressed as
$$\Phi_i = \sum_{j=1}^{i} \Phi_{i-j} A_j, \qquad i = 1, 2, \ldots,$$
where $\Phi_0 = I_k$ and $A_j = 0$ for $j > p$, k is the number of endogenous variables and p is the lag order of the VAR model.
In this work, orthogonal impulse responses are used. Orthogonalization allows us to shock one variable while the other impulses remain constant, i.e., it isolates the contemporaneous effect on a variable that arises solely from the impulse under consideration. The basic idea is that the variance-covariance matrix ($\Sigma$) of the residuals is decomposed (usually with a Cholesky decomposition) such that $\Sigma = PP'$, where $P$ is a lower triangular matrix with positive diagonal elements; the orthogonal impulse responses are then given by $\Phi_i P$.
Additional and detailed information about impulse response functions can be found in [18]. The identification of shocks for studying specific economic problems is discussed in [19]. Moreover, asymmetric impulse response functions, which separate the impact of a positive shock from that of a negative one, have been suggested in [20].
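The computation described in this section can be sketched as follows: the forecast error impulse responses are built recursively from the coefficient matrices (starting from the identity matrix and treating coefficient matrices beyond lag p as zero), and the orthogonal responses are obtained by multiplying them with the lower triangular Cholesky factor of the residual covariance matrix. This is a minimal pure-Python sketch; the function names and the numerical values are hypothetical, and in practice a statistical package would be used instead.

```python
import math

def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(k):
    return [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]

def feir(coefs, horizon):
    """Return [Phi_0, ..., Phi_horizon] with Phi_i = sum_{j=1}^{i} Phi_{i-j} A_j.

    coefs = [A_1, ..., A_p]; A_j is treated as zero for j > p.
    """
    k = len(coefs[0])
    phis = [identity(k)]
    for i in range(1, horizon + 1):
        acc = [[0.0] * k for _ in range(k)]
        for j in range(1, min(i, len(coefs)) + 1):
            acc = mat_add(acc, mat_mul(phis[i - j], coefs[j - 1]))
        phis.append(acc)
    return phis

def cholesky(sigma):
    """Lower triangular P with positive diagonal such that sigma = P P'."""
    k = len(sigma)
    p = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(p[i][t] * p[j][t] for t in range(j))
            if i == j:
                p[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                p[i][j] = (sigma[i][j] - s) / p[j][j]
    return p

def orthogonal_irf(coefs, sigma, horizon):
    """Orthogonal impulse responses Phi_i P for i = 0, ..., horizon."""
    p = cholesky(sigma)
    return [mat_mul(phi, p) for phi in feir(coefs, horizon)]

# Hypothetical bivariate VAR(1) and residual covariance matrix.
A = [[0.5, 0.1], [0.0, 0.6]]
SIGMA = [[4.0, 2.0], [2.0, 5.0]]
for step, theta in enumerate(orthogonal_irf([A], SIGMA, 3)):
    print(step, theta)
```

Because the Cholesky factor is lower triangular, the result depends on the ordering of the variables: the first variable can affect the second contemporaneously, but not the other way round.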

Forecasting Using the VAR Approach
The VAR model is certainly useful for studying the impact of COVID-19 cases on unemployment. A separate question is whether this approach is also useful for forecasting. The answer is not straightforward, because accurate forecasting is a different task from studying the impact of a factor. Can a VAR model be used effectively for both tasks, or would an alternative model be advantageous in terms of forecasting accuracy? This question is explored in this section.
The plain ARIMA model (i.e., using only past values of the series) is taken as the benchmark model for unemployment forecasting. The other forecasting approaches considered are the ARIMA model using COVID-19 cases as an external regressor, feed-forward Artificial Neural Networks (ANN), and feed-forward ANN using COVID-19 cases as an external input. In both cases with COVID-19 cases, the lags are taken from the corresponding VAR model. Two questions are explored here, with the aim of specifying a suitable model: (i) whether the insertion of COVID-19 cases can improve a forecasting approach and (ii) whether a machine learning approach (in our case ANN) can offer additional forecasting accuracy.
The details of our analysis are as follows. The training set consists of 60 observations and the test set of 30 observations. The models are first fitted using the first 60 observations and, at every subsequent step, they are refitted with all the data available up to that point. The forecasts are one-step-ahead, so each model finally produces 30 forecasts, which are compared with the actual values of the test set using the Root Mean Squared Percentage Error (RMSPE) and the Mean Absolute Percentage Error (MAPE), both multiplied by 1000 for easier direct comparison of the models. With N observations, $y_i$ the actual values of the time series and $\hat{y}_i$ the corresponding forecasts, the formulas for the RMSPE and MAPE are
$$\mathrm{RMSPE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\frac{y_i-\hat{y}_i}{y_i}\right)^2}, \qquad \mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{y_i-\hat{y}_i}{y_i}\right|.$$
A basic question, which we try to answer in this work, is whether the VAR model is a suitable approach for forecasting or whether it should be accompanied by another model to gain additional forecasting accuracy. In this study, the considered approaches are the ARIMA(1,0,0) model from the univariate time series domain and feed-forward Artificial Neural Networks (ANN) with one hidden layer and one to ten nodes in this layer (the model which gives the lowest RMSPE is retained). Within both frameworks, we consider the insertion of COVID-19 cases (as an external regressor in the ARIMA model and as a second input in the ANN model) and study whether such an insertion offers additional accuracy. Details about ARIMA models can be found in [21] and details about artificial neural networks in [22].
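The evaluation scheme described above (refitting on all data available so far, forecasting one step ahead, and scoring with RMSPE and MAPE) can be sketched as follows. As a simplifying assumption, an AR(1) model with intercept fitted by ordinary least squares stands in for the ARIMA(1,0,0) model; it is not the estimation routine used in the study, and the function names and the synthetic series are hypothetical.

```python
import math

def rmspe(actual, predicted):
    """Root mean squared percentage error (as a fraction, not scaled)."""
    n = len(actual)
    return math.sqrt(sum(((a - f) / a) ** 2 for a, f in zip(actual, predicted)) / n)

def mape(actual, predicted):
    """Mean absolute percentage error (as a fraction, not scaled)."""
    n = len(actual)
    return sum(abs((a - f) / a) for a, f in zip(actual, predicted)) / n

def fit_ar1(series):
    """OLS fit of y_t = c + phi * y_{t-1}; returns (c, phi).

    Stand-in for ARIMA(1,0,0); an assumption made for this sketch only.
    """
    x, y = series[:-1], series[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    phi = sxy / sxx if sxx > 0 else 0.0
    return my - phi * mx, phi

def expanding_window_forecasts(series, n_train):
    """One-step-ahead forecasts, refitting on all data seen before each step."""
    forecasts = []
    for t in range(n_train, len(series)):
        c, phi = fit_ar1(series[:t])          # refit with observations 0..t-1
        forecasts.append(c + phi * series[t - 1])
    return forecasts

# Synthetic series of 90 points: 60 for training, 30 for testing.
series = [16.0 + 0.5 * math.sin(t / 5.0) for t in range(90)]
preds = expanding_window_forecasts(series, 60)
actual = series[60:]
print(1000 * rmspe(actual, preds), 1000 * mape(actual, preds))
```

Both error measures divide by the actual value, so they are only meaningful for series that stay away from zero, as unemployment rates do.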

Data Analysis
The source of the unemployment data used in this study is Eurostat. The reported data describe monthly unemployment by sex and age: https://ec.europa.eu/eurostat/web/productsdatasets/-/une_rt_m (accessed on 6 March 2021). The COVID-19 data are freely available from the European Centre for Disease Prevention and Control and were downloaded from the https://ourworldindata.org/coronavirus site (download the complete database) (accessed on 6 March 2021). The analysis in this work is performed with the statistical software R. Specifically, the following packages are used: moments [23] for calculating skewness and kurtosis and for performing the Jarque-Bera test, vars [24][25][26] for VAR model estimation and prediction, forecast [27,28] for ARIMA models, AMORE [29] for the feed-forward, fully connected artificial neural network models and MLmetrics [30] for the computation of the RMSPE and MAPE metrics.

Data Overview
The COVID-19 data consist of new cases and cover the period from November 2019 until 31 January 2021. For both Greece and the EU27, the considered unemployment data run from November 2019 until January 2021. To obtain datasets of equal length, we take five days as one time point, so each month has six time points: day 5, day 10, day 15, day 20, day 25 and the end of the month (day 30 or 31, or day 29 for February 2020). For the COVID-19 data, we take the number of new cases at the end of each five-day period as a single point, while for the unemployment series we use constant interpolation to fill the gaps (until the start of November 2020 for Greece and until the start of January 2021 for the EU27). For the remaining time points until the end of January 2021, the unemployment series are filled using exponential smoothing models, fitted with the R package forecast. The mean value, the standard deviation (SD), the coefficient of variation (CV) and the Jarque-Bera test for normality (statistic, with p-value in parentheses) of the data are shown in Table 1. Greece displays higher unemployment than the EU27 countries in all categories (Total, Female and Under 25 years old, "Youth"). The most severe case, for both Greece and the EU27 countries, appears to be youth unemployment, because it is at a higher level according to the mean value and is more dispersed according to the SD. Normality is rejected in every case for Greece's unemployment, while for the EU27 unemployment series it cannot be rejected at the 0.01 level. To analyze the impact of COVID-19 new cases on unemployment, the data are transformed to natural logarithms (COVID-19 new cases are transformed to the natural log of (value + 1) because there are zeros in the sample). We observe mainly that Greek unemployment displays higher kurtosis than that of the EU27 for all types of unemployment, and that normality is rejected at the 0.01 level only for the Greek unemployment series.
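The alignment of the two series on a common five-day grid can be sketched as follows. This is only an illustration under stated assumptions: the daily new cases of a month are summed within each five-day block, with the last block running to the end of the month, and each monthly unemployment value is repeated for the six time points of its month. The exact aggregation rule of the study may differ, and all function names and numbers are hypothetical.

```python
import math

def five_day_points(daily_cases):
    """Aggregate one month of daily new cases into six time points.

    Blocks end on day 5, 10, 15, 20, 25 and the last day of the month
    (assumption: cases are summed within each block).
    """
    cuts = [0, 5, 10, 15, 20, 25, len(daily_cases)]
    return [sum(daily_cases[a:b]) for a, b in zip(cuts, cuts[1:])]

def constant_interpolation(monthly_values, points_per_month=6):
    """Repeat each monthly value for every five-day point of that month."""
    return [v for v in monthly_values for _ in range(points_per_month)]

def log_plus_one(values):
    """Natural log of (value + 1), so that zero counts stay defined."""
    return [math.log(v + 1) for v in values]

# Hypothetical inputs: two months of unemployment rates and one 31-day month
# of daily new cases.
unemployment = constant_interpolation([16.5, 16.2])
cases = five_day_points([0, 0, 1, 3, 2] + [4] * 26)
print([round(math.log(u), 4) for u in unemployment])
print(log_plus_one(cases))
```

The shift by one in the logarithm matters mainly for the early part of the sample, where no cases had yet been recorded.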

Overall Unemployment
The first aim of this paper is to explore the effects of COVID-19 on overall unemployment in Greece and to compare this impact with that in the EU27 countries. To achieve this, a VAR model with two variables, unemployment and COVID-19 new cases, is fitted for both Greece and the EU27. The lag order is chosen with the BIC criterion; otherwise, the model with the fewest lags that yields residuals without autocorrelation or heteroskedasticity is selected. The residuals of the models are checked for autocorrelation with the multivariate Portmanteau test, for heteroskedasticity with the ARCH-LM test, for normality with the Jarque-Bera test and for stationarity with the ADF test. Additionally, we inspect a CUSUM graph for evidence of a structural break. Table 2 displays the results of the fitting for these models. For both Greece and the EU27, no autocorrelation and no heteroskedasticity of the residuals can be assumed at the 5% level, and there is no graphical evidence of a structural break. Additionally, the residuals of the equation with the unemployment rate as the dependent variable can be assumed stationary at the 1% level. Finally, the normality of the residuals is rejected at almost every level of significance, which led to the use of the bootstrap both for the construction of confidence intervals for the impulse responses and for the calculation of the p-values of the Granger causality tests (computed from 10,000 bootstrap replicates). According to the Granger causality tests, only for the EU27 can it be assumed, at the 10% level, that COVID-19 cases Granger-cause unemployment.
To compare Greece directly with the EU27, we construct a table of the cumulative impulse responses of the unemployment rate for the next seven months. These results are shown in Table 3. The impact of COVID-19 cases is expected to raise unemployment more in the EU27 countries than in Greece. The pandemic can therefore be considered a weaker factor of deterioration in unemployment in Greece than in the EU27.
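A cumulative impulse response table of this kind can be derived from a sequence of per-step responses, as in the following minimal sketch. It assumes six five-day time points per month and that the sequence starts with the response at the time of the shock (step 0); the function name and the response values are hypothetical and do not reproduce the results of this study.

```python
from itertools import accumulate

def cumulative_by_month(responses, points_per_month=6, months=7):
    """Cumulative response at the end of each month.

    responses[h] is the response h steps after the shock (h = 0 is the
    impact step), so month m ends at step m * points_per_month.
    """
    needed = months * points_per_month + 1
    if len(responses) < needed:
        raise ValueError("need at least %d response steps" % needed)
    running = list(accumulate(responses))
    return [running[m * points_per_month] for m in range(1, months + 1)]

# Hypothetical responses of log unemployment to a shock in log new cases:
# a small effect that builds up and then levels off.
responses = [0.0] + [0.001 * min(h, 12) for h in range(1, 43)]
for month, value in enumerate(cumulative_by_month(responses), start=1):
    print(month, round(value, 4))
```

If the cumulative values are still increasing at the last month, the effect of the shock has not yet died out within the seven-month horizon.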

Female and Youth Unemployment
An additional aim is to explore the effect of COVID-19 on female and youth unemployment in Greece and the EU27. Again, VAR models are fitted for COVID-19 cases and the unemployment of these specific groups. For female unemployment, Table 4 displays the results of the fitting and Table 5 the cumulative impulse responses for Greece and the EU27. For youth unemployment, the results of the fitting are shown in Table 6 and the cumulative impulse responses for Greece and the EU27 in Table 7. As before, the lag order is chosen with the BIC criterion; otherwise, the model with the fewest lags that yields residuals without autocorrelation or heteroskedasticity is selected. The residuals of the models are checked for autocorrelation with the multivariate Portmanteau test, for heteroskedasticity with the ARCH-LM test, for normality with the Jarque-Bera test and for stationarity with the ADF test, and a CUSUM graph is inspected for evidence of a structural break. For female unemployment, for both Greece and the EU27, no autocorrelation (for Greece, the absence of autocorrelation holds at the 1% level) and no heteroskedasticity of the residuals can be assumed at the 5% level, there is no graphical evidence of a structural break, and the residuals of the equation with the unemployment rate as the dependent variable can be assumed stationary at the 0.05 level. However, the normality of the residuals is rejected at almost any level of statistical significance. The same applies to youth unemployment. The rejection of normality of the residuals again leads to the use of the bootstrap for the construction of confidence intervals for the impulse responses.
To compare Greece directly with the EU27, we construct a table of the cumulative impulse responses of the female unemployment rate for the next seven months. These results are shown in Table 5. The impact of COVID-19 on both Greece and the EU27 is not observable in the first month, but by the end of the seventh month the effect is clear, and female unemployment is expected to rise more in the EU27 countries than in Greece.
The results for youth unemployment follow. Table 6 displays the fitted VAR model. To compare Greece directly with the EU27, we again construct a table of the cumulative impulse responses of the unemployment rate for the next seven months. These results are shown in Table 7. To sum up, Table 8 summarizes the cumulative impact of COVID-19 cases for both Greece and the EU27. All categories of unemployment are expected to rise because of the pandemic. By type of unemployment, young people are expected to experience a higher increase in unemployment, while females are expected to be affected less than the overall population in the EU27 countries and more heavily than the other categories in Greece. By country, Greece is expected to be affected less than the EU27 countries for all types of unemployment. The reasons are probably structural; we point out that unemployment rates in Greece are already at a higher level than in the EU27 countries. This leads to two main remarks. First, unemployment is expected to rise in all cases due to the COVID-19 situation, and the average EU27 country is expected to be affected more than Greece in terms of unemployment rates. This may be a sign that it is more urgent for Greece to solve structural problems, while for the average EU27 country it seems more urgent to take measures to protect its economy from the pandemic. Second, female unemployment in Greece and youth unemployment in the EU27 countries are expected to be affected more heavily by COVID-19, which indicates that policies should have a different focus in each case in order to alleviate the consequences.

Forecasting the Unemployment Rates
This final analysis addresses the question of whether the VAR model is the better choice for forecasting unemployment or whether other approaches should be considered because they achieve more accurate results. The results are displayed in Table 9, which reports the RMSPE values and, in parentheses, the MAPE values (both multiplied by 1000). To assess the suitability of the VAR model, we use the following alternative forecasting approaches: the plain ARIMA model (as benchmark), the ARIMA model with COVID-19 cases as an external regressor, a feed-forward Artificial Neural Network (ANN) based solely on previous values of unemployment, and the same network with observations of COVID-19 cases as an additional input. The models are compared in terms of RMSPE and MAPE. Sixty (60) observations are used for training the models and thirty (30) for testing their forecasting ability. The main conclusions from Table 9 are as follows. First, the VAR model is not the best forecasting approach for either the EU27 or Greece, in any of the considered categories of unemployment. Next, the ANN approach performs worse than the VAR and ARIMA models. Finally, within the ARIMA framework, the insertion of COVID-19 cases improves the forecasts only for the EU27 countries and not for Greece (as expected from the Granger causality results).

Conclusions
In this work, we constructed and fitted Vector Autoregressive (VAR) models with the aim of exploring the impact of COVID-19 cases on Greece's overall unemployment and on two more sensitive cases, i.e., female and youth unemployment. There is evidence that COVID-19 cases Granger-cause the overall unemployment rate only for the EU27 countries (non-causality is rejected at the 0.1 level). For all unemployment types (overall, female and youth), a shock in COVID-19 cases is expected to have a lower impact in Greece than in the EU27 countries. However, the impact does not appear to stop after seven months for any type of unemployment. Furthermore, the forecasting ability of the VAR model is found to be limited, and univariate approaches appear preferable. A suggested strategy is therefore to use the VAR model to investigate the impact of shocks and to accompany it with an ARIMA model for forecasting purposes.
Data Availability Statement: An overview of the data and their sources is given in Section 5.1.

Conflicts of Interest:
The author declares no conflict of interest.