A Quantitative Model Supporting Socially Responsible Public Investment Decisions for Sustainable Tourism †

Abstract: The purpose of this article is to develop a quantitative model that supports policy makers in the tourism sector in making socially responsible investment decisions. In particular, this paper proposes a methodological approach to assess the impact of strategic decisions at the policy level, in the field of tourism, from an economic, environmental and social point of view. The Calabria region, in Italy, has been chosen as a real-world case study. Based on historical data, the study identifies the main levers that influence tourism-related dynamics in Calabria. A quantitative forecasting model to support future investment decisions for sustainable tourism has then been developed. This problem is modeled through a multi-criteria optimization framework. To initialize such a framework, a non-linear autoregressive network with exogenous inputs (NARX) has been used. The proposed model is a flexible instrument to evaluate public investment policies in the field of tourism from the point of view of sustainability and social responsibility.


Introduction
Tourism is a sector with a strong economic, environmental and social impact. National and regional governments which invest in tourism need to consider the problem of making socially responsible decisions. The primary objective of this study is to support policy makers in the field of tourism in evaluating their investment decisions from the point of view of sustainability and social responsibility. The goal of our analysis is not merely to provide guidelines to increase tourism flows, since this strategy could lead to various drawbacks, e.g., increased pollution (Mehmood et al. 2016), a higher number of crimes (Stange et al. 2011) or overpopulation (Saaty 1990). More generally, this work aims at supporting socially responsible strategic investment decisions taking into account drivers related to socio-environmental aspects. To achieve this goal we examine the impact of tourism on the quality of life of a community. This impact is measured through different key performance indicators (KPIs) related to economic, environmental and social factors (Wu et al. 2018). The dynamic nature of tourist channel competition requires destinations to be able to combine and manage their tourist resources in order to gain competitive advantages (Teece et al. 1997; Cracolici and Nijkamp 2009). A predictive model can be used to understand how these resources affect the system and to plan future strategic decisions accordingly (Shah et al. 2019; Zhao et al. 2019).

Different studies have been carried out on the impact of tourism on the surrounding context. An interesting review can be found in Van der Zee and Vanneste (2015). Many studies propose numerical and statistical models (Plog 1974; Miossec 1977; Thurot 1980; Butler 1980; Butler 1991) analyzing the economic impact of tourism flows on a specific territory and in a well-defined time frame. Sustainability has become one of the main dimensions in tourism research, as highlighted by the UN 2030 Agenda for Sustainable Development (Hall 2019). Scholars stress the importance of studies and tools that focus on tourism sustainability and have an impact on decision making and individual behavior (Keyvanfar et al. 2018; Font et al. 2019). After an extensive review of the literature on sustainable tourism, Ruhanen et al. (2019) underline the importance of modeling decision processes in the field. In general, there is a lack of practical procedures supporting policy makers in making decisions oriented to sustainable tourism (Lakner et al. 2018; Guo et al. 2019).
Consequently, the aim of this work is to develop a model that can be used as a sustainability-oriented decision support system. The specificity of our study, however, is that it proposes a framework that can be adapted to a territory, taking into account the specific attributes particularly important for the considered case study.
The data used to test this approach relate to Calabria, an Italian region with a modest level of industrial development, which has attracted growing tourist flows in recent years. Tourism is a strategic economic sector for this area, but the growing number of tourists poses challenges from a social and environmental point of view. Calabria is a region characterized by small organizations, where diffuse, spontaneous initiatives are the main mode of destination promotion in the tourism sector. In the past, public investment decisions aimed exclusively at increasing the number of tourists have created social and environmental damage.
The methodology illustrated in this paper is derived from Zhang (2016), who first proposed an analytic network process within a goal-programming model based on nine decision variables and used it to weigh tourism development goals. The model has been tested on data from Tibet, one of the most popular Chinese tourism destinations, which is of great ecological, cultural and economic value. Although our methodology is similar, we made use of different decision variables and objective criteria in order to consider the different peculiarities of the analyzed regions. Moreover, in the first stage of the analysis, in which a set of relationships is developed through a non-linear autoregressive network (NARX), we consider the possibility that the impact of some decision variables on an objective response could be achieved only after a certain amount of time (years). This is particularly relevant, since it might be possible that public funding used for the activation of multi-year projects generates an immediate social value due to the newly created job positions and a delayed economic return due to the project outputs.

The Predictive Model
With the aim of defining a decision support system for tourism management, the following steps were performed. First, the boundaries of the analyzed domain were determined. Second, after a thorough evaluation of the state of the art and through collaboration with various stakeholders in the tourism field, we identified the main objectives to pursue and the variables to investigate in order to influence those objectives. Third, a NARX network was used to formalize mathematically the required input-output relationships. Each step of the procedure will be described in Section 3.
In this section, we describe how we developed a predictive model from a time series of data related to Calabria and how the model can be used to make future decisions. There are few examples of attempts to forecast tourist flows based on regional data. Most focus on the factors affecting the choice of tourist destinations. These studies are mainly based on qualitative methods and focus on short time periods, generally one or a few years. In Calabria, the tourist flow is steadily growing.

Domain Definition
The development of a decision support system requires the correct definition of the factors that partially or totally influence the set of key performance indicators which have a direct impact on the objectives that the decision maker aims to optimize. The selection of indicators was carried out according to the indications reported by the European Tourism Indicator System (European Commission (EC) (2014)). The indicators have been divided into three overall categories: final goals, intermediate results and variables. "Final goals" are the three dimensions of sustainability: economic, social and environmental. The final goals cannot be directly measured. "Intermediate results" are intermediate objectives that can be directly measured and relate to the three dimensions of sustainability. Intermediate results have been linked to final goals through a procedure detailed in the following section. "Variables" are cause variables which have an impact on intermediate results and, through intermediate results, on the final goals. Some variables are under the control of the decision maker. They are the specific focus of our model. Others, such as population and GDP, are contingent factors that must be taken into account but cannot be directly determined. See Figure 1 for the details. We collected all the historical data available for Calabria to get a clear picture of the impact of tourism on the economic, social and environmental goals. Then, in order to collect missing data, we consulted public databases of national institutions such as: Istat, Mibact, Sistan, Ispra, ACI, ANPA, Ateco, UNPLI, Arpacal, Eurostat, Banca D'Italia (the expanded names of the institutions are reported in the caption of the following table). Collaboration with the Regional Department of National and European Planning Funds, which we thank, allowed us to access the time series of European and Italian funds in the field of regional tourism.
Other potentially relevant indicators were excluded from our analysis because their effect can be considered to a large extent included in the chosen indicators. For example, the effect of competition from alternative destinations is implicit in the number of tourists or in the average stay. The effect of macroeconomic factors such as the financial crisis is included in the GDP factor. This article proposes a procedure that can be applied using different indicators according to the needs of the subject conducting the analysis. In this first application we used indicators for which, without prejudice regarding their relevance, data were available. In future applications it will be possible to purposefully collect specific data according to refined indicators.
The data were grouped into "thematic" tables depending on the field of interest and the direct influence on the territory, subsequently selected as "variables" or "levers" as follows (Tables 1-5).

| ID | Indicators | Description | Source |
|---|---|---|---|
| x1 | Public investment in tourism | Sum of all European and national funds (including the annual fee paid by the region) spread over the territory to promote cultural and touristic activities. | (1), (11), (13) |
| x2 | GDP Calabria | Gross domestic product per capita in Calabria. | (1), (11) |
| x3 | Population | Total number of residents in the region. | (1), (12) |
| x4 | Number of sleeping accommodations in Calabria | Total number of beds on offer in the regional territory during the year. | (1), (10), (13) |
| x5 | Vehicles | Total number of cars in circulation in the regional territory. | (5) |
| x6 | Associations promoting local tourism | Total number of "pro loco" registered in the territory. | (8) |
| x7 | Tourism sciences graduates | Total number of students graduated from all the tourism schools in Calabria over the year. | (13) |

(1) Istat: Italian National Institute of Statistics; (2)

Average stay of tourists: average number of days of tourist stay in the regional territory in one year.

Time Delay Neural Networks
Accurate forecasts are crucial because of the unique nature of the tourism industry (Frechtling 2012). Forecasting can be performed through qualitative and quantitative approaches (Walle 1997). The former depend on substantial information and human experience gained through the years. Walle (1997) criticized these techniques for their lack of generalizability. The latter make use of mathematical functions to define the relationships of certain phenomena using numerical data. These models are then used to estimate future values based on past performance. The construction of an approximation model as a neural network has become a standard technique for many applications (Specht 1991). However, classical structures and training algorithms still present some shortcomings and become unsatisfactory as the complexity or the dimension of the problem increases. An alternative neural network architecture, NARX, is designed to model the non-linearity of the analyzed problem. NARX is a recurrent neural network scheme with a specific global feedback, used for time series forecasting. It is particularly appropriate for dynamic system applications as it gives a fast and accurate training response. It describes the system output as a function of some input parameters, their past values up to a certain fixed time delay, and its own past values.
$$y_j(t) = f\big(y_j(t-1), \ldots, y_j(t-n_y),\; x_i(t), x_i(t-1), \ldots, x_i(t-n_{u_i})\big) \qquad (1)$$

In Equation (1), $x_i(t)$ and $y_j(t)$ represent, respectively, the input and output variables at time step $t$, $n_{u_i}$ and $n_y$ are the input and output time delays, while $f$ is a non-linear mapping function. The non-linear mapping $f$ in (1) is generally unknown but can be approximated using a standard multilayer perceptron network with a combination of different activation functions (e.g., ReLU, sigmoid, linear). For illustration purposes, the architecture of the NARX network is shown in Figure 2. The first layer of the network is composed of the input information used to predict the value of $y_j(t)$. For the particular application, the layer contains a subset of the decision variables of tourism development goals reported in Table 1 and previously recorded values of $y_j$. The subset of relevant information can be detected by using a trial and error approach or through statistical techniques (e.g., Analysis of Variance). The values stored in the input layer are then manipulated through a structure of weighted connections and hidden units. An example of a network composed of three layers is shown in Figure 2. The value of a generic hidden unit $H_k$ in the second layer is given by the following equation:

$$H_k = \sigma\Big(w_k + \sum_i w_{ik}\, u_i\Big) \qquad (2)$$

where $u_i$ is the value of the $i$-th unit of the input layer, $\sigma$ is a sigmoid function and $w_k$ and $w_{ik}$ are weight coefficients. Finally, in a similar way, the predicted output value $y_j(t)$ is modeled in Equation (3) as a weighted combination of the hidden-unit values $H_k$. It is worth noting that Equations (2) and (3) apply to a three-layer network. If the NARX architecture has more than one hidden layer, additional intermediate transformations are required.
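The input-output relationship described above can be illustrated with a minimal sketch of a three-layer NARX forward pass. This is not the authors' implementation: the function names (`build_regressor`, `narx_predict`), the linear output unit and all weight values are assumptions made for illustration only.

```python
import math


def sigmoid(z):
    """Logistic activation used for the hidden units."""
    return 1.0 / (1.0 + math.exp(-z))


def build_regressor(y_hist, x_hist, n_y, n_u):
    """Assemble the input layer for predicting y(t).

    y_hist: past outputs [y(0), ..., y(t-1)]
    x_hist: exogenous inputs [x(0), ..., x(t)], one list of values per time step
    n_y, n_u: output and input time delays (hypothetical names)
    """
    lagged_y = [y_hist[-d] for d in range(1, n_y + 1)]        # y(t-1) ... y(t-n_y)
    lagged_x = [v for d in range(n_u + 1) for v in x_hist[-1 - d]]  # x(t) ... x(t-n_u)
    return lagged_y + lagged_x


def narx_predict(u, hidden_bias, hidden_w, out_bias, out_w):
    """One hidden layer of sigmoid units followed by a linear output (assumed)."""
    hidden = [sigmoid(b + sum(w * ui for w, ui in zip(ws, u)))
              for b, ws in zip(hidden_bias, hidden_w)]
    return out_bias + sum(w * h for w, h in zip(out_w, hidden))


# Toy usage with made-up numbers: two output lags, one input lag, one input variable.
u = build_regressor([1.0, 2.0, 3.0], [[0.1], [0.2], [0.3], [0.4]], n_y=2, n_u=1)
y_next = narx_predict(u, [0.0, 0.0], [[0.0] * 4, [0.0] * 4], 1.0, [2.0, 2.0])
```

With all hidden weights set to zero, each hidden unit evaluates to 0.5, so the toy prediction reduces to the output bias plus half the sum of the output weights. In practice the weights would come from training, and the lags from the feature-selection step.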
Backpropagation through time (BPTT) is the most widely used algorithm for training recurrent neural networks. The aim of the BPTT algorithm is to minimize the error of the network outputs. The general algorithm is:
1. fix an initial set of weights;
2. present the input data and propagate it through the network to get the estimated output;
3. compare the predicted output to the expected output and calculate the error;
4. calculate the derivatives of the error with respect to the network weights and adjust the weights so that the error is minimized.
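The four steps can be sketched as a small gradient-descent loop. To keep the example short, it trains a one-hidden-layer network in open-loop mode, where observed past outputs are fed in as ordinary inputs, so the gradients are those of standard backpropagation rather than full BPTT. The squared-error loss, the learning rate, the network size and the toy data are all assumptions.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, n_hidden=3, lr=0.1, epochs=500, seed=0):
    """samples: list of (u, y) pairs; u = input-layer values, y = observed output.

    Returns the summed squared error recorded at each epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Step 1: fix an initial set of weights (small random values).
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b_hid = [0.0] * n_hidden
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_out = 0.0
    history = []
    for _ in range(epochs):
        total = 0.0
        for u, y in samples:
            # Step 2: propagate the inputs through the network.
            h = [sigmoid(b + sum(w * ui for w, ui in zip(ws, u)))
                 for b, ws in zip(b_hid, w_hid)]
            y_hat = b_out + sum(w * hk for w, hk in zip(w_out, h))
            # Step 3: compare the prediction with the expected output.
            err = y_hat - y
            total += 0.5 * err ** 2
            # Step 4: derivatives of the error w.r.t. each weight, then update.
            for k in range(n_hidden):
                grad_h = err * w_out[k] * h[k] * (1.0 - h[k])  # uses the pre-update w_out
                w_out[k] -= lr * err * h[k]
                b_hid[k] -= lr * grad_h
                for i in range(n_in):
                    w_hid[k][i] -= lr * grad_h * u[i]
            b_out -= lr * err
        history.append(total)
    return history


# Toy data (not from the case study): y = 2 + u1 + u2.
toy = [([0.0, 0.0], 2.0), ([0.0, 1.0], 3.0), ([1.0, 0.0], 3.0), ([1.0, 1.0], 4.0)]
error_per_epoch = train(toy)
```

In a closed-loop setting, where the network's own predictions are fed back as inputs, the derivative in step 4 would also have to be propagated backward through the earlier time steps, which is what distinguishes BPTT from this simplified loop.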
The error $E$ to be minimized is computed from the differences between the expected output $y(t)$ and the estimated output $\hat{y}(t)$ at each time $t$. The weights are iteratively updated by computing the error propagation term, proceeding backward through $t = T, \ldots, 1$ for each time $t$ and unit activation $x_i(t)$, $y(t)$. The performance of the NARX network in terms of complexity and accuracy is largely dependent on internal components, such as the number of hidden neurons and the activation functions, and on training algorithm parameters, such as the learning rate and the momentum. The selection of adequate values for these parameters is still debated, even though several approaches have been proposed in recent years.
In this work, a procedure based on the genetic algorithm illustrated in Ciancio et al. (2016) has been used to determine a suitable network architecture. The first step of this method is to encode the features of the neural network into specific chromosomes. A chromosome is a sequence of bits with value 0 or 1. During its execution, the genetic algorithm evolves the solution according to the following basic pattern: (1) random generation of the first population of solutions; (2) application of a fitness function to the solutions belonging to the current population; (3) selection of the best solutions based on the value of the fitness function; (4) generation of new solutions using crossover and mutation; (5) repetition of steps 2, 3 and 4 for k iterations; (6) selection of the best solution found.
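The six-step pattern can be sketched as follows. This is an illustrative outline, not the procedure of Ciancio et al. (2016): the chromosome layout in `decode`, the truncation selection, the one-point crossover and the default parameters are all assumptions, and in a real application the fitness function would train a NARX network and return, for example, its negative validation error.

```python
import random


def decode(chromosome):
    """Hypothetical encoding: 4 bits for the hidden-neuron count, 1 bit for the activation."""
    hidden = 1 + int("".join(map(str, chromosome[:4])), 2)      # 1..16 neurons
    activation = "sigmoid" if chromosome[4] == 0 else "relu"
    return hidden, activation


def genetic_search(fitness, n_bits, pop_size=20, k=30, p_mut=0.05, seed=0):
    rng = random.Random(seed)
    # (1) random generation of the first population of bit strings
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(k):                                          # (5) repeat for k iterations
        # (2)-(3) evaluate the fitness and keep the better half as parents
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        # (4) new solutions through one-point crossover and bit-flip mutation
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_bits - 1)
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = parents + children
        best = max(pop + [best], key=fitness)
    return best                                                 # (6) best solution found


# Toy fitness standing in for network accuracy: prefer architectures near 6 hidden neurons.
best = genetic_search(lambda c: -abs(decode(c)[0] - 6), n_bits=5)
hidden_neurons, activation = decode(best)
```

Because the parents are carried over unchanged, the best fitness never decreases between iterations. Each fitness evaluation would require training a network, so the population size and k drive the total cost.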
One of the disadvantages of BPTT is that, when the number of time steps increases, the computational cost also increases and the overall model becomes noisy. The high cost of single parameter updates makes BPTT impractical for a large number of iterations. For this reason, it is important to consider only the relevant input parameters and time delays. The ANOVA technique was used to determine which decision variables significantly affect the selected criteria (Rajput et al. 2011). A p-value threshold of 0.05 was used to determine the relevant features. The 26 annual observations for the period 1990-2015 were split into two sets. The first set, 1990-2013, was used to train and validate the regression model, while the last two observations (2014 and 2015) were used for testing purposes. Table 3 reports the accuracy of the estimated input-output relationships both in absolute and percentage terms.
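The chronological split and the two accuracy measures can be sketched as below. The helper names are hypothetical and the numbers are placeholders, not data from the study.

```python
def chronological_split(years, values, last_train_year):
    """Split a yearly series without shuffling, so the test years follow the training years."""
    pairs = list(zip(years, values))
    train = [(y, v) for y, v in pairs if y <= last_train_year]
    test = [(y, v) for y, v in pairs if y > last_train_year]
    return train, test


def errors(actual, predicted):
    """Absolute and percentage errors for each held-out observation."""
    abs_err = [abs(a - p) for a, p in zip(actual, predicted)]
    pct_err = [100.0 * e / abs(a) for e, a in zip(abs_err, actual)]
    return abs_err, pct_err


years = list(range(1990, 2016))                      # 26 annual observations
series = [float(i) for i in range(len(years))]       # placeholder values only
train, test = chronological_split(years, series, last_train_year=2013)

# Placeholder observed and predicted KPI values for two held-out years.
abs_err, pct_err = errors([100.0, 200.0], [90.0, 210.0])
```

Keeping the split chronological avoids using information from later years when fitting the model, which matters for a series with a steady trend.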
In order to test the prediction accuracy, we compared the real KPI values measured in 2014 and 2015 with the predicted responses obtained by imposing the strategic decisions adopted in those two years as input values. The results are reported in Tables 6 and 7.

Conclusions
The proposed methodology allowed us to create a tool that supports policy makers in future investment decisions in the tourism sector. The methodology was built to evaluate investments from the point of view of sustainability and social responsibility. Compared to existing models, the mathematical formulation is able to analyze different investment plans through a sensitivity analysis that emphasizes socially responsible factors. Predictions on the future value of investments from the economic, environmental and social points of view can be formulated using our method and used to evaluate investment decisions.
The equations of the model are estimated based on historical data using a NARX network. At the same time the NARX helps us to have a clear picture of how to influence the critical objective variables with a small sample of data. The methodology will help decision makers to choose from a set of previously identified strategic plans.
The variables chosen in this paper were considered relevant for the specific case study (i.e., Calabria). Other decision makers in other territories could choose a different set of variables. In particular, in this study only publicly available data have been used. Future applications could use purposefully retrieved data and, consequently, develop finer indicators. For example, the number of circulating vehicles specifically related to tourism could be calculated and included in the model instead of the number of vehicles in general. The use of this approach can support the balanced and socially responsible growth of the tourism industry in regions which aim at increasing their wealth without compromising their environmental and social goals.
Funding: This research has not received external or public funding.