Evaluation and Interpretation of Tourist Satisfaction for Local Korean Festivals Using Explainable AI

Abstract: In this paper, we propose using explainable artificial intelligence (XAI) techniques to predict and interpret the effects of local festival components on tourist satisfaction. We use data-driven analytics, including prediction, interpretation, and utilization phases, to help festivals establish a tourism strategy. Ultimately, this study aims to identify the most significant variables in local tourism strategy and to predict tourist satisfaction. To do so, we conducted an experimental study to compare the prediction accuracy of representative predictive algorithms. We then built a surrogate model based on a game theory-based framework, known as SHapley Additive exPlanations (SHAP), to understand the prediction results and to obtain insight into how tourist satisfaction with local festivals can be improved. Tourist data were collected from local festivals in South Korea over a period of 12 years. We conclude that the proposed predictive and interpretable strategy can identify the strengths and weaknesses of each local festival, allowing festival planners and administrators to enhance their tourist satisfaction rates by addressing the identified weaknesses.


Introduction
The existence of local festivals is closely related to regional suburban development outside of city centers [1]. Festival tourism, which utilizes local tourism resources to attract visitors to local festivals, is an important means of promoting the tourism industry and has become one of the most critical sources of tourism in many countries [2][3][4][5]. Local festivals also contribute to population growth, as they promote the establishment or supplementation of local infrastructure [6]. Furthermore, local festivals not only create jobs by revitalizing the local economy but also help regions preserve their history and culture and nurture their local identities [7,8].
Existing tourism research has focused on the planning of local festivals to improve tourist satisfaction, which, it argues, can be done through the provision of cultural experiences, food and drink, and other forms of entertainment [9,10]. A considerable amount of research has determined how towns and cities can attract more tourists through local festivals [11].
Many studies have focused on determining how festival tourist satisfaction levels can be improved. The key factors suggested by previous studies are summarized in Table 1. Kim et al. [12] identified significant positive and negative factors affecting local food festivals and clarified the relationship between tourist satisfaction and loyalty. Velikova et al. [13] presented case studies of tourist satisfaction by focusing on local wine festivals and found that certain product and service attributes cause greater satisfaction. Some studies have considered tourist travel motivations as the most dominant factor affecting satisfaction [14][15][16][17], as travel motivations play a significant role in encouraging travelers to research the offerings of local festivals and plan their travel routes in advance.
It is difficult to establish a universal strategy to make local festivals attractive because there are too many decision-making factors [19]. Moreover, it is impossible to consider regional diversity for all suburban regions [20]. Regarding tourist satisfaction, various studies have examined the effect of travel recommendations [21][22][23][24][25][26] and have provided content-based or user-based travel recommender systems to match and recommend appropriate routes or travel packages to travelers. These studies stated that travel recommender systems are helpful for eliciting positive customer feedback.
Several existing studies have focused their research into customer satisfaction by concentrating on holidays such as Christmas, Easter, or Ascension days [27,28]. To determine how to increase holiday tourism in specific sightseeing cities, such as Merano or Bethlehem, they conducted a quantitative experiment using an ad hoc survey. However, these studies examine only a few days over the course of one year, which makes it difficult to use their results to establish a macro-level strategy for local festivals more generally.
More recently, various data-driven studies have addressed the limitations of the existing research into the activation of local festivals. In particular, machine learning methods have become an increasingly popular means of obtaining better solutions to the problem of tourist satisfaction because they reflect realistic uncertainties. Such methods have widely been used to identify event, tourist, and market segmentations and have outperformed other methods in their ability to resolve real-world problems [29][30][31]. Nevertheless, these studies focus only on the use of marketing strategies to persuade specific tourist or customer groups. Therefore, to revitalize local festivals, it is necessary to recognize their strengths and weaknesses to identify the improvements that will promote continuous growth.
As for research gaps, our study differs from previous research on local festivals in the following ways: First, existing studies have not provided sufficient data on the activation of local festivals, nor have they provided accurate predictions for attendee satisfaction. Specifically, no study has used artificial intelligence models to consider the variables that affect the satisfaction of tourists toward local festivals. Therefore, an interpretation model is needed that distinguishes between positive and negative evaluation variables for each local festival, and an accurate prediction model is needed because poor prediction performance leads to incorrect interpretations.
Sustainability 2021, 13, 10901
To resolve this, we propose a data-driven approach using artificial intelligence techniques to accurately predict tourist satisfaction and to identify the variables affecting tourist satisfaction for local festivals. Over a period of 12 years, using 20 survey questions, we gathered data from tourists who attended local festivals. The tourist satisfaction score was set as the dependent variable, with all other variables set as independent variables. Figure 1 illustrates the novel framework used in this study. The seven variables presented in the dotted box are considered to be critical components that determine the success of a local festival. We used these independent variables to predict the dependent variable, tourist satisfaction. As individual indicators for tourist satisfaction, the following variables were investigated: shopping opportunities, the festival program, food, advanced publicity, travel guide, transport accessibility, and cultural content. The interpretability of AI has become increasingly important, both for improving the accuracy of AI and for enabling humans and AI to work jointly. As such, interpretable AI techniques play an important role in the analysis of local festivals in our study.
When interpreting the significant effects between the independent and dependent variables, merely identifying the variables that are significant for local festivals in general does not help us recognize the shortcomings of individual festivals. To resolve this problem, we propose the use of SHapley Additive exPlanations (SHAP), a game theory-based framework, to identify the significant variables affecting each local festival sample. By understanding the weaknesses and strengths of an individual local festival, it is possible to identify which of the festival's marketing strategies should be strengthened and which should be modified.
The main contributions of this study are as follows: First, we propose an AI-based tourism evaluation approach for local festivals. The proposed AI-based approach aims to indicate which festivals receive good results in terms of tourist satisfaction. We provide thorough explanations of the overall data-driven procedure, detailing the data preprocessing methods, including the handling of missing values, normalization issues, and the learning framework.
Second, to obtain an accurate prediction model for tourist satisfaction, representative regression models were used to determine the best predictive model for the given data set. A comparative evaluation revealed that the light gradient boosting model outperformed the other models in terms of prediction accuracy. To the best of our knowledge, research into tourist satisfaction has yet to use explainable AI (XAI) to interpret results for individual local festivals.
Third, the experimental results were derived from a local festival data set gathered over 12 years. The experiments confirm that the conventional approach, which is operated by human experts, can be improved through the incorporation of AI-based approaches. Some of the interpretations are contrary to the understanding of local festival agencies and administrators.
The remainder of this paper is structured as follows: Section 2 shows the overall analysis procedure, including data preprocessing and the results of the exploratory data analysis. Sections 3 and 4 present the theoretical descriptions and experimental results for the model predictions and explanations, respectively. The final section presents the concluding remarks.

Overview
This section outlines the proposed approach, as shown in Figure 2. First, we collected the data representing tourist satisfaction and specific evaluation metrics for local festivals in South Korea. The data sets were first validated for missing values and distribution shape. In particular, the data set was transformed using quantile transformation, which is a robust preprocessing scheme used to reduce the impact of outliers [32]. We then conducted log transformations to achieve better prediction accuracy. Based on the preprocessed data set, the prediction models were built to regress tourist satisfaction. We used more than 10 representative machine learning algorithms for the regression task and then adopted an XAI technique, SHAP, to decompose the prediction results. Based on the decomposed SHAP value for each variable, we then indicated that variable's importance for the predictions.

Data Description and Transformation
We gathered survey data from tourists at local festivals in South Korea over a span of 12 years. The Korean government has conducted surveys on local festivals since 1995 in order to improve their attraction for foreign tourists. The Ministry of Culture and Tourism selected the superior local festivals and provided financial support (http://www.mcst.go.kr/kor/s_notice/press/pressView.jsp?pSeq=17724, accessed on 11 August 2021).
There were 476 observations in total. After removing insignificant variables by adopting qualitative and quantitative methods, the final data set included the year, festival ID, festival type, the festival program, shopping opportunities, food, advanced publicity, travel guide, and cultural content. We used this data set to achieve two goals: (1) to predict tourist satisfaction toward different festivals and (2) to build an explainable model that, by interpreting the corresponding tourist satisfaction rate, identifies the strengths and weaknesses of each festival.
First, we checked the proportions of missing values to investigate the completeness of the data sets. Figure 3a illustrates the original data, including missing values, through a missing-data visualization that gives a quick summary of data completeness. At a glance, we can see that a few observations have missing values in two variables: type and guide. We simply deleted these observations, as shown in Figure 3b, for two reasons: the proportion of observations with missing values is trivial, and we did not use the order of the observations because the order information is insignificant.
Second, to improve prediction accuracy, we handled the data distribution in a transformation phase, considering the quantile and log transformations. After several experiments, we selected the quantile transformation as the normalization procedure. As shown in Figure 4, the quantile transformation yields less skewed and less sparse distributions for each variable. Finally, we used min-max scaling, which puts all independent variables on an equal scale. After transforming all of the independent variables, we obtained both more accurate performance and faster convergence of the prediction models because the transformed data set reduced the sparse area in the data space [33].

Figure 4. Histograms of three significant variables (cultural content, transport accessibility, and advanced publicity) before and after quantile transformation. It can be seen that the distribution of sparse regions is greatly reduced.
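The preprocessing steps described above (deleting incomplete observations, applying a quantile transformation, then min-max scaling) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the rank-based form of the quantile transformation, and the sample rows are all assumptions made for the example.

```python
from bisect import bisect_left, bisect_right


def drop_missing(rows):
    """Return only the rows (dicts) in which no value is None."""
    return [row for row in rows if all(v is not None for v in row.values())]


def quantile_transform(values):
    """Map each value to its empirical quantile in [0, 1].

    Tied values share the midpoint of their rank range, so equal inputs
    always receive equal outputs.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n == 1:
        return [0.5]
    result = []
    for v in values:
        first = bisect_left(ordered, v)       # lowest rank holding v
        last = bisect_right(ordered, v) - 1   # highest rank holding v
        result.append((first + last) / 2 / (n - 1))
    return result


def min_max_scale(values):
    """Rescale values linearly so that the minimum is 0 and the maximum is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical survey rows (not from the study); None marks a missing answer.
rows = [
    {"program": 3.0, "guide": 4.1},
    {"program": 3.2, "guide": None},
    {"program": 3.1, "guide": 3.9},
    {"program": 9.0, "guide": 4.0},   # outlier in "program"
]
complete = drop_missing(rows)
program = min_max_scale(quantile_transform([r["program"] for r in complete]))
```

Because the quantile transformation depends only on ranks, the outlier 9.0 ends up at the top of the unit interval instead of stretching the scale, which is the robustness property the text attributes to this step.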

Exploratory Data Analysis
Here, we present the exploratory data analysis used to understand the data distribution and simple but significant data patterns. Figure 5 presents a scatter matrix showing associations between independent variables. The length of the rows and columns of the matrix represents the number of variables, and each cell plot in the matrix displays the scatter plot of the variables $X_i$ and $X_j$. Regarding year, no significant changes over time are apparent for any variable. Regarding festival types, the types of festivals held each year are similar, and no significant difference exists among them. As for the pairwise relations among the seven significant variables, the cell plots of the red dotted rectangle in Figure 5 illustrate that positive linear relations are observed in general.
As shown in Figure 6, we intuitively identified the highly correlated variable pairs, including {food, shopping opportunities}, {travel guide, festival program}, {shopping opportunities, festival program}, {food, festival program}, and {shopping opportunities, cultural content}. The results of the correlation matrix informed us that each variable is closely tied to improvements in tourist satisfaction toward local festivals. We found that the variable combination with the highest linear correlation among three different variables (including the dependent variable) was {festival program, food, tourist satisfaction}. Note that only linear correlations are visualized in this plot. Furthermore, Figure 7a shows a three-dimensional scatter plot with a grid-patterned hyperplane. It shows that the festival program is more important than food for raising tourist satisfaction, although food is also important. However, Figure 7b shows that, on the plot with cultural content, transport accessibility, and tourist satisfaction, it is difficult to infer clear linear correlations between the two independent variables (cultural content and transport accessibility) and the dependent variable (tourist satisfaction).
Figure 7. (a) Three-dimensional plots for festival program, food, and tourist satisfaction. (b) Three-dimensional plots for cultural content, transport accessibility, and tourist satisfaction.
Next, we explored whether the independent variables have sufficient explanatory power in our data sets. Here, we used a dimensionality reduction technique to conduct a multivariate data analysis. Figure 8 shows the results of the dimensionality reduction with a principal component analysis (PCA). Each figure presents a plot that was drawn while changing the number of principal components (PCs). Figure 8a indicates that the first two PCs account for over 80% of the total variance in the original data sets. Figure 8b-d show the results for three PCs, four PCs, and five PCs, respectively. Since PCA produces mutually uncorrelated PCs, it is natural that the scatter plot matrix does not appear to be correlated. Note that the purple-colored dots denote a higher tourist satisfaction rate, while the yellow-colored dots denote a lower rate. For each PC, we observed that the PC is linearly correlated with the dependent variable (i.e., tourist satisfaction), indicating that the independent variables can be used to predict the dependent variable in our data sets. Table 2 summarizes basic statistics for each variable.
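The pairwise linear correlations that underlie a correlation matrix such as the one in Figure 6 can be computed with the Pearson coefficient. The sketch below is illustrative only: the helper names and the five hypothetical festival scores are assumptions and do not reproduce the study's data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Build a nested dict {a: {b: r_ab}} for every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical scores for three variables across five festivals.
scores = {
    "program": [3.1, 3.4, 3.8, 4.0, 4.4],
    "food": [2.9, 3.3, 3.5, 4.1, 4.2],
    "satisfaction": [3.0, 3.5, 3.7, 4.2, 4.5],
}
corr = correlation_matrix(scores)
```

As the text notes, a coefficient of this kind captures only linear association, so a weak value does not rule out a nonlinear relation between a variable and tourist satisfaction.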

Prediction
This section outlines how the regression models were built to compare the performance of the representative machine learning algorithms. Figure 9 shows the data structure used to build the prediction models, which includes seven independent variables (festival program, shopping opportunities, food, advanced publicity, travel guide, transport accessibility, and cultural content) and two additional variables (year and festival type). Among these, the variable "year" was used to verify whether tourist satisfaction differed over time. To select the best model, we considered 17 representative machine learning algorithms: (1) linear regression models, including lasso [34], ridge [35], elastic net [36], and passive aggressive regressors [37]; (2) k-nearest neighbor [38]; (3) decision tree regressor; (4) support vector regressors (SVR), including SVR with linear or polynomial kernels and nu-SVR [39]; (5) bagging methods, such as bagging regressors and random forest regressors [40]; (6) boosting methods, such as an AdaBoost regressor, gradient boosting machines, and an XGB regressor [41]; and (7) deep neural networks [42].
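(1) The lasso, ridge, and elastic net linear regressors all identify the fitting function that minimizes the prediction error, each with a different regularization term that shrinks the parameters and their variance. The cost functions for these shrinkage methods are calculated as follows:
$$J_{\text{lasso}}(w) = \sum_{i=1}^{n} \left(y_i - w^\top x_i\right)^2 + \lambda_0 \sum_{j=1}^{p} |w_j|,$$
$$J_{\text{ridge}}(w) = \sum_{i=1}^{n} \left(y_i - w^\top x_i\right)^2 + \lambda_1 \sum_{j=1}^{p} w_j^2,$$
$$J_{\text{elastic net}}(w) = \sum_{i=1}^{n} \left(y_i - w^\top x_i\right)^2 + \lambda_0 \sum_{j=1}^{p} |w_j| + \lambda_1 \sum_{j=1}^{p} w_j^2,$$
where $\lambda_0$ and $\lambda_1$ denote weight hyperparameters, $n$ denotes the number of observations, $p$ denotes the number of variables, and $w$ is the parameter vector to be learned. Although our data set has few independent variables, the experiments were performed using all three methods to evaluate their relative performance accuracy. The passive aggressive regressor is a first-order online learning method that updates the weight, $w$, to optimize the following equation:
$$w_{t+1} = \arg\min_{w} \frac{1}{2}\lVert w - w_t \rVert^2 \quad \text{subject to} \quad \ell_\varepsilon\left(w; (x_t, y_t)\right) = 0,$$
where $w_t$ denotes the parameter to be learned at time $t$ and $\ell_\varepsilon$ is the $\varepsilon$-insensitive loss. We aggressively update $w_t$ when the loss is nonzero as $w_{t+1}$:
$$w_{t+1} = w_t + \operatorname{sign}\left(y_t - w_t^\top x_t\right) \tau_t x_t,$$
where $\tau_t$ is the learning rate at time $t$, and $x_t$ denotes the sample.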
(2) The k-nearest neighbor algorithm is a nonparametric method that uses the k nearest training samples in the feature space. The k-nearest neighbor regressor predicts the dependent variable by averaging the values of its k nearest neighbors.
(3) The decision tree algorithm is also a well-known nonparametric method with a tree-like structure. We used classification and regression tree (CART) techniques, which recursively divide data into sets of rectangular regions and model the distribution of the dependent variables in order to make predictions [43].
(4) Support vector machines (SVM) are a supervised learning method that efficiently handles high-dimensional data. SVRs embed the independent variables onto a high-dimensional feature space to build a linear regressor [44]. The objective function of the SVR is defined as follows:
$$\min_{w, b, \xi, \xi^*} \; \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \left(\xi_i + \xi_i^*\right)$$
$$\text{subject to} \quad y_i - w^\top \phi(x_i) - b \le \varepsilon + \xi_i, \quad w^\top \phi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \quad \xi_i, \xi_i^* \ge 0,$$
where $\xi_i$ and $\xi_i^*$ are slack variables that denote the deviation from the support vector margin, $\varepsilon$ is the threshold for lower error sensitivity in the training data set, $C$ is the penalty weight, and $\phi(\cdot)$ is the feature mapping.
(5) Bagging methods build an ensemble of multiple weak learners by manipulating the training data. Among these, the random forest is a representative ensemble algorithm that constructs multiple decision trees to avoid overfitting. As for regression, the random forest makes predictions by averaging the predicted values of its individual decision trees [40]. The random forest technique is well known for being robust against noise and overfitting problems.
(6) Boosting methods produce a predictive model by combining weak learners produced in an iterative fashion [41]. Among these, the light gradient boosting machine (LGBM) learning method has shown highly accurate performance in various fields.
LGBM has two benefits: (1) it achieves higher accuracy than other boosting approaches, such as eXtreme Gradient Boosting or AdaBoost, by enabling more complex leaf-oriented split trees, and (2) it is faster in the training phase and offers high efficiency in terms of gradient descent [45].
(7) Finally, artificial neural networks have recently received increased attention because of the impressive accuracy of their predictions. These networks build a cascade of several layers for linear and nonlinear processing to perform representation learning [42].
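For the linear models in item (1), the three shrinkage costs and a single passive aggressive update can be written out directly. The sketch below assumes the standard textbook forms (a squared-error term plus an L1 penalty weighted by `lam0`, an L2 penalty weighted by `lam1`, or both, and the basic passive aggressive step with an epsilon-insensitive loss); the function names and the exact scaling of the penalties are assumptions and may differ from the library implementations used in the study.

```python
def predict(w, x):
    """Linear prediction w . x for one sample."""
    return sum(wj * xj for wj, xj in zip(w, x))


def squared_error(w, X, y):
    """Sum of squared residuals over all samples."""
    return sum((yi - predict(w, xi)) ** 2 for xi, yi in zip(X, y))


def lasso_cost(w, X, y, lam0):
    """Squared error plus an L1 penalty on the weights."""
    return squared_error(w, X, y) + lam0 * sum(abs(wj) for wj in w)


def ridge_cost(w, X, y, lam1):
    """Squared error plus an L2 penalty on the weights."""
    return squared_error(w, X, y) + lam1 * sum(wj ** 2 for wj in w)


def elastic_net_cost(w, X, y, lam0, lam1):
    """Squared error plus both the L1 and the L2 penalties."""
    return (squared_error(w, X, y)
            + lam0 * sum(abs(wj) for wj in w)
            + lam1 * sum(wj ** 2 for wj in w))


def passive_aggressive_step(w, x, y, eps=0.1):
    """One online update: stay passive if the error is within eps,
    otherwise move w just far enough to bring the error back to eps."""
    residual = y - predict(w, x)
    loss = max(0.0, abs(residual) - eps)      # epsilon-insensitive loss
    norm_sq = sum(xj * xj for xj in x)
    if loss == 0.0 or norm_sq == 0.0:
        return list(w)                        # passive: no change
    tau = loss / norm_sq                      # step size at this time step
    direction = 1.0 if residual > 0 else -1.0
    return [wj + direction * tau * xj for wj, xj in zip(w, x)]
```

The L1 term tends to drive some weights exactly to zero, while the L2 term shrinks all weights smoothly; the elastic net combines the two effects.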
To obtain reliable hyperparameter settings, a 10-fold cross validation with grid search was used to minimize the mean squared error (MSE) for each model with the given data set. We measured three performance metrics: $R^2$, adjusted $R^2$, and MSE. $R^2$ is a statistical measure of the proportion of the variance in the dependent variable that is explained by the predicted values. Adjusted $R^2$ is a modified measure of $R^2$ that adjusts for the number of independent variables in the trained model and is used to correct for overestimation. Finally, MSE is calculated as follows:
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2,$$
where $y_i$ is the observed value, $\hat{y}_i$ is the predicted value, and $n$ is the number of observations.
Note that the reason we built so many predictive methods was to ensure that the predictors could determine tourist satisfaction with a high level of accuracy. The accuracy of the predictors must be confirmed because their performance directly connects to the next research step (i.e., XAI) to identify the true strengths and weaknesses of each local festival. The less accurate the predictor is, the more likely we are to arrive at incorrect characteristics for the local festivals. Table 3 presents the adjusted $R^2$, $R^2$, and MSE results for the 17 regressors used in this study. Overall, the light gradient boosting regressor outperforms the others on all three metrics. Therefore, based on these experiments, we selected the light gradient boosting regressor as our predictive model for tourist satisfaction. In the next section, we present the decomposition of the prediction for each festival in a step-by-step manner.
In addition, we present the critical hyperparameter settings for the algorithms. Regarding bagging and boosting-based predictors, we set the number of weak learners to 260-300, used the bootstrap procedure, and explored the adequate max depth of the tree-based learners from 3 to 20.
For the decision tree, we considered tree heights from 4 to 20 and used Gini impurity as the performance metric. Regarding kNN, we set the number of neighbors to 4 and used the Manhattan distance to measure the similarity between observations. As for linear models such as lasso, ridge, and elastic net, we identified the optimal hyperparameter values as $\lambda_0 = 0.13$ (lasso), $\lambda_1 = 0.02$ (ridge), and $\lambda_0 = 0.0015$, $\lambda_1 = 0.00001$ (elastic net). In SVM, the hyperparameter values were C = 10, gamma = 0.0001, and kernel = "radial basis function". For neural networks, the model contained one input layer, three hidden layers, and one output layer. Regarding the hidden layers, dropout (probability = 0.1) and rectified linear units were used to prevent overfitting. For the other hyperparameters, the default values were used. Overall, the random state of each algorithm was set to 2021 for reproducibility.
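The three evaluation metrics used to compare the regressors can be computed as follows. This is a minimal sketch using the usual definitions; the function names are illustrative, and `p` stands for the number of independent variables in the trained model.

```python
def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    n = len(y_true)
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r2(y_true, y_pred, p):
    """R^2 penalized for the number of independent variables p.

    Requires more observations than p + 1.
    """
    n = len(y_true)
    return 1.0 - (1.0 - r2(y_true, y_pred)) * (n - 1) / (n - p - 1)
```

Adjusted $R^2$ is never larger than $R^2$ for the same predictions, and the gap widens as more independent variables are added without a matching gain in fit.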

Shapley Additive Explanations
Here, we explain the variable significance for each observation regarding our predictions. Before describing how we used XAI in our study, we present the basics of SHAP, which is one of the most popular XAI frameworks. Based on game theory, SHAP explains the output of a predictive model. To provide a model's explanation capability, SHAP uses an additive feature attribution technique, defining the output model as a linear addition of the contributions of the independent variables $X = \{x_1, x_2, \ldots, x_p\}$, where $p$ is the number of variables. Here, we define the predictive model as $f(\cdot)$ and the explainable model as $g(\cdot)$; this can be formulated as follows:
$$g(z') = \phi_0 + \sum_{i=1}^{p} \phi_i z'_i,$$
where $z'$ denotes the transformed independent variables, $z' \in \{0, 1\}^p$, for all Shapley values $\phi$. As shown, $g(\cdot)$ is a linear function that can be obtained by summing $\phi_i$. To account for absences, $\phi_0$ is the constant value when all of the independent variables are missing. To obtain $g(\cdot)$, we used the equation below:
$$\phi_i = \sum_{z' \subseteq x'} \frac{|z'|!\,\left(p - |z'| - 1\right)!}{p!} \left[ f_x(z') - f_x(z' \setminus i) \right],$$
where $x'$ is the transformed input for the observation being explained, $f_x(z')$ is the prediction of $f$ when only the variables with $z'_i = 1$ are present, $|z'|$ is the number of nonzero variables, and $z' \setminus i$ denotes setting $z'_i = 0$. This formula determines $g(\cdot)$ uniquely.
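To make the additive attribution concrete, the sketch below computes exact Shapley values for a small model by enumerating every subset of variables, using the classical Shapley formula. It is a simplified illustration: an "absent" variable is replaced by a fixed baseline value, which is an assumption made here for brevity (the SHAP library estimates absence from background data), and the toy model and its coefficients are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x; absent variables take baseline values.

    Returns (phi_0, [phi_1, ..., phi_p]) such that
    phi_0 + sum(phi) equals f(x).
    """
    p = len(x)

    def value(present):
        z = [x[j] if j in present else baseline[j] for j in range(p)]
        return f(z)

    phi = []
    for i in range(p):
        others = [j for j in range(p) if j != i]
        total = 0.0
        for size in range(p):
            # Weight of a coalition of this size that does not contain i.
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return value(set()), phi


# Hypothetical model of satisfaction from (program, food) with an interaction.
def model(z):
    return 1.0 + 2.0 * z[0] + 0.5 * z[1] + 1.0 * z[0] * z[1]


phi0, phi = shapley_values(model, x=[1.0, 1.0], baseline=[0.0, 0.0])
```

The enumeration visits all $2^{p-1}$ coalitions per variable, so it is only practical for a handful of variables; for the seven to nine variables used in this study it remains feasible, but tree-specific or sampling-based estimators are the usual choice in practice.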
Using the SHAP framework, we proceeded in two steps: building a prediction model and building an explainable model based on the given data set {x, y}. Figure 10 illustrates the differences between the two steps. First, we built an accurate prediction model $f(x)$ by minimizing the sum of the squared residuals (SSR) function between the observed $y$ and the predicted $\hat{y}$. We then built the explainable model by interpreting how $f(x)$ predicts $\hat{y}$. SHAP, a model-agnostic method, then allowed us to decompose the prediction results and indicate which variables have a relatively significant impact score, $\phi_i$, for the $i$th variable in the predictions. SHAP uses the Shapley value $\phi$, which denotes the mean of the marginal contributions across all permutations of the variables in the predictions.
Figure 11 provides a summary plot that presents the explainability of the overall SHAP values. The independent variables are ordered according to their predictive contributions, and the colors in the plot illustrate the Shapley value of each independent variable for each observation. Figure 11 shows that the festival program is the most important indicator for the success of a local festival. Put another way, tourist satisfaction with the festival program has a considerable influence on overall satisfaction with the festival. Lower program scores lead to lower tourist satisfaction, and higher program scores lead to higher tourist satisfaction. Advanced publicity and travel guide are the next most important variables for tourist satisfaction. In particular, we found that tourists underestimate a festival if the travel guide scores low. However, the influences of advanced publicity and travel guide are quite limited, as shown by their respective SHAP values. In contrast, a positive evaluation of the food offered at a festival has a positive influence on tourist satisfaction.
In terms of cultural understanding (cultural content), the variable's effect on overall satisfaction is not simply linear, as the figure also presents an inverse relation. The results of the remaining independent variables are insignificant, and the results of the year variable indicate that no significant change in satisfaction occurred over the 12-year study period.

Figure 11. SHAP summary plot for predictors of tourist satisfaction.
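The ordering of variables in a summary plot of this kind can be sketched as a ranking by mean absolute SHAP value across observations, which is a common convention and is assumed here rather than confirmed by the study. The variable names and all numbers below are invented for illustration and are not results from the festival data.

```python
def rank_by_mean_abs_shap(shap_rows, names):
    """Rank variables by their mean absolute SHAP value across observations."""
    n = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(names)
    }
    # Largest average magnitude first, as at the top of a summary plot.
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


# Hypothetical SHAP values: one row per observation, one column per variable.
names = ["program", "publicity", "guide", "food"]
shap_rows = [
    [0.40, -0.10, 0.05, 0.02],
    [-0.50, 0.08, -0.12, 0.01],
    [0.30, -0.06, 0.04, -0.03],
]

ranking = rank_by_mean_abs_shap(shap_rows, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking the absolute value before averaging matters: a variable that pushes some predictions up and others down, such as the hypothetical program column here, would otherwise appear unimportant because its positive and negative contributions cancel.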

Results
We next investigated which variables had a major influence in the cases where the tourist satisfaction score is high and where it is low. To do so, we first divided the observations into two groups based on tourist satisfaction: overestimated and underestimated. Tourist satisfaction was then predicted using the predictive model f(·), and we measured the SHAP values for each variable using the explainable model g(·). Figures 12-15 present the festival cases in which the predictions of tourist satisfaction were underestimated. As seen here, tourists largely identified the festival's program and advanced publicity as the main reasons for a festival's low score. As such, these charts illustrate the weaknesses of these festivals, providing festival planners with insight into which areas should be improved.

In contrast, Figures 16-19 present the festival cases in which the predictions of tourist satisfaction were overestimated, thereby providing insight into the strengths of each festival. Finally, Figure 20 shows the interaction plots for the single effect of the shopping opportunities and food variables, which are positively correlated with tourist satisfaction; however, the festival program continues to have a large influence on tourist satisfaction.
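Reading strengths and weaknesses from the SHAP values of a single festival can be sketched as follows. The sketch assumes that negative SHAP values mark variables pulling the predicted satisfaction below the base value (weaknesses) and positive values mark variables pushing it above (strengths). The function name, variable names, and numbers are hypothetical and do not come from the study.

```python
def strengths_and_weaknesses(shap_by_variable, top=2):
    """Split one observation's SHAP values into top strengths and weaknesses."""
    ordered = sorted(shap_by_variable.items(), key=lambda item: item[1])
    # Most negative contributions first.
    weaknesses = [(name, v) for name, v in ordered if v < 0][:top]
    # Most positive contributions first.
    strengths = [(name, v) for name, v in reversed(ordered) if v > 0][:top]
    return strengths, weaknesses


# Hypothetical SHAP values for one festival observation.
festival_shap = {
    "program": -0.45,
    "publicity": -0.12,
    "food": 0.08,
    "shopping": 0.03,
    "guide": -0.02,
}

strengths, weaknesses = strengths_and_weaknesses(festival_shap)
print("weaknesses:", weaknesses)
print("strengths:", strengths)
```

For this invented festival, the program and publicity variables would be flagged as the areas to improve, while food and shopping would be reported as its strengths.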

Conclusions
This study aimed to predict and explain tourist satisfaction with local festivals by identifying the significant variables, thereby enabling festivals to establish an adequate tourism strategy. We built various machine learning models and compared them to assess both their predictive performance and their explanation accuracy. Subsequently, we reviewed the explanations of the predictive results and presented the strengths and weaknesses of each local festival. The proposed approach is a practical way to reduce the uncertainty involved in revitalizing tourism at local festivals, as it identifies the important variables of local festivals and provides a deeper understanding of their success factors. The experimental results of the XAI demonstrate that the prediction and explanation results offer valuable insights for identifying the problems of local festivals and their potential solutions.
The main contributions of our study are twofold. First, we proposed machine learning-based festival estimation models that include both predictive and explainable models. The proposed methods not only help to identify the key success factors for each local Korean festival but also explain which factors should be improved to capture the attention of more tourists. Second, experimental results based on real data collected over 12 years demonstrated the applicability and effectiveness of our approach. Therefore, the proposed approach could be useful for promoting strategies to attract tourists, thereby contributing to the success of local festivals.
There are some limitations to our study. First, we did not consider time series patterns such as increases in preference for specific types of local festivals or increases and decreases in the number of foreign tourists. We assumed stationary conditions for all of the variables in order to build simple models of the relationships between the significant variables for local festivals and tourist satisfaction. Second, our approach is only designed to estimate and explain tourist satisfaction and does not consider exogenous variables such as complex economic effects within the local cities, the effect of an increase or decrease in the number of foreign and native festival visitors, and the support from the local administration. Finally, recent local festivals have been greatly affected by COVID-19, experiencing problems such as (1) fewer tourists, (2) program restrictions, and (3) budget cuts [46]. Future work should identify the critical effects of a serious pandemic such as COVID-19 on local festivals.
Regarding future research, we have two plans: (1) a methodological approach and (2) a sustainability approach. First, we plan to discover the deep causal relationships between the significant variables of local festivals over time. We will extend the predictive approach to address the complex relations between many other variables in order to improve the applicability and the prediction robustness of the proposed method. Further, future studies may extend this XAI-based approach to model other prediction-based tourism research. Second, our study can be extended to resolve potential problems in the area of sustainability. For overtourism especially, the proposed method can be adopted to diagnose its causes. We can establish a strategy to disperse tourists from overcrowded areas by identifying the critical causes for the adjacent less-popular areas. Moreover, we can use the proposed approach to suggest the optimal use of the budget to revitalize tourism. Finally, we can establish an integrated strategy for dispersing tourists by combining the survey results of adjacent cities, especially in cases where a tourism imbalance problem exists. Using a multi-task method, our approach can help to combine multiple survey results and to represent globally optimal solutions for tourism strategies.