Article

Super Typhoon Rai’s Impacts on Siargao Tourism: Deciphering Tourists’ Revisit Intentions through Machine-Learning Algorithms

by Maela Madel L. Cahigas 1,*, Ardvin Kester S. Ong 1 and Yogi Tri Prasetyo 2,3
1 School of Industrial Engineering and Engineering Management, Mapúa University, 658 Muralla St., Intramuros, Manila 1002, Philippines
2 International Bachelor Program in Engineering, Yuan Ze University, 135 Yuan-Tung Rd., Chung-Li, Taoyuan 32003, Taiwan
3 Department of Industrial Engineering and Management, Yuan Ze University, 135 Yuan-Tung Rd., Chung-Li, Taoyuan 32003, Taiwan
* Author to whom correspondence should be addressed.
Sustainability 2023, 15(11), 8463; https://doi.org/10.3390/su15118463
Submission received: 1 May 2023 / Revised: 16 May 2023 / Accepted: 17 May 2023 / Published: 23 May 2023
(This article belongs to the Section Tourism, Culture, and Heritage)

Abstract

Super Typhoon Rai damaged Siargao’s tourism industry. Despite the reconstruction projects, there was still evidence of limited resources, damaged infrastructure, and destroyed natural resources. Therefore, this study aimed to examine the significant factors influencing tourists’ intentions to revisit Siargao after Super Typhoon Rai using feature selection, logistic regression (LR), and an artificial neural network (ANN). It employed three feature-selection techniques, namely, the filter method’s permutation importance (PI), the wrapper method’s Recursive Feature Elimination (RFE), and the embedded method’s Least Absolute Shrinkage and Selection Operator (LASSO). Each feature-selection technique was integrated into LR and the ANN. LASSO-ANN, with a 97.8146% model accuracy, was found to be the best machine-learning algorithm. The LASSO model performed at its best with a 0.0007 LASSO alpha value, resulting in 35 subfeatures and 8 primary features. LASSO subsets underwent the ANN model procedure, and the optimal parameter combination was 70% training size, 30% testing size, 30 hidden-layer nodes, tanh hidden-layer activation, sigmoid output-layer activation, and Adam optimization. All eight features were found to be significant. Among them, hedonic motivation and awareness of Typhoon Rai’s impact were considered the top-tier post-typhoon tourism factors, as they maintained at least 97% prediction accuracy. The findings could be elaborated by combining feature-selection techniques, utilizing demographic characteristics, assessing Siargao’s tourism before the typhoon, and expanding the context and participant selection. Nevertheless, none of the existing studies explored the combination of feature selection, LR, and ANNs in a post-typhoon tourism context. These unique methods and significant findings represent the study’s novelty. Furthermore, practical contributions were provided through economic resolutions focusing on tourism activities and communication revamping by the government, media outlets, and transportation companies.

1. Introduction

In December 2021, the Philippines recorded the strongest typhoon of 2021 in the country. Super Typhoon Rai, also known as Odette, transformed from Category 1 to Category 5 within 24 h [1]. It brought heavy rains, winds, floods, landslides, and storms. Approximately 40.54% of the Philippines’ entire population was affected by the typhoon [2]. Among the affected regions, Siargao experienced the first landfall and was the hardest-hit area [1,3]. News outlets reported mortality, injuries, missing cases, and health issues [1]. In addition to human lives, the regions’ livelihoods were also affected. The island’s properties and residents’ livelihoods incurred damages of approximately USD 388 million [4]. Moreover, renowned insurance companies categorized Super Typhoon Rai as the deadliest typhoon in 2021 [1]. The overall and recent impact of Super Typhoon Rai convinced the researchers to focus on it out of all the natural disasters in the Philippines. Hence, the research problem originated from the devastating aftermath of Super Typhoon Rai because tourism is the primary source of livelihood in Siargao. These post-typhoon tourism causes, impacts, and resolutions are better explained through the application of machine learning.
Machine learning handles model complexity by transforming data and solving designated problems [5]. Among all machine-learning concepts, this study addresses the research problem by utilizing feature selection, logistic regression (LR), and an artificial neural network (ANN). Feature selection prioritizes critical features over irrelevant features [5]. In this study, feature selection was utilized to determine significant features before they undergo LR and the ANN. The researchers maximized important features in LR and the ANN to identify the factors affecting Siargao tourists’ revisit intentions after Super Typhoon Rai hit Siargao. LR is a supervised machine-learning method that applies the sigmoid function through binary variables [6,7]. Meanwhile, the ANN describes functional relationships among the model’s features and investigates human behavior in different contexts [6,7,8,9].
Although machine learning is widely used in various contexts, including tourism and post-disaster regions, too few academic papers focus on the application of feature selection, LR, and ANNs to both tourism and post-typhoon response. For example, Tien Bui et al. [10] modified the LR model to address forest-fire and park tourism protection. However, only 10 features were utilized, which undermined the model’s accuracy rate. Moreover, they only considered forest-fire features and no features for tourism. Next, a typhoon-related study by Chen et al. [11] concentrated on LR and overlooked the significance of comparing it with other algorithms. While they generated landslide-susceptibility maps, the past study did not discuss landslide impacts or the course of action after the typhoon triggered multiple landslides. Tsaur et al. [12] applied both LR and an ANN to assess tourists’ loyalty to hotels. However, they only utilized typical hotel features, did not employ feature selection, and focused on model fitting instead. These existing studies failed to provide an in-depth analysis of feature selection, LR, and ANNs in a post-disaster tourism context. The lack of solidity in the past studies’ methods stemmed from the absence of a conceptual model. In the present study, the researchers addressed the gap by utilizing the extended theory of planned behavior model, as it best described tourist behavior after a super typhoon.
In addition, ANN techniques are frequently used in predicting tourist demands [13,14,15]. However, the present study argues that ANNs can also be used to investigate factors influencing tourist behavior. The following studies failed to elaborate on tourists’ perceptions. One study focused on international tourists’ overnight stays in one of Europe’s well-known tourist destinations [13]. Another study found a connection between Macau’s tourist-demand factors and arrival volumes [14]. Some researchers forecasted tourist demands a year after the COVID-19 pandemic surge [15]. Although the tourism context is represented in these past studies, none of them used various machine-learning techniques to assess post-typhoon tourism response. Furthermore, they only employed simplified data acquisition and machine learning. For instance, Law et al. [14] focused on search-engine keywords, which were prone to bias as they could result in false positives. Unlike the past study, the current study applies actual responses and employs distinct feature-selection techniques. Meanwhile, Talwar et al. [16] evaluated Japanese residents’ traveling behavior during and after the pandemic. They utilized the big-five personality traits as the primary inputs of an ANN. However, they considered all 25 features and overlooked feature selection. Similarly, tourist behavior was analyzed through an ANN [8]. However, the researchers derived features and website reviews, which hindered the application of feature selection. Although Xu et al. [17] utilized feature engineering to scrutinize landslide deformation appearances, they only focused on the causal patterns. On the other hand, researchers maximized convolutional neural networks (CNNs), one of the classes of ANNs, to locate earthquake survivors from captured images [18]. Their research was centered on disaster management, contrary to the current study, which aimed to combine disaster response and tourism recovery. 
In these past studies, ANNs were employed for optimization purposes, and other purposes of ANNs, such as predicting significant factors, were overlooked. More importantly, the studies focused on one algorithm (e.g., an ANN) and optimized ANN parameter settings instead of testing other algorithms that are comparable to ANNs. Thus, the past studies’ approaches are characterized by subjective analysis. The present study eliminated bias issues by incorporating three types of feature-selection techniques and LR.
The preceding research-gap discussion leads to the following research questions: (1) How can feature selection integrated with LR and an ANN identify Siargao tourists’ behavior after Super Typhoon Rai?; (2) What is the best combination of machine-learning techniques among feature selection, LR, and ANNs?; (3) What is the optimal parameter of the best machine-learning algorithm combination?; and (4) What are the significant factors affecting tourists’ intentions to revisit Siargao after Super Typhoon Rai?
Hence, the researchers aimed to examine the factors influencing Siargao tourists’ revisit intentions after Super Typhoon Rai using feature selection, LR, and an ANN. As of this writing, this is the first study that has utilized feature selection, LR, and an ANN to decipher tourists’ behavior after a super typhoon. This combination of machine-learning algorithms and the evaluated context have not yet been explored by any academicians, as indicated by the research gap. The study’s novelty can benefit academicians focusing on human behavior, natural-disaster impacts, and machine learning. The findings of this research can also alleviate the economic problems of the government sector, Siargao business owners, other tourism-related commercial companies, and residents.
The remainder of the article is organized as follows. Section 2 discusses relevant studies on machine-learning algorithms and conceptual features. Section 3 explains the data collection and processes behind the three machine-learning algorithms. Then, Section 4 presents the results generated from the utilized methods. Next, Section 5 interprets the present findings by comparing them with those of past studies, determining stakeholders and academic contributions, and identifying further improvements. Finally, Section 6 concludes with a consideration of the study’s innovative methods, findings, and benefits.

2. Literature Review

This section is subdivided into four parts. The first part discusses the concept of feature selection and the three feature-selection techniques. The second part explains logistic regression, its application, and relevant studies. The third part shows related artificial neural network studies in the context of tourism or typhoons. Meanwhile, Table S1 in the Supplementary Materials summarizes the journals by showing their significance with respect to tourism, typhoons, and machine learning. Finally, the fourth part displays the origin of the model’s 8 features and their 59 subfeatures.

2.1. Feature Selection

Feature selection selects the best set of features, also known as the subset, among all probable combinations [5]. This technique assures the quality of data by removing unnecessary features in the study’s context [19]. Caraka et al. [20] utilized feature selection as a pre-processing technique before applying multivariate analysis in identifying visiting intentions in Indonesia. Yuan et al. [21] revealed that feature selection assisted in the pattern identification of tourists’ activities. Meanwhile, Sheykhmousa et al. [22] used feature-selection results as the primary feed for a designed machine-learning algorithm centered on the Philippines’ post-disaster recovery plans. The resulting feature subset from the study of Tien Bui et al. [23] underwent three neural network designs to compare flash-flood susceptibility factors. However, feature selection with statistical fundamentals produces subsets that are more difficult to interpret [24]. Considering the efficiency and accuracy of data interpretation, the present study employed feature-selection techniques equipped with machine learning. In particular, the researchers assessed the three feature-selection categories in machine learning, which are the filter, wrapper, and embedded methods.
The filter method considers features that are highly correlated with the dependent variable [25]. It is the simplest feature-selection method, which ranks features based on the strength of their relationships [26]. In this study, features need to have a strong connection with tourists’ intentions to revisit because weak features are eliminated from the model. Permutation importance (PI) was the best filter method for researchers aiming to compare one filter method with another feature-selection technique [27]. It does not assume the model’s nature and removes biases [28]. Specifically, permutation importance rearranges features randomly until a high predictive importance score is achieved [29]. The researchers found these characteristics to be relevant to the study’s objectives. Since the researchers aimed to integrate permutation importance into the most-suitable machine-learning algorithm, all types of algorithms could be utilized in the present study. Muñoz et al. [30] trained eight models by applying permutation importance. They revealed the contributions of nine factors to the respective models by identifying vital local and international tourism factors in Southern Norway. Li et al. [31] recalculated model accuracy by employing permutation importance. They ranked features accordingly and identified important power-interruption factors induced by the typhoon. Kim et al. [32] investigated landslide features dependent on the model. They ran the model twice and found 11 important features affecting landslides in Gangwondo, Korea.
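To make the idea of permutation importance concrete, the following is a minimal sketch of one commonly used formulation: a feature column is shuffled, and the resulting drop in accuracy is taken as that feature's importance. The toy data, the stand-in classifier `toy_model`, and the parameter `n_repeats` are assumptions made for illustration; they are not taken from the study, and the authors' exact procedure may differ.

```python
import random

def accuracy(predict, X, y):
    """Share of rows for which predict(row) equals the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the data with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical toy data: feature 0 drives the label, feature 1 is pure noise.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [1 if row[0] > 0.5 else 0 for row in X]

def toy_model(row):
    # Stand-in for a fitted classifier; it only looks at feature 0.
    return 1 if row[0] > 0.5 else 0

scores = permutation_importance(toy_model, X, y)
```

In this sketch, shuffling the noise feature leaves accuracy unchanged, so its score is zero, while shuffling the informative feature costs a large share of the accuracy.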
Next, the wrapper method evaluates the most-important features through its built-in search techniques dependent on the predictor [19]. It is bounded by the evaluation function, search technique, and predictor. Apart from determining the quality of features, the wrapper method has a huge memory capacity, as it can be extended for optimization and forecasting problems [33]. Out of all wrapper techniques, the present study used Recursive Feature Elimination (RFE). RFE trains the subsets continuously until all unimportant features are removed from the model [24]. It is a frequently used method and is used every time researchers compare the wrapper technique with other feature-selection types [25]. In addition, RFE can be combined with another feature-selection technique to increase the accuracy result. A study disclosed that RFE produced the lowest error rate among all the wrapper methods [34]. Since the past studies support RFE’s effectiveness, the current study analyzed post-typhoon and tourism features using the wrapper method’s RFE. Kołakowska and Godlewska [35] reduced the number of tourism features by generating a new subset derived from important RFE predictors. They considered factors that mostly affected tourist traffic and trip prices. Additionally, Xiao et al. [36] proposed a new algorithm but noted that RFE defeated the existing machine-learning algorithms as it reduced 21 soil features to 12 important features. The past study wanted to mitigate climate change that could induce natural-disaster aftermaths through unwanted soil properties.
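The elimination loop behind RFE can be sketched as follows. This is an illustrative pure-Python version under stated assumptions: the predictor is a small logistic regression fitted by gradient descent, a feature's importance is the absolute value of its weight, and the toy data are synthetic. The helper names `fit_logistic` and `rfe` are hypothetical, and none of this is the authors' implementation.

```python
import math
import random

def fit_logistic(X, y, lr=0.5, epochs=300):
    """Batch gradient descent for logistic regression; returns weights (no bias term)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [0.0] * d
        for row, label in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, row))
            p = 1.0 / (1.0 + math.exp(-z))
            for j in range(d):
                grad[j] += (p - label) * row[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def rfe(X, y, n_keep):
    """Refit, drop the feature with the smallest |weight|, repeat until n_keep remain."""
    remaining = list(range(len(X[0])))
    while len(remaining) > n_keep:
        X_sub = [[row[j] for j in remaining] for row in X]
        w = fit_logistic(X_sub, y)
        weakest = min(range(len(remaining)), key=lambda k: abs(w[k]))
        remaining.pop(weakest)
    return remaining

# Hypothetical toy data: features 0 and 2 determine the label, 1 and 3 are noise.
rng = random.Random(1)
X = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(200)]
y = [1 if row[0] + row[2] > 0 else 0 for row in X]

selected = rfe(X, y, n_keep=2)  # indices of the surviving features
```

The key design point is that the model is refitted after every removal, so the ranking of the remaining features is always judged in the context of the current subset.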
Lastly, the embedded method ranks the importance of each feature by minimizing prediction error [19]. It uses regularization techniques that penalize insignificant variables in the model [26]. The primary advantage of embedded feature selection is the combination of the characteristics of the filter and wrapper methods [5]. Thus, it can be applied to different sets of data easily. An example of the embedded method is the Least Absolute Shrinkage and Selection Operator (LASSO). This removes unimportant features by imposing penalties dependent on the estimated regularization parameters [33,34]. Unimportant features are penalized by imposing zero coefficients [37]. Hence, features with residual values are considered part of the important subset. Unlike other embedded methods that produce feature combinations with overly correlated values, the LASSO approach minimizes overfitting and maintains a high prediction accuracy [33]. The researchers considered the mentioned advantages of using LASSO instead of other embedded feature-selection techniques. Through the LASSO method, Kołakowska and Godlewska [35] discovered the changes in tourists’ perceptions before and after the pandemic. In particular, tourists in Poland were perceived to value reimbursements, feedback, assurance, and comparison features. Chang et al. [38] investigated 13 hotel features that affect tourists’ travel and revisit intentions. They found that LASSO could reduce the features to 8 and 10, and 10 hotel features had the highest accuracy rate when fed into other machine-learning algorithms. On the other hand, Jones et al. [39] selected preventive and triggering landslide factors while comparing different typhoons in the Philippines. They accentuated common landslide-related factors that are all present at four different time points.
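As a brief illustration of how the LASSO penalty drives coefficients to zero, the standard objective for a linear model can be written as below. The symbols ($y_i$, $x_i$, $\beta$, $n$, $p$) are generic notation rather than the source's, and the $1/(2n)$ scaling follows a common software convention; whether the study's reported alpha value refers to this exact scaling is an assumption.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\;
\frac{1}{2n} \sum_{i=1}^{n} \left( y_i - x_i^{\top}\beta \right)^{2}
\;+\; \alpha \sum_{j=1}^{p} \left| \beta_j \right|
```

A larger $\alpha$ sets more coefficients $\beta_j$ exactly to zero, and the features whose coefficients remain nonzero form the selected subset.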

2.2. Logistic Regression

Logistic regression (LR) is a type of supervised classification model which can predict an algorithm’s objectives [34,37]. It is also utilized to find the best overall model fitting through an accuracy score [40]. A study emphasized that LR is one of the best algorithms that can compete against ANNs [12]. Therefore, the researchers aimed to determine the best machine-learning algorithm between LR and ANNs. The purpose of utilizing both techniques is centered on predicting the intended outcome. In this case, the exploration of a good algorithm would determine factors affecting tourists’ intentions to revisit Siargao after a super typhoon. Another advantage of LR is the presence of variables’ unbiased weights, as this simplifies the model’s condition [11].
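For reference, the standard binary LR model maps a linear combination of features to a probability through the sigmoid function. The notation below ($\beta_0$, $\beta$, $x$) is generic and is not taken from the source.

```latex
P(y = 1 \mid x) \;=\; \sigma\!\left(\beta_0 + \beta^{\top} x\right),
\qquad
\sigma(z) \;=\; \frac{1}{1 + e^{-z}}
```

An observation is typically assigned to class 1 when this probability exceeds a chosen threshold, commonly 0.5.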
Despite the presence of the LR approach in past studies, the established algorithm was noncomprehensive. For example, Tsaur et al. [12] focused on LR’s importance score and did not assess accuracy, precision, recall, and F-1 score. Although Chen et al. [11] determined accuracy, precision, and recall, they only produced approximate values of 71–76% for all three LR aspects. This percentage range was deemed a low rate, which could be increased further by enhancing the algorithm. Tien Bui et al. [10] created a hybrid LR approach, but it was only limited to 10 features. Limited inputs undermine the established accuracy rate, since the number of features affects model fitting. Furthermore, it was seen from past studies that none of them combined tourism and post-super typhoons in the context of combined feature selection, LR, and ANNs. While Li et al. [41] compared logistic regression and random-forest methods, they only utilized one feature-selection technique centered on factors affecting flash floods. The objective of Guzzetti et al. [42] was met, as they established landslide warning systems, but the inputs fed into LR and ANN algorithms lacked credibility. They used historical data directly without any feature selection or other pre-processing techniques.
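The four evaluation measures named above (accuracy, precision, recall, and F1 score) can be computed from a binary confusion matrix as in the following sketch. The labels and predictions are made-up values for illustration only, and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels coded as 0/1."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Made-up example: 1 = greater intention to revisit, 0 = lesser intention.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

Reporting all four measures, rather than accuracy alone, shows whether a model's errors are concentrated in false positives or false negatives.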

2.3. Artificial Neural Networks

Artificial neural networks (ANNs) are predictive tools for human behavior [43]. They integrate the human brain and artificial intelligence models to analyze extensive datasets, which guarantees efficient speed and high accuracy [14]. Hence, the present study utilized the ANN approach to process complex data from the identified tourism-related features after a natural disaster. One of the ANN’s advantages is its flexibility to choose activation functions depending on the effect of the customized features on the overall accuracy [9]. This advantage helps to filter functions and feature combinations that would produce a better accuracy result. ANNs also ensure extensive results compared to conventional empirical models [44]. They eliminate traditional model analysis, and data can be used for prediction, classification, and data-segmentation purposes [40].
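The role of activation functions can be illustrated with a single forward pass through a network shaped like the configuration reported in the abstract: eight inputs, one hidden layer of 30 nodes with tanh activation, and a sigmoid output. This is only a sketch: the weights are random and untrained, the input vector is hypothetical, and training (e.g., with Adam) is deliberately omitted.

```python
import math
import random

def forward(x, W1, b1, W2, b2):
    """One forward pass: tanh hidden layer followed by a sigmoid output unit."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    z = sum(w * h for w, h in zip(W2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-z))

rng = random.Random(0)
n_inputs, n_hidden = 8, 30  # eight primary features, 30 hidden-layer nodes

# Random, untrained weights (assumption: small uniform initialization).
W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)] for _ in range(n_hidden)]
b1 = [0.0] * n_hidden
W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
b2 = 0.0

# Hypothetical normalized scores for the eight primary features.
x = [0.8, 0.6, 0.9, 0.3, 0.4, 0.7, 0.5, 0.6]
p_revisit = forward(x, W1, b1, W2, b2)  # a value strictly between 0 and 1
```

The tanh layer keeps hidden activations in the range (-1, 1), and the sigmoid output can be read as an estimated probability of the positive class, which suits a binary target such as revisit intention.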
Considering ANNs’ purposes, a few studies selected features based on reliable theories. For instance, Talwar et al. [16] considered the big-five personality traits theoretical model as its primary features that underwent an ANN procedure. In addition, Leong et al. [45] considered service-quality features to determine tourists’ loyalty to the airline industry. Meanwhile, some studies modified their ANN features according to historical data. Mikhailov and Kashevnik [8] considered driving-related features (e.g., distance, duration, speed, and acceleration) to construct the layers of an ANN model. Likewise, researchers created an ANN model for areas affected by landslides and utilized environmental features, such as morphology, drainage, geology, and soil [10]. Apart from identifying significant features, an ANN was also utilized to forecast tourist demands. Claveria and Torra [13] employed an ANN to increase the forecasting accuracy model by maximizing tourist behavioral data focused on Catalonia, Spain. Law and Au [44] forecasted Japanese tourist demands in traveling to Hong Kong through an ANN. Moreover, Palmer et al. [6] applied an ANN to predict tourism expenditure in the Balearic Islands, Spain.
Despite the promising functions of ANNs, the presented studies overlooked the importance of analyzing both subfeatures and features within the ANN models. This approach is only possible by integrating feature selection and ANNs, which the current study proposes. Most studies only applied ANNs to determine tourist loyalty, arrival, and expenditure [6,13,45]. While a few researchers evaluated other types of crises, such as the COVID-19 pandemic [16] and landslides [42], they overlooked the tourist behavior after a super typhoon hit a popular destination. These past studies did not pinpoint a specific travel destination and crisis variant (e.g., Delta COVID-19 or the Haiyuan landslide). Since these presented studies failed to consider the sustainability of tourism in times of natural disaster, the current study aims to close the research gap by integrating feature selection, LR, and an ANN.

2.4. Features from the Conceptual Model

This study assessed the following features: (1) awareness of Typhoon Rai’s impact, (2) crisis management, (3) hedonic motivation, (4) perceived travel constraints, (5) perceived travel risks, (6) attitude, (7) subjective norms, and (8) perceived behavioral control. Their summarized definitions are displayed in Table 1. Moreover, these eight features were adopted from the study of Cahigas et al. [46] to further elaborate their importance using machine-learning approaches. Although the past study identified significant factors affecting tourists’ revisit perceptions, it emphasized the significant relationships between features instead. In the present study, the 59 subfeatures were given the same importance as the features. Both subfeatures and features were evaluated individually, which made the analysis more independent and comprehensive. The comprehensive description of the 59 subfeatures is presented in Table S2 in the Supplementary Materials.
First, awareness of Typhoon Rai’s impact depicts the tourist’s level of understanding when the typhoon hit the Philippines. Awareness of the typhoon is an essential feature because it advises residents about pre-disaster and post-disaster updates [47]. Without updates, individuals would have difficulty translating calamity warnings and aftermath issues. Awareness in all forms of communication protects human lives and minimizes property damage [48]. Tourists acquire knowledge by reading and watching the news [46]. Their awareness also increases by attending seminars facilitated by non-governmental organizations [49]. Some individuals have adequate knowledge of natural calamities because of frequent exposure [50]. They tend to learn the risks and adapt to dangerous situations if they have first-hand experience of calamities [50]. Thus, tourists who had previous typhoon experience were more aware of the possible dangers caused by the typhoon.
Second, crisis management refers to the governmental and non-governmental programs to help victims of Super Typhoon Rai. This is a distinguished typhoon-related feature as it aids timely actions [18]. The government disseminates information, arranges rescue operations, and provides product supplies [47]. This shows that the government bridges the gap between the public and typhoon victims. The government also renovates damaged establishments, builds alternative infrastructures, and provides financial support to tourist businesspeople. Since most establishments and livelihood sources are centered on Siargao’s tourist spots, the government holds a greater responsibility to determine tourism-recovery plans [4]. The presented scenarios show that combined efforts among all concerned groups are geared to bring success in post-disaster tourism recovery.
Third, hedonic motivation is the influence of tourists’ positive emotions regarding revisiting Siargao. It primarily consists of emotional stimulators, such as enjoyment, relaxation, and amazement [51]. This study assessed hedonic motivation because it triggers the behavior of tourists. Supporting the current study, hedonic motivation had the highest direct impact on tourists’ intention to travel because of tourists’ eagerness to experience leisure activities [52]. Hedonic motivation is also an indicator of valued tourist spots because tourists would only pay attention if satisfaction were guaranteed [53]. These past studies showed the ripple effects of hedonic motivation on tourists’ behaviors and the identified tourist spots.
The fourth feature is perceived travel constraints. It describes all the existing physical and emotional limitations that would hinder tourists from visiting Siargao. Negative features are as important as positive features because both affect tourist behaviors. For example, tourists would not visit a tourist spot if there were a lack of credible travel agencies, if they were to feel uncomfortable visiting a typhoon-affected region, or if they were to encounter inadequate public transportation [46]. Apart from convenience, the financial factor is also a well-known travel constraint [51]. Furthermore, security provides safety for tourists, since there might be limited infrastructure and basic supplies [6].
The fifth attribute is known as perceived travel risks. Contrary to travel constraints that are deemed existing limits, travel risks are uncertain barriers for tourists because tourists may or may not experience them. Perceived travel risks are investigated because they cover all important factors affecting tourists’ behaviors. Examples of perceived travel risks are sanitary, environmental, and welfare concerns [51]. They pose a problem for tourists because devastated regions need time to reconstruct water pipes, public roads, and business establishments. Additionally, tourists’ emotional stress about the presented situation and the people around them contribute to perceived travel risks [46]. Psychological risks are uncertain because individuals respond to situations differently. Some tourists are keen on visiting a risky tourist spot, but they will organize things meticulously [16,52]. Tourists can only hypothesize about perceived travel risks while planning a trip, but the risks would be proven or debunked after visiting Siargao Island. Therefore, perceived travel risks have two different outcomes as they can urge tourists to support an affected tourist spot or choose another travel destination [15].
The sixth attribute is attitude. In the present study, attitude reflects the tourists’ personal opinions. The necessity of scrutinizing attitude stems from the behavior resulting from an individual’s perception. Interestingly, tourists’ positive attitudes dominated after the deadliest earthquake in China [54]. However, another study argued that tourists would avoid tourist spots affected by natural disasters and would choose to visit another place instead [55]. These relevant studies expressed that tourist attitudes may positively or negatively affect the intention to visit a destination affected by natural disasters.
The seventh attribute, subjective norms, denotes the degree of influence from other people’s insights. This study considered subjective norms as one of the features because tourists tend to seek guidance and companionship. Tourists’ subjective norms encompass the opinions of family members, friends, and society. Families and friends of typhoon victims would most likely travel to the affected region [55]. Opinions of other people who are not personally connected to the victims also matter because society pressures individuals to help the needy [56]. Although societal pressure has a negative connotation, it still aims to contribute positively to the travel destination’s post-typhoon recovery [47]. Hence, the effect of subjective norms is dependent on an individual’s environment because of diversified social phenomena [57].
The eighth attribute, perceived behavioral control, signifies the tourists’ hypothetical competence or incapability with respect to visiting Siargao after Super Typhoon Rai hit the island. If tourists have positive behavioral responses, they are deemed capable of visiting a damaged tourist spot [16]. Furthermore, perceived behavioral control reflects a tourist’s confidence to persist despite the challenges posed by a natural disaster [56]. However, tourists who are vulnerable to weaknesses and restrictions would have less intention to travel [58]. Natural resources also contribute to tourists’ behavioral control [51]. These resources are products of nature, but humans have limited control over them. Tourists are primarily concerned about destination safety and alternative courses of behavior [57]. Therefore, tourists have the decision authority over their travel plans.

3. Methodology

The methodology section comprises four subsections. First, the data were collected and pre-processed. Second, the pre-processed data underwent three feature-selection techniques. The third and fourth steps could be performed simultaneously since they were independent of each other. The third subsection discusses the LR algorithm, while the fourth explains the ANN process.

3.1. Data Collection and Pre-Processing

All Filipino respondents (n = 502) participated in an online questionnaire voluntarily. A purposive sampling technique was implemented to determine targeted respondents effectively. More importantly, the study focused on domestic tourists because they outnumbered foreign tourists both before and after the typhoon hit the island in 2021. Specifically, domestic tourists accounted for 73.37% of the tourists who traveled to Siargao in 2019 [59] and for 94% from January to September 2022 [60]. The researchers utilized digital platforms to engage with potential participants. Specifically, they distributed the questionnaire through Facebook groups, Instagram pages, and LinkedIn networks. Moreover, the researchers explained the research context before the respondents answered the questionnaire. Afterward, the respondents gave their consent in written form.
The present study adopted the questionnaire from the study of Cahigas et al. [46], both studies being analyses of post-tourism in Siargao, Philippines. The questionnaire asked about the respondent’s background, as shown in Table 2. It also included 64 questions to be answered on a 5-point Likert scale dependent on the 8 primary features and 1 dependent variable. In this study, the 8 primary features referred to awareness of Typhoon Rai’s impact, crisis management, hedonic motivation, perceived travel constraints, perceived travel risks, attitude, subjective norms, and perceived behavioral control. These features had 59 underlying questions, also known as subfeatures. Meanwhile, the dependent variable (intention to revisit Siargao) comprised 5 questions.
Data normalization was applied to the respondents’ raw responses. This restructured the data by transforming all values to a similar scale before applying machine-learning algorithms. It ensured a uniform scale across all 59 subfeatures and the 5 questions under the one dependent variable. After the application of feature-selection techniques, optimal subsets had to undergo LR and the ANN. For LR, the feature selection’s optimal subsets were transformed to a scale of 0 to 1. The dependent variable was also converted into 0 or 1, where 0 referred to a lesser intention and 1 meant a greater intention to revisit Siargao. On the other hand, the feature selection’s optimal subsets for the ANN underwent an averaging procedure. Instead of considering subfeatures’ scores (e.g., ATI1, ATI2, and ATI3) individually, they were grouped according to their corresponding primary features (e.g., ATI).
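The pre-processing steps described above can be sketched in a few lines of Python. The sketch is illustrative only: it assumes min-max scaling of the 5-point Likert answers, a 0.5 cut-off for the binary intention label, and subfeature names whose numeric suffix identifies the item (e.g., ATI1). None of these details are specified in the study, and the respondent data are hypothetical.

```python
def min_max(value, low=1, high=5):
    """Rescale one Likert answer from [low, high] to [0, 1] (assumed scaling)."""
    return (value - low) / (high - low)


def binarize(score, threshold=0.5):
    """Return 1 (greater intention) or 0 (lesser intention); the cut-off is an assumption."""
    return 1 if score > threshold else 0


def average_by_feature(row):
    """Average subfeature scores (ATI1, ATI2, ...) into their primary feature (ATI)."""
    groups = {}
    for name, value in row.items():
        prefix = name.rstrip("0123456789")  # "ATI3" -> "ATI"
        groups.setdefault(prefix, []).append(value)
    return {feature: sum(vals) / len(vals) for feature, vals in groups.items()}


# Hypothetical respondent: raw 5-point answers for a few subfeatures.
raw = {"ATI1": 5, "ATI2": 4, "ATI3": 3, "HM1": 5, "HM2": 5}
normalized = {name: min_max(value) for name, value in raw.items()}
features = average_by_feature(normalized)  # ANN input: one score per primary feature

# Hypothetical answers to the five intention-to-revisit questions.
intention_raw = [4, 5, 4, 3, 5]
intention_score = sum(min_max(v) for v in intention_raw) / len(intention_raw)
label = binarize(intention_score)          # LR target: 0 or 1
print(features, label)
```

In this reading, LR consumes the normalized subfeature scores together with the binary label, while the ANN consumes the averaged primary-feature scores.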

3.2. Feature-Selection Techniques

The first feature-selection technique is known as permutation importance. It does not rely on the model; hence, it uses random feature combinations while considering training and testing size [32]. In the present study, 59 subfeatures derived from 8 features were combined randomly. Their accuracy was evaluated based on the dependent variable (intention to revisit). Each combination’s permutation importance scores were calculated as follows:
$$\text{Permutation Importance}_j = s - \frac{1}{K}\sum_{k=1}^{K} s_{k,j} \tag{1}$$
where $s$ is the accuracy of unused subfeatures, $s_{k,j}$ is the accuracy of randomly selected subfeatures, $k$ is the data repetition using the randomization method, and $j$ is the subfeature’s column.
Permutation importance prioritized less-important combinations for accuracy comparison purposes; thus, these combinations incurred lesser accuracy and higher error [28]. However, in permutation importance, an increase in model error would result in a higher importance score because of its strong association with the dependent variable [30]. The study compared these model errors and permutation importance scores 5 times. A total of K = 5 permutation repetitions were considered based on the study of Ramirez et al. [28]. In the permutation importance method, more than 5 replicates resulted in overfitting and fewer than 5 replicates led to premature subfeature selection. Hence, K = 5 permutation replicates was deemed the optimal number. After completing 5 repetitions, the researchers chose the combination with the lowest number of subfeatures, as this was the primary aim of the permutation importance filter-selection technique. Therefore, permutation importance predicted the relationships between 59 subfeatures and the intention to revisit Siargao after Super Typhoon Rai hit the island. All the aforementioned procedures were processed through Jupyter Notebook’s SVC and feature_importance_permutation packages.
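Equation (1) can be illustrated with a short, self-contained sketch. It reads s as the accuracy on the unshuffled data, which is the usual definition of the baseline score. The study itself relied on SVC with feature_importance_permutation, so the threshold classifier, the toy data, and the function names below are hypothetical stand-ins.

```python
import random


def accuracy(predict, X, y):
    """Share of rows whose predicted class matches the known class."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, j, K=5, seed=0):
    """Equation (1): baseline accuracy minus the mean accuracy over K shuffles of column j."""
    rng = random.Random(seed)
    s = accuracy(predict, X, y)  # baseline score on the unshuffled data
    shuffled_scores = []
    for _ in range(K):
        column = [row[j] for row in X]
        rng.shuffle(column)      # break the link between column j and y
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
        shuffled_scores.append(accuracy(predict, X_perm, y))
    return s - sum(shuffled_scores) / K


# Hypothetical data: two normalized subfeatures and a binary intention label.
X = [[0.9, 0.2], [0.8, 0.7], [0.1, 0.3], [0.2, 0.9], [0.7, 0.5], [0.3, 0.1]]
y = [1, 1, 0, 0, 1, 0]


def predict(row):
    """Stand-in classifier (not the study's SVC): uses only the first subfeature."""
    return 1 if row[0] > 0.5 else 0


for j in range(2):
    print(j, permutation_importance(predict, X, y, j))
```

Because the stand-in classifier ignores the second subfeature, shuffling that column cannot change the accuracy, so its importance score is zero and it would be dropped from the subset.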
Second, RFE is a wrapper method that applies a backward or removal technique [36]. Unlike permutation importance, which randomly combines subfeatures, the RFE technique does not apply randomization in the elimination and selection process. Instead, all 59 subfeatures were directly inputted into the built-in Jupyter Notebook’s RFE package. Its package comprised $n$ predictors and $k$-fold cross-validation. In this model, one subfeature with the lowest predictor value for each cross-validation was eliminated. All subfeatures were trained iteratively until the RFE model identified the highest RFE accuracy. These steps were repeated across all training and testing sizes. Specifically, the researchers assessed the following training–testing sizes: 50:50, 60:40, 70:30, 80:20, and 90:10. Although training and testing sizes varied, the same set of 59 subfeatures were analyzed. The RFE model stopped its iteration for each size once the subfeature combination met the highest RFE accuracy.
However, these produced subsets from each training and testing size might have a similar number of optimal features. Thus, they were ranked from the lowest to the highest number of features, where the lowest number was considered a better solution since the study aimed to reduce the number of features. If there were multiple training and testing sizes with similar least-optimal subset features, their RFE model accuracies had to be compared. Ultimately, the optimal combination of subfeatures with the smallest subset and highest RFE accuracy was selected as the best RFE solution.
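The ranking rule described above (fewest subfeatures first, highest accuracy as the tie-breaker) can be written as a single sort. The numbers below are made up purely for illustration and are not the study's results; the dictionary keys are likewise hypothetical.

```python
# Hypothetical RFE outcomes per training-testing split (not the study's results).
candidates = [
    {"split": "50:50", "n_subfeatures": 12, "accuracy": 0.71},
    {"split": "60:40", "n_subfeatures": 12, "accuracy": 0.74},
    {"split": "70:30", "n_subfeatures": 15, "accuracy": 0.78},
    {"split": "80:20", "n_subfeatures": 12, "accuracy": 0.72},
    {"split": "90:10", "n_subfeatures": 14, "accuracy": 0.76},
]

# Rank by fewest subfeatures first; break ties with the highest accuracy.
ranked = sorted(candidates, key=lambda c: (c["n_subfeatures"], -c["accuracy"]))
best = ranked[0]
print(best["split"], best["n_subfeatures"], best["accuracy"])
```

Note that the split with the highest raw accuracy (70:30 in this toy example) is not selected, because subset size takes priority over accuracy under this rule.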
The third feature-selection technique is the embedded method’s LASSO. LASSO applies a probability distribution across all 59 subfeatures dependent on the intention to revisit. It penalizes the subfeatures’ coefficients, as described by the following equation.
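$$\min \sum_{i=1}^{n}\left(y_i - \sum_{j=0}^{p} w_j x_{ij}\right)^2 + \lambda \sum_{j=0}^{p} \left|w_j\right| \tag{2}$$
where $\sum_{i=1}^{n}\left(y_i - \sum_{j=0}^{p} w_j x_{ij}\right)^2$ is the residual sum of squares, while $\lambda \sum_{j=0}^{p} \left|w_j\right|$ represents LASSO’s penalty.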
LASSO automatically reduces multicollinearity by eliminating overfitted features because of its integrated regularization characteristics [37]. Following Equation (2), the subfeatures’ LASSO values would produce coefficients ranging from −1.00 to 1.00. Zero coefficients were removed from the current subset because they were deemed overfitting or unimportant [38]. The remaining non-zero coefficients, both positive and negative, were retained and considered important predictors of tourists’ intentions to revisit Siargao after Super Typhoon Rai. At the end of this method, the retained coefficients generated LASSO’s alpha and accuracy. An alpha closer to 1.0 meant a stronger penalty was imposed, thus removing a large number of features [61]. Meanwhile, the ideal value of accuracy must be closer to 100%. These statistical formulas were interpreted through Jupyter Notebook’s RidgeCV, LassoCV, Ridge, and Lasso packages.
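To make the zero-coefficient mechanism concrete, the sketch below applies the soft-thresholding rule that an absolute-value (L1) penalty produces in the simplest textbook setting. It assumes orthonormal predictors and a penalty of λ times the sum of absolute coefficients added to the residual sum of squares. Under those assumptions, each least-squares coefficient is shrunk toward zero by λ/2 and set exactly to zero when its magnitude does not exceed λ/2. The coefficient values and the λ used here are hypothetical, and the study's own coefficients came from the cross-validated LASSO routine rather than from this shortcut.

```python
def soft_threshold(b, t):
    """Shrink b toward zero by t; return exactly 0.0 when |b| <= t."""
    if b > t:
        return b - t
    if b < -t:
        return b + t
    return 0.0


# Hypothetical least-squares coefficients for four subfeatures.
ols_coefficients = {"HM5": 0.40, "ATI1": -0.25, "PTR3": 0.03, "CM2": -0.01}
lam = 0.1  # hypothetical penalty strength (lambda)

lasso_coefficients = {
    name: soft_threshold(b, lam / 2) for name, b in ols_coefficients.items()
}

# Non-zero coefficients, positive or negative, are retained as important.
selected = [name for name, w in lasso_coefficients.items() if w != 0.0]
print(selected)
```

A larger penalty widens the band of coefficients that collapse to zero, which is why a stronger alpha leaves fewer subfeatures in the subset.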

3.3. Logistic Regression

Logistic regression (LR) applies a binary response by categorizing dependent variables as 0 or 1 [41]. Based on the pre-processed responses, the dataset categorized as 0 included participants who had less intention to revisit Siargao after Super Typhoon Rai, which triggered a lower accuracy value. Meanwhile, participants classified as 1 showed interest in revisiting Siargao after Super Typhoon Rai, which increased LR predictive accuracy. The relationships between one dependent variable (intention to revisit) and multiple independent variables (optimal subsets from each feature-selection method) were compared. It was also identified that LR’s results were interpreted through posterior probabilities of the number of dependent variable classes [37]. The LR’s standard form is displayed in Equation (3):
$$p = \frac{1}{1 + e^{-z}} \tag{3}$$
where p is the probability of intention to revisit occurrence, e represents the Euler number with a constant value of 2.71828, and z is the linear combination of feature selection’s optimal subsets. Equation (4) elaborates on the calculation of the z value:
$$z = \ln\left(\frac{p}{1 - p}\right) = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n \tag{4}$$
where $b_0$ is the constant intercept value, $b_1, \ldots, b_n$ are the regression coefficients, $x_1, \ldots, x_n$ are the subfeatures of each optimal subset, and $n$ is the number of subfeatures in the optimal subset from each feature-selection technique.
These formulas were established using the maximum-likelihood approach [11]. This approach allows positive coefficients to incur a high probability of predictive success. Otherwise, negative coefficients are attributed to low probability. Thus, maximum likelihood ensures consistency regardless of the number of repetitions. Changes in independent variables would only affect the dependent variable’s final prediction [12]. Considering the current study, 3 optimal feature subsets (independent variables) with different optimal numbers and subfeature combinations were fed into the model based on the feature selection’s results. These inputs might result in either a strong probability or a weak probability of intention to revisit Siargao after Super Typhoon Rai. Therefore, tourists’ intentions were more likely to increase if probability values were higher. Low probability meant that tourists were unlikely to visit Siargao after Super Typhoon Rai. These LR methods were formulated using Jupyter Notebook.
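Equations (3) and (4) can be checked numerically with a minimal sketch. The intercept, coefficients, and subfeature scores below are hypothetical, and the 0.5 classification cut-off is an assumption rather than a detail reported by the study.

```python
import math


def linear_predictor(b0, coefficients, x):
    """Equation (4): z = b0 + b1*x1 + ... + bn*xn."""
    return b0 + sum(b * xi for b, xi in zip(coefficients, x))


def probability(z):
    """Equation (3): p = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def log_odds(p):
    """Left-hand side of Equation (4): ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))


# Hypothetical intercept, coefficients, and normalized subfeature scores.
b0, coefficients = -1.0, [2.0, -0.5, 1.5]
x = [0.75, 0.50, 1.00]

z = linear_predictor(b0, coefficients, x)
p = probability(z)
predicted_class = 1 if p >= 0.5 else 0  # 0.5 cut-off is an assumption
print(round(z, 4), round(p, 4), predicted_class)
```

Applying log_odds to the output of probability returns the original z, which confirms that the two equations are inverses of each other.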

3.4. Artificial Neural Network

An artificial neural network (ANN) is capable of processing human-behavior-related data by applying pseudocode iteratively [43]. Hence, the ANN parameters were identified beforehand. First, the ANN’s input layers were identified based on the feature selection’s optimal subsets. Specifically, 7 input layers or features (ATI, CM, HM, PTCs, PTRs, SNs, and PBC) were determined for permutation importance. Meanwhile, 8 input layers or features (ATI, CM, HM, PTCs, PTRs, ATT, SNs, and PBC) were discovered through RFE and LASSO. These input layers contained varying subfeatures dependent on the feature-selection results. Second, 10, 20, and 30 nodes for the hidden layer were tested [9,43,62]. These nodes can be represented using the following equation:
$$\sum_{i=1}^{n} w_i x_i \tag{5}$$
where $x_i$ is the input node and $w_i$ is the assigned weight dependent on the corresponding $n$ input layers. These weights undergo the identified activation functions and were calculated as:
$$X = \frac{1}{1 + e^{-x}} \tag{6}$$
Afterward, the researchers investigated tanh, swish, and relu for the hidden layer’s activation functions [43,62,63]. Tanh is reflected in Equation (7), swish is illustrated in Equation (8), and relu is demonstrated based on Equation (9).
$$\tanh(x) = \frac{2}{1 + e^{-2x}} - 1 \tag{7}$$
$$\mathrm{swish}(x) = x \cdot \mathrm{sigmoid}(x) = \frac{x}{1 + e^{-x}} \tag{8}$$
$$\mathrm{relu}(x) = \max(0, x) \tag{9}$$
Next, the output nodes were compared to known values, $y_i$, to calculate the sum of squared errors between predicted and known values through the following equation adopted from Ong et al. [64]:
$$C = \sum_{i=1}^{n} \frac{1}{2}\left(\hat{y}_i - y_i\right)^2 \tag{10}$$
Subsequently, this study analyzed softmax and sigmoid as the output layer’s activation functions [62,63]. The representations of softmax and sigmoid are displayed in Equations (11) and (12), respectively.
$$\sigma(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}} \tag{11}$$
$$\mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}} \tag{12}$$
where Equation (11)’s $j = 1, 2, 3, \ldots, K$ and sigmoid functions under softmax range from 0 to 1. Additionally, the sigmoid function allows the calculation of nonlinear and constricted ranges [43].
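The node computation and the activation functions in Equations (5) to (9), (11), and (12) can be expressed directly in Python. This is a sketch using only the standard library; the weights and inputs are hypothetical, and the study's ANN was built with its own software rather than with these helper functions.

```python
import math


def weighted_sum(weights, inputs):
    """Equation (5): sum of w_i * x_i over the n inputs."""
    return sum(w * x for w, x in zip(weights, inputs))


def sigmoid(x):
    """Equations (6) and (12)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Equation (7)."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0


def swish(x):
    """Equation (8): x * sigmoid(x)."""
    return x * sigmoid(x)


def relu(x):
    """Equation (9)."""
    return max(0.0, x)


def softmax(z):
    """Equation (11); subtracting max(z) avoids overflow without changing the result."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical weights and inputs for one hidden node.
net = weighted_sum([0.4, -0.2, 0.1], [1.0, 0.5, 0.75])
print(tanh(net), swish(net), relu(net), sigmoid(net), softmax([net, -net]))
```

For two output classes, softmax reduces to a sigmoid of the difference between the two inputs, which is one reason the two output activations can behave similarly on a binary target.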
Moreover, Adam and RMSProp were considered as the ANN’s optimizers [9,65]. Adam is presented in Equation (13), and RMSProp is described in Equation (14).
w t = w t 1 α v t + g t 2
$$E[g^2]_t = \beta E[g^2]_{t-1} + (1 - \beta)\left(\frac{\delta C}{\delta w}\right)^2 \tag{14}$$
where Equation (13)’s $w$ is the corresponding weight, $v$ is the exponential average of the squared gradients, $\alpha$ is the initial learning rate, and $g$ is the gradient [66]. Meanwhile, Equation (14) applies the moving average of the squared gradients of each weight’s functions [64].
Each parameter combination ran 10 times with 150 epochs [66], resulting in a total of 2520 runs for the PI-ANN combination and 2880 runs for RFE-ANN and LASSO-ANN. Within the thousands of runs, epochs supported the learning algorithm of each parameter combination [64]. Thus, the consistency of accuracy values was guaranteed. In the initial run, the study considered a 60:40 training–testing split to ensure the uniformity of sizes based on feature-selection and LR results. Seidu et al. [67] reported that 60:40 is an optimal split size for a modified ANN, and since the current study integrated multiple machine-learning algorithms it was deemed especially suitable. Once all these initial runs were completed, a final run was processed by considering the parameter combination of the highest average training accuracy across all features for the combinations of PI-ANN, RFE-ANN, and LASSO-ANN. In the final run, 60:40, 70:30, 80:20, and 90:10 training–testing sizes were evaluated to mitigate underfitting and overfitting issues [63]. Lastly, the ANN algorithm was processed using Spyder’s Python software.
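The reported run totals are consistent with a full grid over the tested parameters, repeated 10 times for each input feature. The sketch below reproduces that arithmetic; the assumption that every combination was run separately per feature is an inference from the totals, not a statement made by the study.

```python
from itertools import product

hidden_nodes = [10, 20, 30]
hidden_activations = ["tanh", "swish", "relu"]
output_activations = ["softmax", "sigmoid"]
optimizers = ["Adam", "RMSProp"]
runs_per_combination = 10

combinations = list(product(hidden_nodes, hidden_activations,
                            output_activations, optimizers))


def total_runs(n_features):
    """Runs needed if every combination is repeated for each input feature (assumed)."""
    return len(combinations) * runs_per_combination * n_features


print(len(combinations))  # 36 parameter combinations
print(total_runs(7))      # 2520 runs (PI-ANN, 7 features)
print(total_runs(8))      # 2880 runs (RFE-ANN and LASSO-ANN, 8 features)
```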

4. Results

4.1. Feature-Selection Results

The illustrated y-axis values in Figure 1 are the importance scores of the corresponding x-axis features. In permutation importance, positive score values held a significant value compared to zero and negative values. Hence, zero and negative values were eliminated from the subset. A total of 14 subfeatures were extracted as part of the important subset through filter selection’s permutation importance. The importance score for each subfeature is presented in Table 3. The presented optimal feature subset had an accuracy value of 88.2743%.
The RFE approach allowed customization of the training and testing sizes. As can be seen in Table 4, it produced different optimal feature subsets. Each training–testing split was ranked based on the number of features; the smallest feature number ranked first, while the highest feature number ranked last. Since the RFE feature selection’s primary goal was to reduce the number of features, the best solution was the subset with the lowest optimal number. Two solution sets (90% training and 70% training) produced at least 28 features, which was deemed high compared to the least number of features. Thus, they were ranked third and second, respectively. Three solution sizes (80%, 60%, and 50%) produced optimal subsets of 27 features. Among 59 subfeatures, 27 was the optimal number of features using RFE.
The subsets with the lowest numbers of features were then analyzed using the RFE model’s accuracy. In this step, the highest accuracy was chosen because feature selection aimed to reduce the number of features by ensuring the highest accuracy rate. Hence, 73.6331% model accuracy from 60% training and 40% testing sizes was the RFE’s best solution parameter (Table 5). Regardless of the parameters, these 27 subfeatures had similar results. They all comprised the following subfeatures: ATI1, ATI2, ATI3, ATI6, ATI7, ATI8, CM3, CM5, CM6, HM1, HM5, HM6, HM7, PTC1, PTC2, PTC5, PTC9, PTR8, ATT1, ATT3, ATT5, ATT6, SN4, PBC2, PBC3, PBC4, and PBC6. This scenario supported that the identified 27 subfeatures were the optimal subset, although training and testing sizes differed. The coinciding results helped the consistency of the feature selection’s RFE once the results were fed into LR and the ANN.
Figure 2 demonstrates the coefficient scores on the x-axis and subfeatures on the y-axis. In the LASSO approach, important subfeatures are non-zero values [37]. Thus, the optimal feature subset had positive or negative coefficients. A total of 35 out of 59 subfeatures, approximately 59.32%, were considered important for the LASSO model (Table 6). Nevertheless, they were not ranked based on coefficient value because the LASSO subset would undergo another machine-learning algorithm. The remaining 24 features were eliminated and found unimportant since they were penalized to have zero coefficients. Moreover, LASSO’s alpha value was 0.0007, and LASSO’s accuracy was 76.5950%.

4.2. Application of Logistic Regression

The optimal subset results from three feature-selection techniques were fed into the LR model. First, the permutation importance’s subset was integrated into LR. Table 7 displays all possible training–testing splits with corresponding accuracy, precision, recall, and F1 scores. Based on the results, the researchers determined that 50:50 training and testing sizes produced the highest accuracy of 92.0318%. While 0.91 precision did not have the superior value among all comparisons, 92.0318% accuracy, 0.92 recall, and 0.91 F1 score ranked first compared to the other four splits. It could also be seen that the accuracy value of 50:50 training and testing sizes had an increase of 0.5% to 2.6% compared to the remaining sizes. This separation was deemed relatively small.
Optimal features from RFE underwent LR, and the results are shown in Table 8. The 60:40 split generated the highest accuracy (94.0299%), precision (0.94), recall (0.94), and F1 score (0.93). Although it had a similarity to the 50:50 split, decimal points revealed that 60:40 corresponding parameters produced a slightly higher value. More importantly, accuracy held a greater weight compared to precision, recall, and F1 score. Specifically, the accuracy proximity value ranged from 1.95% to 0.006% when the 60:40 model was compared to other training–testing splits. Among the three feature-selection techniques, only RFE had an optimal data size of a 60:40 training–testing split.
LASSO’s optimal subset was integrated into the study’s LR model. It can be seen in Table 9 that 50:50 training–testing sizes had the best parameters, similar to the optimal data size of permutation importance. It yielded the highest accuracy of 94.8207% and the highest precision of 0.95. Although its recall (0.95) and F1 score (0.94) values were on a par with the 60:40 split’s corresponding values, the accuracy rate was considered first, before other parameters. Specifically, 60:40 produced 94.5274% accuracy, while 50:50 produced 94.8207%, a difference of 0.2933 percentage points. This inference supported that the LR model must maintain good accuracy to ensure appropriate model fitting. Despite the presence of equivalent values of precision, recall, and F1 score, accuracy held superiority in the predictive model.
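For reference, the four scores compared in Tables 7 to 9 are all derived from the same confusion matrix, which is why two splits can share precision, recall, and F1 values after rounding while still differing in accuracy. The sketch below uses hypothetical counts for a binary intention label; whether the study reported per-class or averaged scores is not stated, so the per-class form here is an assumption.

```python
def classification_scores(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 for the positive class of a binary task."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical confusion-matrix counts (not taken from the study).
scores = classification_scores(tp=90, fp=10, fn=5, tn=95)
for name, value in scores.items():
    print(name, round(value, 4))
```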

4.3. Application of the Artificial Neural Network

Similar to LR, all feature-selection techniques (PI, RFE, and LASSO) underwent the ANN procedure. Based on the results of permutation importance, seven out of eight features were considered in the optimal subset. These seven features were fed into all ANN parameter combinations. The underlying permutation importance feature that generated the highest average testing accuracy was deemed the best parameter combination, as displayed in Table 10. Among all the presented combinations, the hedonic motivation (HM) feature had the highest average testing accuracy of 97.76%. It also had the lowest standard deviation (0.0118), which described the clustered values of testing accuracy. Hence, the combined permutation importance and the ANN’s optimal parameters entailed 20 hidden-layer nodes, tanh hidden-layer activation, softmax output-layer activation, and Adam optimization.
Next, RFE’s optimal subsets were integrated into the ANN algorithm. A total of eight features were compared since they were important subsets of RFE. HM was found to have the highest average testing accuracy of 97.06% (Table 11). Although its standard deviation was not the lowest value compared to the other features, 0.0142 was considered low. This indicated proximity among the generated testing accuracy values. Furthermore, the optimal RFE-ANN parameters were 30 hidden-layer nodes, tanh hidden-layer activation, softmax output-layer activation, and Adam optimization.
Finally, LASSO’s optimal subsets comprised eight features. The accuracy value ranged from 95.42% to 97.52%. Among the evaluated eight features, HM had the highest average testing accuracy (97.51%) when the subset underwent the ANN algorithm (Table 12). It was deemed superior by having an increase of 0.35 to 2.9 compared to the seven remaining features. Meanwhile, the average standard deviation ranged from 0.0128 to 0.0216. The HM accuracy’s average standard deviation was 0.0146, indicating consistency. Thus, the best LASSO-ANN parameters were 30 hidden-layer nodes, tanh hidden-layer activation, sigmoid output-layer activation, and Adam optimization.

5. Discussion

5.1. Feature Selection, Logistic Regression, and ANN Results

The researchers started analyzing machine-learning algorithms by comparing three feature-selection techniques. The summarized findings are displayed in Figure 3. The filter method’s permutation importance garnered the highest accuracy (88.2743%), with 14 subfeatures. It was followed by the embedded method’s LASSO (76.5950%), with 35 underlying subfeatures. Lastly, the wrapper method’s RFE (73.6331%) encompassed 27 subfeatures.
Along the same lines, Bommert et al. [27] discovered that permutation importance was the best filter-method technique. The past study generated the highest accuracy while trying different datasets. Hence, this finding supported that the post-disaster Siargao tourism context was not the sole data source that guaranteed permutation importance’s supremacy. Although Li et al. [31] did not present permutation importance’s accuracy, this method generated the least predicted Mean Absolute Error (MAE), Mean Square Error (MSE), and Root Mean Square Error (RMSE) among all compared techniques. Generally, the model error was associated with accuracy values. Accurate findings would produce a lesser error, while inaccurate findings would produce a greater error. Kim et al. [32] performed permutation importance twice, and 51.7% was the highest training accuracy. This percentage value was deemed extremely low, which indicated that permutation importance was not the best feature-selection technique for the past study. Kim et al. [32] only considered permutation importance and failed to assess other techniques. The current study provided alternatives by comparing three techniques, which eliminated feature-selection biases.
However, feature selection’s optimal subsets underwent LR and the ANN procedure. The accuracy values changed as the subsets were fed into advanced machine-learning algorithms. As displayed in Table 13, the combination LASSO-LR produced the highest accuracy (94.8207%), next was RFE-LR (94.0299%), and last was PI-LR (92.0318%). It could be determined that feature selection with lower accuracy values showed an immense improvement because the accuracy values of LASSO and RFE increased by 18.2257 and 20.3968 percentage points, respectively. In contrast, permutation importance started from 88.2743% feature-selection accuracy, and its PI-LR accuracy increased by only 3.7575 percentage points. Similarly, permutation importance produced a very small increase of 2.4% when the model was run twice [32]. Therefore, permutation importance could produce good accuracy values in the beginning, but their significance diminished as the researchers tried to tune the machine-learning parameters.
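The gains quoted above are differences between the LR accuracy and the feature-selection accuracy, expressed in percentage points. They can be reproduced from the values reported in this section; the dictionary names below are illustrative.

```python
# Accuracies (%) reported for each feature-selection technique and its LR combination.
feature_selection = {"PI": 88.2743, "RFE": 73.6331, "LASSO": 76.5950}
with_lr = {"PI": 92.0318, "RFE": 94.0299, "LASSO": 94.8207}

# Gain in percentage points after feeding each optimal subset into LR.
gain = {name: round(with_lr[name] - feature_selection[name], 4)
        for name in feature_selection}
print(gain)
```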
A study produced a flash-flood susceptibility accuracy of 86.34% with the Embedded-LR combination for China [41]. This past study used a higher training set (70%), but the generated result was lower than in the present study, with only a 50% training set that produced the highest accuracy. Based on the current results, the 70% training size for the LASSO-LR combination yielded 92% accuracy. Still, the value was higher than the findings of Li et al. [41]. Thus, the present study produced better optimal features supporting other LR parameters, such as precision, recall, and F1 score. Next, Peng et al. [7] combined RFE and LR and only produced a 0.554 F1 score. F1 score summarizes the predictive aspect of the LR model by ensuring the balance between the mean of precision and recall [34]. Unfortunately, the study of Peng et al. [7] yielded a significantly lower F1 score (0.554) compared to the present study (0.940). The past study utilized travelers’ insights from websites, which led to unsystematic data. Meanwhile, the present study utilized actual survey responses from people who were aware of Siargao’s tourism after Super Typhoon Rai. Although the PI-LR combination had the lowest accuracy value in the present study, 92.0318% was considered high and close to other accuracy values (94%). Regrettably, LR was not a good combination for any filter method, such as PI, due to the constricted number of features [27]. Consequently, the dataset was essential to ensure its fit with a machine-learning algorithm. Another study combined PI and LR and found an accuracy value range of 72.6% to 90.7% [68]. This study produced a lower accuracy range because it set a pre-determined number of features (the top five features for each health model) instead of considering all the accepted features based on PI’s standard parameters.
The ANN’s final run and the best accuracy for each integrated feature-selection technique are presented in Table 14. The similarities among all three algorithms were in terms of the training–testing size, the hidden layer’s activation function, the optimizer, and the most-important feature. They were itemized as 70:30, tanh, Adam, and hedonic motivation, respectively. Meanwhile, the ANN parameters varied for hidden-layer nodes and the output layer’s activation function. Specifically, PI-ANN had the optimal accuracy with 20 nodes and softmax output activation. Meanwhile, RFE-ANN generated the highest accuracy through 30 nodes with softmax output activation. Lastly, LASSO-ANN’s highest accuracy was achieved with 30 nodes and the sigmoid output layer.
Specifically, the 70% training set coupled with the 30% testing set was the optimal size for all algorithms (PI-ANN, RFE-ANN, and LASSO-ANN)—almost similar to the findings of Palmer et al. [6], who found that a 73.3% training set and a 26.7% testing set forecasted their tourism series data in Spain accurately. Another study noted that the 70:30 size validated the data fit to ANN parameters [28]. These identical instances implied that lower training sets would lead to premature convergence and higher training sets could instigate overfitting.
For hidden-layer nodes, PI-ANN had a different optimal parameter (20 nodes) compared to RFE-ANN and LASSO-ANN with 30 nodes. This result implied that PI-ANN could find a better model with fewer nodes, while RFE-ANN and LASSO-ANN needed a higher number of neurons in the hidden layer. Any values greater than 20 nodes for PI-ANN and more than 30 nodes for RFE-ANN and LASSO-ANN could result in overfitting. Moreover, values below the optimal number of nodes could lead to underfitting. Tsaur et al. [12] used 20 hidden nodes and produced 94.3% accuracy as they investigated eight features affecting tourists’ loyalty to hotels. Furthermore, another study reported that 30 nodes assisted in producing a 97.32% predictive model [62]. Although the past and current studies have similarities in terms of the number of hidden nodes, the utilized features are dissimilar. In addition, these past studies did not incorporate any feature-selection techniques, resulting in lesser accuracy values [12] and premature convergence [62]. Researchers had to perform a trial-and-error process to find the optimal nodes because there was a lack of standards [23]. Hence, the effectiveness of nodes was maintained by integrating 150 epochs in the current study. Likewise, 150 epochs assisted in categorizing travel-trip behavior in Russia [8].
Among tanh, swish, and relu, the best hidden-layer activation function was tanh across all types of algorithms. Supporting the current study, Yuduang et al. [9] and German et al. [63] utilized the same hidden-layer activation function that yielded the best result. In a similar context, tanh and relu were utilized to predict earthquake frequency and magnitude [65]. The past study failed to separate the evaluation between tanh and relu. Nevertheless, these parameters were considered the optimal activation functions. However, the current study argued that relu lost to tanh because relu only produced an average range from 81.99% to 93.23% compared to the corresponding tanh accuracy within 97% values. In another study, relu was chosen as the hidden-layer activation and only produced 79.7% travel-trip accuracy [8]. This value was significantly smaller than the current study’s findings. In addition, Ong et al. [62] concluded that swish was the best hidden-layer activation. The current study contended that swish produced a lesser average accuracy than tanh. Swish’s average accuracy rates ranged from 94.28% to 96.97%.
Furthermore, the softmax and sigmoid output-layer activation functions were assessed. This study concluded that softmax was the best function for PI-ANN and RFE-ANN, while sigmoid better fit LASSO-ANN. Studies focusing on natural-disaster impacts and disaster-response activities found that softmax was the best output-layer activation function [18,65]. On the other hand, researchers who investigated tourism demand revealed the importance of the sigmoid function in the ANN model [6,44]. The researchers assessed tourists’ intentions to revisit Siargao after it was affected by a super typhoon, which was associated with post-natural-disaster and tourism recovery. Since the present study considered both the softmax and sigmoid functions, the results coincided with the mixed results of past studies.
Between the Adam and RMSProp optimizers, Adam dominated the optimal ANN parameter settings regardless of the algorithm type. Several researchers agreed that Adam was the best optimizer, and their studies yielded accuracy values of 89.21%, 98.56%, and 98.15%, respectively [9,43,63]. However, one study noted that RMSProp was the optimal optimizer to predict earthquakes with high magnitudes [65]. Differences in results occurred due to the context, as the past study was more focused on natural-disaster preparation.
Overall, the ANN outperformed LR, as reflected in Table 15. All combined feature-selection and ANN algorithms had greater accuracy rates than the integrated feature selection and LR. Supporting these findings, Tsaur et al. [12] found that the ANN was better than LR because the ANN consisted of multiple layers that could process non-linear functions in the tourist loyalty accuracy prediction. However, when researchers located landslide-prone areas in Iran, Nhu et al. [69] revealed that LR’s validation accuracy was better than the ANN due to LR’s flexible dataset type and distribution. Nevertheless, this study has argued that the ANN bested LR because the researchers did not only focus on data fitting, but also on predicting significant post-disaster tourism-recovery features affecting Siargao tourists.
Therefore, the best algorithm was the LASSO-ANN combination, since it had the highest testing accuracy value of 97.8146%. Through this combination, ANN curve fitting generated an inverted yield. This curve style implied a decrease in error and an increase in accuracy percentage.
Unfortunately, past studies overlooked the combination of LASSO and ANNs. For instance, Aronsson et al. [70] evaluated LASSO and ANNs individually and only incurred 80% to 82% accuracy. If the past study combined LASSO and an ANN, they could generate higher accuracy values. Thus, the LASSO-ANN algorithm’s underlying features and parameters best described the predictive model of tourists’ intention to revisit Siargao after Super Typhoon Rai. Specifically, 8 features (ATI, CM, HM, PTCs, PTRs, ATT, SNs, and PBC) which comprised 35 subfeatures (ATI1, ATI2, ATI3, ATI5, ATI6, ATI8, ATI9, CM2, CM3, CM6, HM5, HM6, HM7, PTC1, PTC2, PTC4, PTC5, PTC8, PTC9, PTR3, PTR7, PTR8, ATT1, ATT3, ATT5, ATT6, SN4, SN3, SN5, SN6, PBC2, PBC3, PBC4, and PBC6) were found to be significant. Among all feature-selection techniques, LASSO performed well when combined with either LR or the ANN. Its optimal subset generated the highest testing accuracy for both algorithms. Meanwhile, RFE was the next-best feature-selection technique integrated into LR, followed by PI. For the ANN algorithm, PI ranked second, while RFE was the weakest performer.

5.2. The Most-Significant Factors and Practical Implications

Since LASSO-ANN produced the highest accuracy among all the algorithms, its underlying subset was considered the most-important feature or factor. The following factors were arranged from the most significant to the least significant: HM, ATI, PBC, ATT, PTCs, CM, SNs, and PTRs. For practical implications, the top two factors (HM and ATI) were chosen for further practical interpretation because they held at least 97% accuracy compared to the other factors.
Hedonic motivation (HM) was the greatest factor influencing tourists’ intentions to revisit Siargao after Super Typhoon Rai. Tourists sought happiness from traveling since this emotion ignited their motivation. They were also eager to try a new lifestyle because of Siargao’s isolation from urban life, the residents’ culture, and the destination’s unique activities. Moreover, they wanted to travel to Siargao to relax physically and mentally. Similarly, Cahigas et al. [52] noticed that HM was the most-significant tourism-related factor affecting visits to destinations affected by a natural crisis. This implied that tourists were motivated despite the negative effects of natural phenomena, whether these were related to the COVID-19 pandemic or the super typhoon. Tourists’ HM was greater than the fear of experiencing the natural crisis’s aftermath, as they believed that adequate knowledge of safety protocols was more essential. Moreover, Rodriguez–Sanchez et al. [51] revealed that HM outperformed other factors in influencing hotel tourists’ support of their destination’s natural-crisis advocacy. Tourists were more motivated to visit destinations affected by natural disasters because traveling could benefit them personally and at the same time support the affected economies. These findings were consistent with the study of Lee et al. [71], where HM was described as the best exogenous variable affecting other connected variables. This factor not only supported tourism activities but also promoted pro-environmental behavior, coinciding with the present study’s context.
Therefore, the researchers suggested that businesses (e.g., travel agencies, restaurants, and accommodations) in Siargao should prioritize mild-to-moderate activities. For example, travel agencies could offer relaxing city tours and beach hopping. These activities would guarantee full exploration of Siargao Island. Restaurants could entertain their customers by organizing events (e.g., fire dancing, music festivals, and themed photoshoots). This would produce a win–win situation because events not only help businesses increase tourist traffic but also keep tourists engaged. Accommodation providers could have built-in massage, spa, and yoga areas for their guests. They could also offer bundled prices or free services to stimulate guests’ interest. Extreme activities (e.g., diving, surfing, and snorkeling) could still be offered, since some tourists might seek novelty. Nevertheless, it is advised to focus on gentle activities because these were directly associated with the tourists’ HM.
Furthermore, awareness of Typhoon Rai’s impact (ATI) was the second most-significant factor affecting Siargao tourists’ intentions to revisit the island after it was devastated by Super Typhoon Rai. Tourists were willing to face hypothetical consequences and aftermaths if they were aware of the typhoon’s severe impacts on human lives, Siargao’s infrastructure, and the differences between the old and new Siargao. They must also be equipped with credible and sufficient information surrounding Siargao reconstruction projects and weather forecasts. Awareness among tourists held great value because they used it to gauge the risk and safety levels of destinations [72]. Tourists who were equipped with natural-disaster and tourism knowledge felt more confident in visiting a tourist spot such as Siargao. Likewise, the study of Ong et al. [64] disclosed that understanding the effects of natural disasters was an essential factor because it helped individuals analyze their psychological emotions and social needs. Thus, awareness could help tourists to decide reasonably whether to revisit Siargao. As long as the typhoon impacts were manageable and tourists were aware of safety precautions, tourists would most likely visit Siargao Island. Cahigas et al. [73] also revealed that the most-important factor in the quadrant analysis was understanding the effects of natural disasters on organizations, regular citizens, and the government. The presented studies implied that tourism recovery after natural disasters was possible by receiving the utmost support from tourists.
Given the ATI’s significance, the researchers recommended that the government work alongside media outlets, airline companies, maritime companies, Siargao businesspeople, and residents. Although Siargao has a couple of local media outlets, tourists were unaware of them because they were not residents. The government must bridge the communication gap between nationwide and local media outlets by creating a comprehensive government website. This official website should focus on Siargao’s weather updates, reconstruction projects for all municipalities, the timelines of reconstruction projects, and the severity of hypothetical aftermaths for each area. The website must be updated regularly and in real time using actual updates from Siargao locals. This would eliminate unreliable data and ensure that tourists could access consistent and adequate information using one platform. Since tourists can only reach Siargao Island by airplane or ship, the researchers recommend partnering with the transportation companies. The companies could offer Siargao updates, safety levels, and precautions through infographics shown via LCDs or with voice-overs by crew members. Tourists’ travel times would be meaningful, as they could prepare themselves ahead of their arrival in Siargao. Overall, the synchronized relationship between the government and these stakeholders would make a wide array of credible data accessible.

5.3. Academic Contributions

The integration of various feature-selection techniques into LR and ANNs has not yet been explored by other researchers. As of this writing, this is the first study to analyze the combination of the filter method’s permutation importance, the wrapper method’s RFE, and the embedded method’s LASSO alongside multiple-parameter combinations under LR and the ANN. Hence, the findings could be used as a benchmark for human-behavior practitioners, natural-disaster investigators, and machine-learning academicians.
First, the researchers discovered that LASSO was the best feature-selection technique, as it performed well in both LR and the ANN. Moreover, a 0.0007 alpha value could be identified as a standard parameter to tune the LASSO model’s penalty. Higher or lower alpha values could unbalance the penalization and lead to suboptimal subset extraction. An example of a substandard result is the use of a 0.1 alpha value by Waldmann et al. [61]. They concluded that 0.1 was an optimal LASSO alpha without trying other values or considering other studies’ findings, which resulted in an uneven subset. Therefore, it was necessary to establish a standard LASSO alpha to generate the most-important features.
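The effect of the alpha value on subset extraction can be sketched as follows. This is an illustrative example only, assuming scikit-learn, standardized synthetic data, and a hypothetical grid of alpha values placed around 0.0007; it does not reproduce the study’s tuning procedure. In scikit-learn, `Lasso` minimizes (1/(2n))‖y − Xw‖² + alpha‖w‖₁, so a larger alpha pushes more coefficients to exactly zero.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; NOT the study's survey responses.
X, y = make_classification(n_samples=500, n_features=30, n_informative=8,
                           random_state=0)
X = StandardScaler().fit_transform(X)

# Hypothetical grid of penalties around the 0.0007 value discussed above.
alphas = [0.00007, 0.0007, 0.007, 0.07, 0.7]

kept = {}
for alpha in alphas:
    lasso = Lasso(alpha=alpha, max_iter=10000).fit(X, y)
    # Features with a non-zero coefficient form the selected subset.
    kept[alpha] = int(np.count_nonzero(lasso.coef_))
    print(f"alpha={alpha:<8} features kept: {kept[alpha]}")
```

A very small alpha keeps nearly every feature, while a very large one discards them all; the number retained at any given alpha depends on the dataset, which is why the value is normally tuned rather than fixed in advance.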
Second, the ANN outperformed LR in all machine-learning combinations. It was previously discussed that the ANN model applied a more calibrated, non-linear approach to features [12]. LR was restricted to certain dataset distributions and did not contain hidden layers, unlike the ANN. Through the ANN, the model was able to increase the accuracy rate further because it also investigated the underlying relationships among all features. Moreover, the ANN could process various types of data, such as Likert-scale, semantic-scale, and ranking data. The present study utilized supervised learning and a five-point Likert scale, which was fed into the ANN model successfully.
Lastly, LASSO-ANN was the recommended combination for identifying factors influencing Siargao’s tourism recovery after Super Typhoon Rai. It was suggested to use a 70:30 training–testing size, 30 hidden-layer nodes, tanh hidden-layer activation, sigmoid output-layer activation, Adam optimization, and HM as the most-important feature. The findings supported that appropriate parameters of machine-learning algorithms could predict human behavior and analyze natural disasters’ impacts comprehensively. It was not suggested to perform the ANN procedure directly. For instance, the ANN was outperformed by more than half of the evaluated algorithms when used to forecast tourist arrivals and overnight stays [13]. Thus, the researchers introduced the importance of employing feature-selection techniques before the application of the ANN to generate an effective model.
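The recommended configuration can be expressed as a short pipeline sketch. It assumes scikit-learn and synthetic data with 59 columns (mirroring the number of subfeatures in Table S2), so the printed accuracies are unrelated to the 97.8146% reported here; the helper name `make_ann` and the added standardization step are illustrative assumptions, not part of the study’s documented procedure.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; NOT the study's survey responses.
X, y = make_classification(n_samples=500, n_features=59, n_informative=10,
                           random_state=0)

# 70:30 training-testing split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0, stratify=y)


def make_ann():
    # 30 hidden-layer nodes, tanh hidden activation, Adam optimizer.
    # For a binary target, MLPClassifier applies a logistic (sigmoid)
    # activation at the output layer.
    return MLPClassifier(hidden_layer_sizes=(30,), activation="tanh",
                         solver="adam", max_iter=2000, random_state=0)


# LASSO-based feature selection (alpha = 0.0007) followed by the ANN.
lasso_ann = make_pipeline(
    StandardScaler(),
    SelectFromModel(Lasso(alpha=0.0007, max_iter=10000)),
    make_ann(),
)
# The ANN applied directly to all features, for comparison.
ann_only = make_pipeline(StandardScaler(), make_ann())

lasso_ann.fit(X_train, y_train)
ann_only.fit(X_train, y_train)

n_selected = int(lasso_ann.named_steps["selectfrommodel"].get_support().sum())
print(f"LASSO kept {n_selected} of {X.shape[1]} features")
print(f"LASSO-ANN test accuracy: {lasso_ann.score(X_test, y_test):.4f}")
print(f"ANN-only test accuracy:  {ann_only.score(X_test, y_test):.4f}")
```

Wrapping the selector and the network in one pipeline means the feature subset is derived from the training portion only, which keeps the testing accuracy an honest estimate.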

5.4. Limitations and Future Research

While the researchers utilized an innovative methodology and produced essential findings, they admit to the limitations of the study. First, the researchers only chose one feature-selection approach for each of the filter, wrapper, and embedded methods. Past studies revealed that the filter method’s permutation importance, the wrapper method’s RFE, and the embedded method’s LASSO could best assist machine-learning models in increasing accuracy values. However, datasets varied from one study to another, which opened the possibility of using a different optimal feature-selection technique. Hence, future researchers are encouraged to compare at least three approaches for each feature-selection method. Second, the demographic characteristics of respondents were not utilized in the prediction model, since the present study focused on features or factors. The demographic data could be transformed into pre-processed quantitative data before integrating them into machine-learning algorithms. Third, future studies could compare two time periods because the present research focused on tourists’ intentions after a typhoon. If other researchers could expound on tourists’ perceptions before the typhoon hit Siargao Island, more comprehensive findings could be obtained. Finally, future scholars could identify the similarities or differences between domestic and foreign tourists. They could also improve the state of research by investigating and comparing multiple crises (e.g., typhoons, the Ukraine war, and COVID-19). Despite the presence of these limitations, the researchers performed the methodology completely and achieved the intended objectives. The data were gathered properly and analyzed extensively using different types of algorithms.

6. Conclusions

Super Typhoon Rai had severe impacts on Siargao’s tourism facilities. Since the island’s primary livelihood stems from tourism, it was critical to assess the factors affecting tourists’ intentions to revisit the island. A total of 502 valid participants cooperated in this research voluntarily. Four research questions were formally presented, and the researchers revealed the corresponding findings.
First, the integrated machine-learning algorithms determined Siargao tourists’ behavior after Super Typhoon Rai by identifying the best feature-selection parameters and connecting each parameter to LR and an ANN. Moreover, LR and the ANN had underlying parameters that were explored until the highest model accuracy was achieved. Second, the best machine-learning algorithm combination was LASSO-ANN because it generated the highest accuracy rate (97.8146%). Third, the optimal parameter for the LASSO combination was the inclusion of 35 subfeatures, 8 primary features, and a 0.0007 alpha value. Meanwhile, the ANN performed at its best with a 70:30 training–testing size, 30 hidden-layer nodes, tanh hidden-layer activation, sigmoid as the output layer’s activation, Adam as an optimizer, and hedonic motivation (HM) as the most-important feature. Fourth, all eight features were considered significant. These features were arranged from the most important to the least important as follows: hedonic motivation (HM), awareness of Typhoon Rai’s impact (ATI), perceived behavioral control (PBC), attitude (ATT), perceived travel constraints (PTCs), crisis management (CM), subjective norms (SNs), and perceived travel risks (PTRs). The top-tier feature that produced a high prediction rate was HM, followed by ATI. Both HM and ATI sustained at least 97% accuracy rates in the initial and final runs compared to the other features.
Following the presented findings, the researchers expounded on the study’s managerial applications to improve Siargao’s economy. It was recommended to offer mild-to-moderate activities to increase tourists’ HM. Instead of extreme activities, tourism businesses should focus on relaxing itineraries, enjoyable events, solemn social interaction, and wellness-center services. Additionally, the provision of a new government website was suggested to guarantee the presence of credible and consistent information because these aspects enhance tourists’ ATI. This approach could alleviate miscommunication issues between tourists and locals, since the new website would provide Siargao tourism data (e.g., reconstruction updates, activities, and safety levels) systematically. To support this recommendation, the researchers also encouraged partnerships with media outlets, airline companies, and maritime companies to distribute information digitally and physically. Furthermore, the researchers contributed novel machine-learning algorithm results. As of this writing, none of the past studies explored the importance of LASSO-ANN in a post-disaster tourism-recovery context. Future scholars could utilize the findings to expand the study’s research questions and methods.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/su15118463/s1, Table S1: Related machine-learning algorithm studies; Table S2: Description of fifty-nine subfeatures.

Author Contributions

Conceptualization, M.M.L.C., A.K.S.O. and Y.T.P.; methodology, M.M.L.C., A.K.S.O. and Y.T.P.; software, M.M.L.C. and A.K.S.O.; validation, M.M.L.C., A.K.S.O. and Y.T.P.; formal analysis, M.M.L.C. and A.K.S.O.; investigation, M.M.L.C.; resources, M.M.L.C., A.K.S.O. and Y.T.P.; data curation, M.M.L.C. and Y.T.P.; writing—original draft preparation, M.M.L.C., A.K.S.O. and Y.T.P.; writing—review and editing, M.M.L.C.; visualization, M.M.L.C.; supervision, M.M.L.C., A.K.S.O. and Y.T.P.; project administration, M.M.L.C., A.K.S.O. and Y.T.P.; funding acquisition, A.K.S.O. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Mapúa University Directed Research for Innovation and Value Enhancement (DRIVE).

Institutional Review Board Statement

This study was approved by the Mapúa University Research Ethics Committees (FM-RC-23-01-06).

Informed Consent Statement

Informed consent was obtained from all subjects involved in this study (FM-RC-23-02-06).

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The researchers would like to express their appreciation of the participants’ responses to the questionnaire.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Rotton, T. Super Typhoon Odette (RAI). Available online: https://disasterphilanthropy.org/disasters/super-typhoon-odette-rai/ (accessed on 29 December 2022).
  2. Jorgio, J.; Sharma, A. Death Toll from Super Typhoon Rai Climbs to at Least 208 in the Philippines. Available online: https://edition.cnn.com/2021/12/19/asia/super-typhoon-rai-philippines-75-dead-intl/index.html (accessed on 29 December 2022).
  3. Department of Social Welfare and Development. DSWD Dromic Report #19 on Typhoon “Odette” as of 24 December 2021, 6AM—Philippines. Available online: https://reliefweb.int/report/philippines/dswd-dromic-report-19-typhoon-odette-24-december-2021-6am (accessed on 29 December 2022).
  4. ABS-CBN News. Watch: Typhoon Odette Leaves Siargao Devastated. Available online: https://news.abs-cbn.com/news/12/17/21/watch-typhoon-odette-leaves-siargao-devastated (accessed on 29 December 2022).
  5. Cahigas, M.M.L.; Zulvia, F.E.; Ong, A.K.S.; Prasetyo, Y.T. A Comprehensive Analysis of Clustering Public Utility Bus Passenger’s Behavior during the COVID-19 Pandemic: Utilization of Machine Learning with Metaheuristic Algorithm. Sustainability 2023, 15, 7410.
  6. Palmer, A.; José Montaño, J.; Sesé, A. Designing an Artificial Neural Network for Forecasting Tourism Time Series. Tour. Manag. 2006, 27, 781–790.
  7. Peng, X.; Shuai, Y.; Gan, Y.; Chen, Y. Hybrid Feature Selection Model Based on Machine Learning and Knowledge Graph. J. Phys. Conf. Ser. 2021, 2079, 012028.
  8. Mikhailov, S.; Kashevnik, A. Tourist Behaviour Analysis Based on Digital Pattern of Life—An Approach and Case Study. Future Internet 2020, 12, 165.
  9. Yuduang, N.; Ong, A.K.; Vista, N.B.; Prasetyo, Y.T.; Nadlifatin, R.; Persada, S.F.; Gumasing, M.J.; German, J.D.; Robas, K.P.; Chuenyindee, T.; et al. Utilizing Structural Equation Modeling–Artificial Neural Network Hybrid Approach in Determining Factors Affecting Perceived Usability of Mobile Mental Health Application in the Philippines. Int. J. Environ. Res. Public Health 2022, 19, 6732.
  10. Tien Bui, D.; Le, K.-T.; Nguyen, V.; Le, H.; Revhaug, I. Tropical Forest Fire Susceptibility Mapping at the Cat Ba National Park Area, Hai Phong City, Vietnam, Using GIS-Based Kernel Logistic Regression. Remote Sens. 2016, 8, 347.
  11. Chen, S.-C.; Chang, C.-C.; Chan, H.-C.; Huang, L.-M.; Lin, L.-L. Modeling Typhoon Event-Induced Landslides Using GIS-Based Logistic Regression: A Case Study of Alishan Forestry Railway, Taiwan. Math. Probl. Eng. 2013, 2013, 728304.
  12. Tsaur, S.-H.; Chiu, Y.-C.; Huang, C.-H. Determinants of Guest Loyalty to International Tourist Hotels—A Neural Network Approach. Tour. Manag. 2002, 23, 397–405.
  13. Claveria, O.; Torra, S. Forecasting Tourism Demand to Catalonia: Neural Networks vs. Time Series Models. Econ. Model. 2014, 36, 220–228.
  14. Law, R.; Li, G.; Fong, D.K.; Han, X. Tourism Demand Forecasting: A Deep Learning Approach. Ann. Tour. Res. 2019, 75, 410–423.
  15. Fotiadis, A.; Polyzos, S.; Huan, T.-C.T.C. The Good, the Bad and the Ugly on COVID-19 Tourism Recovery. Ann. Tour. Res. 2021, 87, 103117.
  16. Talwar, S.; Srivastava, S.; Sakashita, M.; Islam, N.; Dhir, A. Personality and Travel Intentions during and after the COVID-19 Pandemic: An Artificial Neural Network (ANN) Approach. J. Bus. Res. 2022, 142, 400–411.
  17. Xu, J.; Bai, D.; He, H.; Luo, J.; Lu, G. Disaster Precursor Identification and Early Warning of the Lishanyuan Landslide Based on Association Rule Mining. Appl. Sci. 2022, 12, 12836.
  18. Chaudhuri, N.; Bose, I. Exploring the Role of Deep Neural Networks for Post-Disaster Decision Support. Decis. Support Syst. 2020, 130, 113234.
  19. Rodriguez-Galiano, V.F.; Luque-Espinar, J.A.; Chica-Olmo, M.; Mendes, M.P. Feature Selection Approaches for Predictive Modelling of Groundwater Nitrate Pollution: An Evaluation of Filters, Embedded and Wrapper Methods. Sci. Total Environ. 2018, 624, 661–672.
  20. Caraka, R.E.; Noh, M.; Lee, Y.; Toharudin, T.; Yusra; Tyasti, A.E.; Royanow, A.F.; Dewata, D.P.; Gio, P.U.; Basyuni, M.; et al. The Impact of Social Media Influencers Raffi Ahmad and Nagita Slavina on Tourism Visit Intentions across Millennials and Zoomers Using a Hierarchical Likelihood Structural Equation Model. Sustainability 2022, 14, 524.
  21. Yuan, H.; Xu, H.; Qian, Y.; Li, Y. Make Your Travel Smarter: Summarizing Urban Tourism Information from Massive Blog Data. Int. J. Inf. Manag. 2016, 36, 1306–1319.
  22. Sheykhmousa, M.; Kerle, N.; Kuffer, M.; Ghaffarian, S. Post-Disaster Recovery Assessment with Machine Learning-Derived Land Cover and Land Use Information. Remote Sens. 2019, 11, 1174.
  23. Tien Bui, D.; Hoang, N.-D.; Martínez-Álvarez, F.; Ngo, P.-T.T.; Hoa, P.V.; Pham, T.D.; Samui, P.; Costache, R. A Novel Deep Learning Neural Network Approach for Predicting Flash Flood Susceptibility: A Case Study at a High Frequency Tropical Storm Area. Sci. Total Environ. 2020, 701, 134413.
  24. Liu, Y.; Lyu, C.; Liu, Z.; Cao, J. Exploring a Large-Scale Multi-Modal Transportation Recommendation System. Transp. Res. Part C Emerg. Technol. 2021, 126, 103070.
  25. Matharaarachchi, S.; Domaratzki, M.; Muthukumarana, S. Assessing Feature Selection Method Performance with Class Imbalance Data. Mach. Learn. Appl. 2021, 6, 100170.
  26. Thabtah, F.; Kamalov, F.; Hammoud, S.; Shahamiri, S.R. Least Loss: A Simplified Filter Method for Feature Selection. Inf. Sci. 2020, 534, 1–15.
  27. Bommert, A.; Sun, X.; Bischl, B.; Rahnenführer, J.; Lang, M. Benchmark for Filter Methods for Feature Selection in High-Dimensional Classification Data. Comput. Stat. Data Anal. 2020, 143, 106839.
  28. Ramirez, S.G.; Hales, R.C.; Williams, G.P.; Jones, N.L. Extending SC-Pdsi-PM with Neural Network Regression Using GLDAS Data and Permutation Feature Importance. Environ. Model. Softw. 2022, 157, 105475.
  29. Granados-López, D.; Suárez-García, A.; Díez-Mediavilla, M.; Alonso-Tristán, C. Feature Selection for CIE Standard Sky Classification. Sol. Energy 2021, 218, 95–107.
  30. Muñoz, L.; Hausner, V.H.; Runge, C.; Brown, G.; Daigle, R. Using Crowdsourced Spatial Data from Flickr vs. PPGIS for Understanding Nature’s Contribution to People in Southern Norway. People Nat. 2020, 2, 437–449.
  31. Li, M.; Hou, H.; Yu, J.; Geng, H.; Zhu, L.; Huang, Y.; Li, X. Prediction of Power Outage Quantity of Distribution Network Users under Typhoon Disaster Based on Random Forest and Important Variables. Math. Probl. Eng. 2021, 2021, 6682242.
  32. Kim, H.G.; Lee, D.K.; Park, C.; Kil, S.; Son, Y.; Park, J.H. Evaluating Landslide Hazards Using RCP 4.5 and 8.5 Scenarios. Environ. Earth Sci. 2014, 73, 1385–1400.
  33. Niu, T.; Wang, J.; Lu, H.; Yang, W.; Du, P. Developing a Deep Learning Framework with Two-Stage Feature Selection for Multivariate Financial Time Series Forecasting. Expert Syst. Appl. 2020, 148, 113237.
  34. Theerthagiri, P. Predictive Analysis of Cardiovascular Disease Using Gradient Boosting Based Learning and Recursive Feature Elimination Technique. Intell. Syst. Appl. 2022, 16, 200121.
  35. Kołakowska, A.; Godlewska, M. Analysis of Factors Influencing the Prices of Tourist Offers. Appl. Sci. 2022, 12, 12938.
  36. Xiao, Y.; Xue, J.; Zhang, X.; Wang, N.; Hong, Y.; Jiang, Y.; Zhou, Y.; Teng, H.; Hu, B.; Lugato, E.; et al. Improving Pedotransfer Functions for Predicting Soil Mineral Associated Organic Carbon by Ensemble Machine Learning. Geoderma 2022, 428, 116208.
  37. Kamkar, I.; Gupta, S.K.; Phung, D.; Venkatesh, S. Stable Feature Selection for Clinical Prediction: Exploiting ICD Tree Structure Using Tree-Lasso. J. Biomed. Inform. 2015, 53, 277–290.
  38. Chang, J.-R.; Chen, M.-Y.; Chen, L.-S.; Tseng, S.-C. Why Customers Don’t Revisit in Tourism and Hospitality Industry? IEEE Access 2019, 7, 146588–146606.
  39. Jones, J.N.; Bennett, G.L.; Abancó, C.; Matera, M.M.; Tan, F.J. Multi-Event Assessment of Typhoon-Triggered Landslide Susceptibility in the Philippines. Nat. Hazards Earth Syst. Sci. 2022, 23, 1095–1115.
  40. Kon, S.C.; Turner, L.W. Neural network forecasting of tourism demand. Tour. Econ. 2005, 11, 301–328.
  41. Li, J.; Zhang, H.; Zhao, J.; Guo, X.; Rihan, W.; Deng, G. Embedded Feature Selection and Machine Learning Methods for Flash Flood Susceptibility-Mapping in the Mainstream Songhua River Basin, China. Remote Sens. 2022, 14, 5523.
  42. Guzzetti, F.; Gariano, S.L.; Peruccacci, S.; Brunetti, M.T.; Marchesini, I.; Rossi, M.; Melillo, M. Geographical landslide early warning systems. Earth-Sci. Rev. 2020, 200, 102973.
  43. Ong, A.K.; Prasetyo, Y.T.; Yuduang, N.; Nadlifatin, R.; Persada, S.F.; Robas, K.P.; Chuenyindee, T.; Buaphiban, T. Utilization of random forest classifier and artificial neural network for predicting factors influencing the perceived usability of COVID-19 contact tracing “Morchana” in Thailand. Int. J. Environ. Res. Public Health 2022, 19, 7979.
  44. Law, R.; Au, N. A Neural Network Model to Forecast Japanese Demand for Travel to Hong Kong. Tour. Manag. 1999, 20, 89–97.
  45. Leong, L.-Y.; Hew, T.-S.; Lee, V.-H.; Ooi, K.-B. An sem–artificial-neural-network analysis of the relationships between servperf, customer satisfaction and loyalty among low-cost and full-service airline. Expert Syst. Appl. 2015, 42, 6620–6634.
  46. Cahigas, M.M.; Prasetyo, Y.T.; Persada, S.F.; Nadlifatin, R. Examining Filipinos’ Intention to Revisit Siargao after Super Typhoon Rai 2021 (Odette): An Extension of the Theory of Planned Behavior Approach. Int. J. Disaster Risk Reduct. 2023, 84, 103455.
  47. Kitazawa, K.; Hale, S.A. Social Media and Early Warning Systems for Natural disasters: A case study of typhoon etau in Japan. Int. J. Disaster Risk Reduct. 2021, 52, 101926.
  48. Yu, J.; Liu, J.; Choi, Y. Review and prospects of strategies and measures for typhoon-related disaster risk reduction under public emergencies in TC Region. Trop. Cyclone Res. Rev. 2021, 10, 116–123.
  49. Tatebe, C.; Miyamoto, T. Possible roles of People’s Organization for post-disaster community recovery: A case study on recovery process after Philippine Typhoon Yolanda. Prog. Disaster Sci. 2021, 11, 100184.
  50. Esteban, M.; Takagi, H.; Mikami, T.; Aprilia, A.; Fujii, D.; Kurobe, S.; Utama, N.A. Awareness of coastal floods in impoverished subsiding coastal communities in Jakarta: Tsunamis, typhoon storm surges and dyke-induced tsunamis. Int. J. Disaster Risk Reduct. 2017, 23, 70–79.
  51. Rodriguez–Sanchez, C.; Sancho-Esper, F.; Casado-Díaz, A.B.; Sellers-Rubio, R. Understanding in-room water conservation behavior: The role of personal normative motives and hedonic motives in a mass tourism destination. J. Destin. Mark. Manag. 2020, 18, 100496.
  52. Cahigas, M.M.; Prasetyo, Y.T.; Alexander, J.; Sutapa, P.L.; Wiratama, S.; Arvin, V.; Nadlifatin, R.; Persada, S.F. Factors affecting visiting behavior to Bali during the COVID-19 pandemic: An extended theory of planned behavior approach. Sustainability 2022, 14, 10424.
  53. Lee, W.; Jeong, C. Distinctive roles of tourist eudaimonic and hedonic experiences on satisfaction and place attachment: Combined use of SEM and necessary condition analysis. J. Hosp. Tour. Manag. 2021, 47, 58–71.
  54. Huang, L.; Yin, X.; Yang, Y.; Luo, M.; Huang, S.S. “Blessing in disguise”: The impact of the wenchuan earthquake on inbound tourist arrivals in Sichuan, China. J. Hosp. Tour. Manag. 2020, 42, 58–66.
  55. Lan, T.; Yang, Y.; Shao, Y.; Luo, M.; Zhong, F. The synergistic effect of natural disaster frequency and severity on inbound tourist flows from the annual perspective. Tour. Manag. Perspect. 2021, 39, 100832.
  56. Wang, J.; Ritchie, B.W. Understanding Accommodation Managers’ Crisis Planning Intention: An application of the theory of planned behaviour. Tour. Manag. 2012, 33, 1057–1067.
  57. Zou, Y.; Yu, Q. Sense of safety toward tourism destinations: A Social Constructivist perspective. J. Destin. Mark. Manag. 2022, 24, 100708.
  58. Su, L.; Chen, H.; Huang, Y. The influence of tourists’ monetary and temporal sunk costs on Destination Trust and Visit Intention. Tour. Manag. Perspect. 2022, 42, 100968.
  59. Lopez, A. Major Tourism Activity Opens in Siargao Town after “Odette”. Available online: https://www.pna.gov.ph/articles/1178577 (accessed on 13 May 2023).
  60. Avila, J. PHILIPPINES: A Year after Disaster, Siargao Folk Put the Pieces of Life Back Together. Available online: https://www.reportingasean.net/philippines-a-year-after-disaster-siargao-folk-put-the-pieces-of-life-back-together/ (accessed on 13 May 2023).
  61. Waldmann, P.; Mészáros, G.; Gredler, B.; Fuerst, C.; Sölkner, J. Evaluation of the lasso and the elastic net in genome-wide association studies. Front. Genet. 2013, 4, 270.
  62. Ong, A.K.; Chuenyindee, T.; Prasetyo, Y.T.; Nadlifatin, R.; Persada, S.F.; Gumasing, M.J.; German, J.D.; Robas, K.P.; Young, M.N.; Sittiwatethanasiri, T. Utilization of random forest and deep learning neural network for predicting factors affecting perceived usability of a COVID-19 contact tracing mobile application in Thailand “Thaichana”. Int. J. Environ. Res. Public Health 2022, 19, 6111.
  63. German, J.D.; Ong, A.K.; Perwira Redi, A.A.; Robas, K.P. Predicting factors affecting the intention to use a 3PL during the COVID-19 pandemic: A machine learning ensemble approach. Heliyon 2022, 8, e11382.
  64. Ong, A.K.; Zulvia, F.E.; Prasetyo, Y.T. “The big one” earthquake preparedness assessment among younger Filipinos using a random forest classifier and an artificial neural network. Sustainability 2022, 15, 679.
  65. Yousefzadeh, M.; Hosseini, S.A.; Farnaghi, M. Spatiotemporally explicit earthquake prediction using Deep Neural Network. Soil Dyn. Earthq. Eng. 2021, 144, 106663.
  66. Ojha, V.; Nicosia, G. Backpropagation Neural tree. Neural Netw. 2022, 149, 66–83.
  67. Seidu, J.; Ewusi, A.; Kuma, J.S.; Ziggah, Y.Y.; Voigt, H.-J. Impact of data partitioning in groundwater level prediction using artificial neural network for multiple Wells. Int. J. River Basin Manag. 2022, 20, 1–12.
  68. Suzuki, S.; Yamashita, T.; Sakama, T.; Arita, T.; Yagi, N.; Otsuka, T.; Semba, H.; Kano, H.; Matsuno, S.; Kato, Y.; et al. Comparison of risk models for mortality and cardiovascular events between machine learning and conventional logistic regression analysis. PLoS ONE 2019, 14, e0221911.
  69. Nhu, V.-H.; Shirzadi, A.; Shahabi, H.; Singh, S.K.; Al-Ansari, N.; Clague, J.J.; Jaafari, A.; Chen, W.; Miraki, S.; Dou, J.; et al. Shallow landslide susceptibility mapping: A comparison between logistic model tree, logistic regression, naïve Bayes tree, artificial neural network, and support vector machine algorithms. Int. J. Environ. Res. Public Health 2020, 17, 2749.
  70. Aronsson, L.; Andersson, R.; Ansari, D. Artificial neural networks versus lasso regression for the prediction of long-term survival after surgery for invasive IPMN of the pancreas. PLoS ONE 2021, 16, e0249206.
  71. Lee, Y.-K.; Lee, C.-K.; Lee, W.; Ahmad, M.S. Do hedonic and utilitarian values increase pro-environmental behavior and support for festivals? Asia Pac. J. Tour. Res. 2021, 26, 921–934.
  72. Shafqat, W.; Byun, Y.-C. A context-aware location recommendation system for tourists using hierarchical LSTM model. Sustainability 2020, 12, 4107.
  73. Cahigas, M.M.; Prasetyo, Y.T.; Persada, S.F.; Nadlifatin, R. Filipinos’ intention to participate in 2022 Leyte landslide response volunteer opportunities: The role of understanding the 2022 Leyte landslide, social capital, altruistic concern, and theory of planned behavior. Int. J. Disaster Risk Reduct. 2023, 84, 103485.
Figure 1. Permutation importance graphical results.
Figure 2. LASSO results overview.
Figure 3. Summarized feature-selection results.
Table 1. Consolidated characteristics of features.
Feature | Definition of the Feature | Significance of the Feature
Awareness of Typhoon Rai’s impact | Tourist’s level of understanding after Super Typhoon Rai hit the Philippines | It advises individuals about calamity warnings and post-disaster updates
Crisis management | The government and non-government programs that assist victims of Super Typhoon Rai | A distinguished typhoon-related feature that aids timely actions
Hedonic motivation | Influence of tourists’ positive emotions regarding revisiting Siargao | It triggers emotional stimulators, which affect the tourists’ behavior
Perceived travel constraints | Existing physical and emotional limitations that would hinder tourists visiting Siargao | Constraints comprise human physiological needs essential for tourist survival
Perceived travel risks | Uncertain barriers that tourists may or may not experience | Risks cover all important factors affecting tourists’ behaviors
Attitude | Tourists’ personal opinions | The resulting behavior brought by an individual’s perception
Subjective norms | Influence of other people’s insights | They affect a tourist’s tendency to seek guidance and companionship
Perceived behavioral control | Tourists’ hypothetical competence or incapability regarding visiting Siargao after Super Typhoon Rai hit Siargao | It acts as the decision authority over tourists’ travel plans
Table 2. Respondents’ demographic characteristics.
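Characteristic | Category | N | Percentage (%)
Gender | Female | 415 | 82.7
Gender | Male | 87 | 17.3
Age | ≤17 years old | 48 | 9.6
Age | 18–24 years old | 258 | 51.4
Age | 25–34 years old | 148 | 29.5
Age | 35–44 years old | 34 | 6.8
Age | 45–54 years old | 8 | 1.6
Age | ≥55 years old | 6 | 1.2
Marital Status | Single | 420 | 83.66
Marital Status | Married | 78 | 15.54
Marital Status | Separated | 3 | 0.6
Marital Status | Widowed | 1 | 0.2
Employment Status | Student | 222 | 44.2
Employment Status | Full-time employee | 113 | 22.5
Employment Status | Part-time employee | 18 | 3.6
Employment Status | Self-employed | 28 | 5.6
Employment Status | Unemployed | 121 | 24.1
Highest Educational Attainment | High-school student | 66 | 13.2
Highest Educational Attainment | High-school graduate | 95 | 18.9
Highest Educational Attainment | College student | 155 | 30.9
Highest Educational Attainment | Associate’s degree | 44 | 8.8
Highest Educational Attainment | Bachelor’s degree | 129 | 25.7
Highest Educational Attainment | Master’s degree | 12 | 2.4
Highest Educational Attainment | Ph.D. | 1 | 0.2
Travel Budget | ≤USD 55 | 134 | 26.7
Travel Budget | USD 55.01 to 110.50 | 73 | 14.5
Travel Budget | USD 110.51 to 165.75 | 102 | 20.3
Travel Budget | USD 165.76 to 221 | 91 | 18.1
Travel Budget | ≥USD 221.01 | 102 | 20.3
Revisit Frequency | Once every year | 312 | 62.2
Revisit Frequency | Twice every year | 107 | 21.3
Revisit Frequency | Three times every year | 28 | 5.6
Revisit Frequency | At least four times every year | 55 | 11.0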
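As a quick consistency check on Table 2, the percentages can be recomputed from the reported counts. The sketch below assumes that each characteristic's total equals the sum of its category counts; the variable and function names are illustrative and do not come from the study.

```python
# Counts copied from Table 2 (gender and revisit-frequency rows).
gender = {"Female": 415, "Male": 87}
revisit = {
    "Once every year": 312,
    "Twice every year": 107,
    "Three times every year": 28,
    "At least four times every year": 55,
}

def percentages(counts):
    """Return each category's share of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}

gender_pct = percentages(gender)
revisit_pct = percentages(revisit)

print(sum(gender.values()), gender_pct)
print(sum(revisit.values()), revisit_pct)
```

Both groups sum to the same number of respondents, and the recomputed shares agree with the one-decimal percentages listed in the table.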
Table 3. Optimal feature subset using permutation importance.
Number | Feature | Score | Number | Feature | Score
1 | PBC3 | 0.01958 | 8 | HM5 | 0.00998
2 | PTC7 | 0.00998 | 9 | CM7 | 0.00998
3 | CM3 | 0.00998 | 10 | CM5 | 0.00998
4 | SN6 | 0.00998 | 11 | PBC6 | 0.00998
5 | SN5 | 0.00998 | 12 | CM1 | 0.00998
6 | PTR3 | 0.00998 | 13 | ATI7 | 0.00998
7 | PTC9 | 0.00998 | 14 | ATI6 | 0.00998
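The scores in Table 3 are permutation-importance values: how much a model's accuracy falls when one feature's values are shuffled. The following is a minimal sketch of that idea using only the Python standard library. The toy data, the threshold rule standing in for a fitted model, and all names are hypothetical; this section does not state which software or settings the authors used.

```python
import random

def accuracy(predict, X, y):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean accuracy drop per feature after shuffling that feature's column."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    scores = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        scores.append(sum(drops) / n_repeats)
    return scores

# Hypothetical survey-style data: two items scored 1-5; only item 0 drives the label.
data_rng = random.Random(1)
X = [[data_rng.randint(1, 5), data_rng.randint(1, 5)] for _ in range(200)]
y = [1 if row[0] >= 3 else 0 for row in X]

def predict(row):
    """Stand-in for a fitted classifier (assumption: it uses item 0 only)."""
    return 1 if row[0] >= 3 else 0

scores = permutation_importance(predict, X, y)
print(scores)
```

A feature the model ignores receives a score of zero, while shuffling a feature the model relies on produces a clear accuracy drop, which is the basis for ranking features as in Table 3.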
Table 4. RFE’s optimal subset with varying training and testing sizes.
Training Size | Testing Size | Optimal Number | Optimal Feature Subset | Optimal Number Rank
90% | 10% | 35 | ATI1, ATI2, ATI3, ATI6, ATI7, ATI8, ATI9, CM2, CM3, CM5, CM6, HM1, HM4, HM5, HM6, HM7, PTC1, PTC2, PTC4, PTC5, PTC8, PTC9, PTR8, ATT1, ATT2, ATT3, ATT5, ATT6, SN4, SN5, PBC1, PBC2, PBC3, PBC4, PBC6 | 3
80% | 20% | 27 | ATI1, ATI2, ATI3, ATI6, ATI7, ATI8, CM3, CM5, CM6, HM1, HM5, HM6, HM7, PTC1, PTC2, PTC5, PTC9, PTR8, ATT1, ATT3, ATT5, ATT6, SN4, PBC2, PBC3, PBC4, PBC6 | 1
70% | 30% | 28 | ATI1, ATI2, ATI3, ATI6, ATI7, ATI8, CM3, CM5, CM6, HM1, HM4, HM5, HM6, HM7, PTC1, PTC2, PTC5, PTC9, PTR8, ATT1, ATT3, ATT5, ATT6, SN4, PBC2, PBC3, PBC4, PBC6 | 2
60% | 40% | 27 | ATI1, ATI2, ATI3, ATI6, ATI7, ATI8, CM3, CM5, CM6, HM1, HM5, HM6, HM7, PTC1, PTC2, PTC5, PTC9, PTR8, ATT1, ATT3, ATT5, ATT6, SN4, PBC2, PBC3, PBC4, PBC6 | 1
50% | 50% | 27 | ATI1, ATI2, ATI3, ATI6, ATI7, ATI8, CM3, CM5, CM6, HM1, HM5, HM6, HM7, PTC1, PTC2, PTC5, PTC9, PTR8, ATT1, ATT3, ATT5, ATT6, SN4, PBC2, PBC3, PBC4, PBC6 | 1
Table 5. RFE’s top three parameters in finding the optimal subset.
Training Size | Testing Size | Optimal Feature Subset | Model Accuracy (%)
80% | 20% | ATI1, ATI2, ATI3, ATI6, ATI7, ATI8, CM3, CM5, CM6, HM1, HM5, HM6, HM7, PTC1, PTC2, PTC5, PTC9, PTR8, ATT1, ATT3, ATT5, ATT6, SN4, PBC2, PBC3, PBC4, PBC6 | 71.7741
60% | 40% | ATI1, ATI2, ATI3, ATI6, ATI7, ATI8, CM3, CM5, CM6, HM1, HM5, HM6, HM7, PTC1, PTC2, PTC5, PTC9, PTR8, ATT1, ATT3, ATT5, ATT6, SN4, PBC2, PBC3, PBC4, PBC6 | 73.6331
50% | 50% | ATI1, ATI2, ATI3, ATI6, ATI7, ATI8, CM3, CM5, CM6, HM1, HM5, HM6, HM7, PTC1, PTC2, PTC5, PTC9, PTR8, ATT1, ATT3, ATT5, ATT6, SN4, PBC2, PBC3, PBC4, PBC6 | 72.6232
Table 6. Optimal feature subset using LASSO.
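Number | Feature | Number | Feature | Number | Feature | Number | Feature
1 | PBC2 | 11 | PTC9 | 21 | SN6 | 31 | PTC4
2 | PBC3 | 12 | HM7 | 22 | CM3 | 32 | PTC8
3 | ATT6 | 13 | ATI1 | 23 | ATI3 | 33 | PTR8
4 | PBC6 | 14 | PTC1 | 24 | ATI2 | 34 | PTC5
5 | PBC4 | 15 | PBC5 | 25 | ATI6 | 35 | HM6
6 | SN4 | 16 | ATI9 | 26 | ATT1 | |
7 | HM5 | 17 | SN3 | 27 | PTR3 | |
8 | ATI8 | 18 | SN5 | 28 | PTR7 | |
9 | ATT3 | 19 | CM2 | 29 | ATI5 | |
10 | ATT5 | 20 | CM6 | 30 | PTC2 | |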
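LASSO selects features by shrinking some coefficients exactly to zero, and the features with non-zero coefficients form the retained subset, as in Table 6. The sketch below illustrates the mechanism with a small coordinate-descent solver written in plain Python. It assumes the common objective (1/(2n))·‖y − Xw‖² + alpha·‖w‖₁ with no intercept; the toy data and all names are hypothetical, and the alpha of 0.0007 is borrowed from the abstract purely for illustration. The authors' actual implementation is not described in this section.

```python
import random

def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_iter=50):
    """Minimize (1/(2n))*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate updates."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for row, target in zip(X, y):
                pred_without_j = sum(w[k] * row[k] for k in range(p) if k != j)
                rho += row[j] * (target - pred_without_j)
                z += row[j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n)
    return w

# Hypothetical data: the target depends on feature 0 only; feature 1 is irrelevant.
rng = random.Random(0)
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
y = [2.0 * row[0] for row in X]

w = lasso_coordinate_descent(X, y, alpha=0.0007)
selected = [j for j, coef in enumerate(w) if coef != 0.0]
print(w, selected)
```

The irrelevant feature's coefficient is driven to exactly zero and drops out of the selected set, while the informative feature keeps a coefficient slightly below its true value because of the L1 penalty.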
Table 7. Application of logistic regression after permutation importance.
Training–Testing Split | Accuracy (%) | Precision | Recall | F1 Score
90:10 | 90.1961 | 0.91 | 0.90 | 0.87
80:20 | 91.0891 | 0.92 | 0.91 | 0.88
70:30 | 89.4040 | 0.88 | 0.89 | 0.87
60:40 | 91.5422 | 0.91 | 0.92 | 0.90
50:50 | 92.0318 | 0.91 | 0.92 | 0.91
Table 8. Application of logistic regression after RFE.
Training–Testing Split | Accuracy (%) | Precision | Recall | F1 Score
90:10 | 92.1569 | 0.93 | 0.92 | 0.90
80:20 | 92.0792 | 0.93 | 0.92 | 0.90
70:30 | 92.0530 | 0.92 | 0.92 | 0.91
60:40 | 94.0299 | 0.94 | 0.94 | 0.93
50:50 | 94.0239 | 0.94 | 0.94 | 0.93
Table 9. Application of logistic regression after LASSO.
Training–Testing Split | Accuracy (%) | Precision | Recall | F1 Score
90:10 | 90.1961 | 0.91 | 0.90 | 0.87
80:20 | 92.0792 | 0.93 | 0.92 | 0.90
70:30 | 92.7152 | 0.93 | 0.93 | 0.92
60:40 | 94.5274 | 0.94 | 0.95 | 0.94
50:50 | 94.8207 | 0.95 | 0.95 | 0.94
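The four metric columns in Tables 7 to 9 follow the standard classification definitions. As a reminder of how they relate, the sketch below derives them from a binary confusion matrix. The counts are hypothetical and are not taken from the study. Note also that this section does not state how the reported precision, recall, and F1 values were averaged across classes, so a reported F1 need not equal the harmonic mean of the reported precision and recall.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 for the positive class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical confusion-matrix counts for a 200-case test set.
acc, prec, rec, f1 = binary_metrics(tp=90, fp=5, fn=10, tn=95)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")
```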
Table 10. The best parameters for the combined permutation importance and ANN.
Feature | Nodes (Hidden Layer) | Activation (Hidden Layer) | Activation (Output Layer) | Optimizer | Average Testing Accuracy | Average Standard Deviation
ATI | 30 | Swish | Sigmoid | Adam | 96.97% | 0.0185
CM | 30 | Swish | Sigmoid | Adam | 96.17% | 0.0211
HM | 20 | Tanh | Softmax | Adam | 97.76% | 0.0118
PTCs | 30 | Swish | Sigmoid | Adam | 95.77% | 0.0222
PTRs | 30 | Swish | Sigmoid | RMSProp | 96.97% | 0.0160
SNs | 30 | Swish | Softmax | Adam | 96.52% | 0.0186
PBC | 30 | Swish | Sigmoid | Adam | 96.62% | 0.0167
Table 11. The best parameters for the combined RFE and ANN.
Feature | Nodes (Hidden Layer) | Activation (Hidden Layer) | Activation (Output Layer) | Optimizer | Average Testing Accuracy | Average Standard Deviation
ATI | 30 | Swish | Softmax | Adam | 96.42% | 0.0136
CM | 30 | Swish | Sigmoid | Adam | 96.27% | 0.0171
HM | 30 | Tanh | Softmax | Adam | 97.06% | 0.0142
PTCs | 30 | Tanh | Sigmoid | Adam | 96.32% | 0.0168
PTRs | 30 | Swish | Sigmoid | Adam | 96.22% | 0.0131
ATT | 30 | Swish | Sigmoid | Adam | 96.02% | 0.0133
SNs | 30 | Tanh | Sigmoid | RMSProp | 95.37% | 0.0161
PBC | 30 | Tanh | Softmax | RMSProp | 96.92% | 0.0138
Table 12. The best parameters for the combined LASSO and ANN.
Feature | Nodes (Hidden Layer) | Activation (Hidden Layer) | Activation (Output Layer) | Optimizer | Average Testing Accuracy | Average Standard Deviation
ATI | 30 | Tanh | Softmax | Adam | 97.16% | 0.0193
CM | 30 | Swish | Sigmoid | Adam | 96.22% | 0.0183
HM | 30 | Tanh | Sigmoid | Adam | 97.51% | 0.0146
PTCs | 30 | Swish | Softmax | RMSProp | 96.57% | 0.0194
PTRs | 30 | Swish | Sigmoid | Adam | 95.42% | 0.0216
ATT | 30 | Swish | Softmax | RMSProp | 96.62% | 0.0128
SNs | 20 | Swish | Sigmoid | Adam | 96.07% | 0.0184
PBC | 30 | Swish | Softmax | Adam | 96.87% | 0.0133
Table 13. The best accuracy percentages for each feature selection combined with LR.
Algorithm | Training–Testing Split | Accuracy | Precision | Recall | F1 Score
PI-LR | 50:50 | 92.0318% | 0.91 | 0.92 | 0.91
RFE-LR | 60:40 | 94.0299% | 0.94 | 0.94 | 0.93
LASSO-LR | 50:50 | 94.8207% | 0.95 | 0.95 | 0.94
Table 14. Optimal parameters for each feature selection combined with ANN.
Algorithm | Training–Testing Split | Nodes (Hidden Layer) | Activation (Hidden Layer) | Activation (Output Layer) | Optimizer | Most-Important Feature | Average Testing Accuracy | Average Standard Deviation
PI-ANN | 70:30 | 20 | Tanh | Softmax | Adam | HM | 97.7483% | 0.009469
RFE-ANN | 70:30 | 30 | Tanh | Softmax | Adam | HM | 97.6821% | 0.019308
LASSO-ANN | 70:30 | 30 | Tanh | Sigmoid | Adam | HM | 97.8146% | 0.011278
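To make the LASSO-ANN row of Table 14 concrete, the sketch below runs a single forward pass through a network with the reported shape: 30 hidden nodes with tanh activation and one sigmoid output. The input width of 35 is an assumption based on the number of LASSO-selected subfeatures in Table 6. The weights are random and untrained, the respondent vector is hypothetical, and training with the Adam optimizer is deliberately left out, so the output is not a meaningful prediction.

```python
import math
import random

N_INPUTS = 35   # assumed: one input per LASSO-selected subfeature (Table 6)
N_HIDDEN = 30   # hidden-layer nodes reported for LASSO-ANN

rng = random.Random(0)
# Hypothetical, untrained weights; a real model would learn these during training.
W1 = [[rng.uniform(-0.1, 0.1) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
b1 = [0.0] * N_HIDDEN
W2 = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)]
b2 = 0.0

def forward(x):
    """One forward pass: tanh hidden layer followed by a sigmoid output unit."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    z = sum(w * h for w, h in zip(W2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical respondent: 35 survey answers rescaled to the range [0, 1].
x = [rng.random() for _ in range(N_INPUTS)]
p_revisit = forward(x)
print(round(p_revisit, 4))
```

The sigmoid output lies strictly between 0 and 1 and can be read as a score for the positive class once the weights have been trained.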
Table 15. Comparison of machine-learning algorithms’ accuracy values.
Algorithm | Accuracy | Algorithm | Accuracy
PI-LR | 92.0318% | PI-ANN | 97.7483%
RFE-LR | 94.0299% | RFE-ANN | 97.6821%
LASSO-LR | 94.8207% | LASSO-ANN | 97.8146%
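The comparison in Table 15 can be summarized as the accuracy gain of the ANN over LR for each feature-selection method. The short sketch below computes those differences from the table values; the dictionary names are illustrative only.

```python
# Accuracy values (%) copied from Table 15.
lr = {"PI": 92.0318, "RFE": 94.0299, "LASSO": 94.8207}
ann = {"PI": 97.7483, "RFE": 97.6821, "LASSO": 97.8146}

# Percentage-point gain of ANN over LR for each feature-selection method.
gains = {method: round(ann[method] - lr[method], 4) for method in lr}

# Feature-selection method with the highest ANN accuracy.
best = max(ann, key=ann.get)

print(gains)
print(best)
```

The ANN outperforms LR under every feature-selection method, with the largest gap for permutation importance and the highest overall accuracy for LASSO-ANN.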
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
