The Impact of Hotel Customer Experience on Customer Satisfaction through Online Reviews

With the growing popularity of the internet, customers can easily share their experiences and information in online reviews. Consumers recognize online reviews as a useful source of information prior to consumption, and many online reviews influence consumer purchasing decisions. Understanding the customer experience in online reviews is thus necessary to maintain customer satisfaction and repurchase intention for the sustainable development of the hotel business. This study assessed the fundamental selection attributes of customers from online reviews reflecting the hotel customer experience, and investigated their association with customer satisfaction. A total of 8229 reviews were collected from Google travel websites from December 2019 to July 2021. Text mining and semantic network analysis were adopted for big data analysis. Factor and regression analyses were then used for quantitative analysis. Based on linear regression analysis, the Service and Dining factors significantly affected customer satisfaction. Service is a critical selection attribute for customers, and the provision of more particular services is necessary, especially after COVID-19. These results indicate that understanding online reviews can provide theoretical and practical implications for developing sustainable strategies for the hotel industry.


Introduction
When tourists plan to travel and book accommodations, they often search for hotels they have never been to before. Acquisition of hotel information can reduce the risk of choosing poorly. Traditionally, word of mouth (WOM) from friends has been used to reduce risks, and with the popularity of the internet, electronic word of mouth (eWOM) is becoming important [1,2], as consumers begin to trust in WOM transmitted over the internet [3,4]. Commonly, consumers want to express or share their opinions and seek information via digital means [5,6]. Filieri and McLeay [7] found that 96.4% of their respondents (7000) had used the internet as a source of information in the travel planning stage, and 90% referred to other customers' reviews.
Because the characteristics of the service field are not like those for tangible products, it is difficult to determine the actual situation of a hotel until the customer has experienced it [8]. Here, eWOM provides rich data that reflect consumer characteristics and perceptions of service satisfaction. The experience of the service mentioned in eWOM thus implies the main attributes and quality levels of the product or service considered by the customer [9], and eWOM can significantly affect tourists' purchase decisions and the business performance of hotels [10].
Many researchers have conducted surveys to investigate customer satisfaction and hotel selection attributes in the hotel and hospitality industry [11-13]. Although the survey method has the advantage of being able to elicit answers to the desired questions, it may be limited by factors that cause measurement errors, such as the form of the questionnaire, the survey term, the response category, and the order of the survey [14]. To address this gap, online review mining has been conducted to gain more accurate and detailed customer information through the application of big data analytics in the hospitality sector [15-19]. Filieri and Mariani [16] adopted big data analytics to examine how reviewers from different countries evaluate the helpfulness of online reviews. Ban and Kim [19] analyzed user reviews to understand various airline passenger satisfaction attributes using text mining analysis.
However, there are still only limited studies on the online reviews of hotel customers that use big data analytics, explore the factors reflecting customer experiences based on these review data, and then verify the relationship between the extracted factors and customer satisfaction. Therefore, this study sought to derive meaningful implications by analyzing online reviews of hotels and exploring which attributes affected customer satisfaction.

Electronic Word of Mouth
E-commerce is strategically important for both businesses and consumers. Because consumers can now obtain information about products or services online before making a purchase decision, traditional face-to-face WOM has been reconceptualized as eWOM [20,21]. When purchasing intangible products or services that they have not used or experienced before, consumers tend to rely more on eWOM [22], as it helps them obtain specific information shaped by other customers' selection attributes, together with shared personal experiences, opinions, photos, hotel reviews, and vacation suggestions [23].
Online review websites such as Tripadvisor serve as an essential platform for consumers to share purchasing experiences and express their opinions about products and services [24]. Reviews exchanged among consumers contain information about the user's experience and how that experience is perceived [25], such that customers are more likely to trust eWOM than information provided by sellers [22]. By reading other people's reviews, potential consumers better construct their interpretation of the product and become more aware of the risks to their transactions. Vermeulen and Seegers [26] found that consumers' exposure to online reviews improved the probability of booking a hotel. Stringam, Gerdes, and Vanleeuwen [27] analyzed reviews on Expedia.com, an online travel agency platform, and found that the overall satisfaction evaluation was highly correlated with recommendation intentions. Ban and Kim [19] analyzed the online reviews on Skytrax.com and quantitatively identified the relationship among six evaluation factors (seat comfort, staff, F&B, entertainment, ground service, and value for money), customer satisfaction, and recommendations.
This study thus investigates customer experience through text-mining analysis of reviews of hotel products, which are representative experience goods greatly affected by online reviews.

Customer Experience and Satisfaction
Customer experience, which means that an individual has experienced the goods or services of a company as a consumer, was first conceptualized by Schmitt [28] as being composed of sensory, emotional, cognitive, behavioral, and relational experiences. Meyer and Schwager [29] defined customer experience as the customers' personal/subjective response to direct/indirect contact with a firm in any form. Customer experience is remembered positively or negatively depending on the customer's situation, which leads to customer satisfaction, and customers who positively evaluate the experience can be said to feel satisfied [29].
Customer satisfaction is a complex experience in the hospitality industry, and assessing what the customer has experienced is complicated [30]. The focus on customer experience and satisfaction has inevitably increased as the market has changed from a producer-driven market to a buyer-driven one [20]. Customer-satisfaction management is the only strategy that can respond to such market changes. Corporate marketing activities have made it a fundamental goal to focus on customer satisfaction, which, through customer experience, can increase customer loyalty, repurchase intention, and positive WOM, and consequently contribute to higher profitability [31].
Hotel selection attributes are the features that customers prefer and consider when choosing a hotel. They play a critical role in the information search and alternative selection process and are thus a basis for evaluating hotel satisfaction and dissatisfaction [32,33]. Identifying the customer's hotel selection attributes is essential for improving service quality and increasing customer satisfaction to gain a competitive advantage [34].

Big Data Analysis
Big data analysis of online-generated data to predict consumer behavior and psychology is an emerging topic in the hospitality industry [19]. It is necessary to extract and convert data from online reviews before analysis, and text mining and semantic network analysis play an important role in big data analysis due to the large and unstructured nature of consumer-generated data. This study employed text analytics to deconstruct numerous customer reviews collected from Google.com, which was followed by semantic network analysis to examine the associations among the extracted keywords [19,22].
Text mining for knowledge discovery based on text databases was first mentioned by Feldman and Dagan [35], and is the discovery of valuable unknown patterns in text using information retrieval and extraction and natural language processing (NLP) techniques [36]. Text mining involves data collection, data refining, data analysis, a management information system, and knowledge generation [37], as shown in Figure 1. When collecting data, researchers identify the type of information they want, clarify concepts, limit the scope of what they want to collect, and familiarize themselves with the characteristics of the keywords. Data refining and preprocessing involves converting unstructured textual data into structured forms. For accurate results, sophisticated pretreatment is essential. In the analysis step, text-analysis techniques such as information extraction, document summarization, and clustering are applied, and analysis methods suitable for the research purpose are then used to manage information systems at worksites and to accumulate knowledge [19,38].
Text mining for the hospitality industry has recently been the subject of active research. Boo and Busser [8] qualitatively and quantitatively investigated meeting planners' online reviews of destination hotels. The results yielded insights to respond to the online reviews and formed the basis for hotel evaluation criteria. Stepchenkova and Morrison [39] analyzed Russia-related texts on 212 websites, measured Russia's tourist destination image, and identified the differences between US and Russian websites. He, Zha, and Li [40] described an in-depth case study that applied text mining to analyze text on the Facebook and Twitter sites of three pizza chains: Pizza Hut, Domino's Pizza, and Papa John's Pizza, which are representative franchises in the industry. The results confirmed the value of social media competitive analysis and the power of text mining as an effective technique for extracting business value from the vast amount of available social media data. The formation of social issues through SNS (Social Networking Service) is accelerating in various fields. With the development of SNS, network analysis is becoming essential to extract different meanings and concepts inherent in text-based messages and to understand their relational characteristics [41]. Semantic network analysis assesses the structure of a semantic network retrieved according to the given text, and it also explores meaning through the structural relationship of words as message components, rather than lexical units [42].
Semantic network analysis uses individual words to clarify network structure and meaning within a text. It is a content-analysis method built on the premise that a writer selects and repeats specific terms to emphasize a particular meaning, and it examines the relationships between words that appear together in a sentence or paragraph. The core of semantic network analysis is indicating the influence of words, and analysis based on structural identity consists of an index for classifying subgroups based on word similarity [19,30,43].

Data Collection
The data collection procedure for this study is as follows. Hotel reviews were collected from Google Travel (www.google.com/travel), a service of Google, the largest search engine globally. Google hotel reviews include detailed information about the hotel brand used by the customer, the reviewer's ID, review date, comment, rating, and type of trip. Figure 2 shows a specific example from Google Online Reviews. SCTM (Smart Crawling & Text Mining, developed by the Wellness and Tourism Big Data Research Institute at Kyungsung University) and TEXTOM (a big data analysis solution to collect and refine data and generate matrix data) were used to collect and refine online data. To analyze the unstructured data, words were ranked according to their frequency of occurrence. Table 1 shows the 25 recommended hotels and the number of reviews for each hotel. A total of 8448 reviews were collected; after excluding data that were not readable or had only ratings with no review content, 8229 reviews and 314,813 words were extracted. The data collection period ran from December 2019, when COVID-19 first appeared, to July 2021. This period was chosen to determine whether the implications differ from those of a prior study by Ban, Choi, Choi, Lee, and Kim [44], who analyzed online hotel reviews from before the COVID-19 pandemic, making this a longitudinal study.

Data Analysis
The analysis was conducted according to previous studies [19,30,43,44]. Big data analysis was performed in two parts: text mining and semantic network analysis. Factor and linear regression analyses were then used. First, the top 90 most frequent words were extracted through the text mining refining process. The Python-based Natural Language Toolkit was used. A stop word is a commonly used word (e.g., "the", "a", "an", "in"). The program ignores stop words both when indexing entries for searching and when retrieving them as the result of a search query. Through the refinement process, pronouns, prepositions, and meaningless words were removed, and the generated data included only words related to the hotel experience. The word matrix (word × word) was then derived. To assess the overall satisfaction with the hotel experience, a distribution of hotel experience evaluations based on the rating score was used, and this 'overall rating' was used as a dependent variable, because the value can be treated as the primary output variable [43].
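The refinement and matrix-building steps described above can be illustrated with a minimal sketch. It is not the study's pipeline: the stop-word list, the sample reviews, the review-level co-occurrence window, and the function names (`tokenize`, `top_words`, `cooccurrence_matrix`) are all assumptions made for illustration, and only the Python standard library is used in place of the Natural Language Toolkit.

```python
import re
from collections import Counter
from itertools import combinations

# Illustrative stop-word list (an assumption; the study relied on NLTK resources).
STOP_WORDS = {"the", "a", "an", "in", "and", "was", "were", "is", "of", "to", "we", "it"}

def tokenize(review):
    """Lowercase a review and keep alphabetic tokens that are not stop words."""
    tokens = re.findall(r"[a-z]+", review.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def top_words(reviews, n):
    """Return the n most frequent non-stop words across all reviews."""
    counts = Counter(t for r in reviews for t in tokenize(r))
    return [w for w, _ in counts.most_common(n)]

def cooccurrence_matrix(reviews, vocab):
    """Word x word matrix: number of reviews in which both words appear."""
    index = {w: i for i, w in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for r in reviews:
        present = sorted({index[t] for t in tokenize(r) if t in index})
        for i, j in combinations(present, 2):
            matrix[i][j] += 1
            matrix[j][i] += 1
    return matrix

# Hypothetical reviews, not taken from the collected data.
reviews = [
    "The service was excellent and the staff were friendly",
    "Breakfast in the restaurant was great, service was slow",
    "Friendly staff and a great restaurant",
]
vocab = ["service", "staff", "restaurant"]
matrix = cooccurrence_matrix(reviews, vocab)
for word, row in zip(vocab, matrix):
    print(f"{word:12s} {row}")
```

Counting co-occurrence per review is one possible choice; a sentence-level or sliding-window count would give a different matrix from the same text.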
Second, based on the matrix data, the network of words was visualized using Ucinet 6.0 to clarify the connection structure and connectivity between nodes. Semantic network analysis focused on degree and eigenvector centralities, which quantify how central a node is under different definitions and measurement methods of centrality. Freeman's degree centrality counts the direct connections of a node, whereas eigenvector centrality measures the influence of a node in a network [19,30]. For eigenvector centrality, relative weights are assigned to all nodes in the network, based on the idea that connections to high-scoring nodes contribute more to the score of the node in question than an equal number of links to low-scoring nodes [45]. Finally, CONCOR (CONvergence of iterated CORrelation) analysis was conducted to segment the words and acquire the dimensions of hotel customer experience. CONCOR analysis was performed repeatedly to find the connections and relationships between the words and similarity groups by forming clusters including keywords [42]. The results were visualized with Netdraw to provide a more intuitive visualization of the segmentation of the top-frequency words used by customers.
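To make the two centrality measures concrete, the following sketch computes them on a small hypothetical word network. It is not the Ucinet procedure used in the study: the matrix values and function names are assumptions, and eigenvector centrality is approximated by power iteration on the matrix plus the identity, a common device that keeps the iteration from oscillating.

```python
def degree_centrality(matrix):
    """Normalized degree: number of distinct neighbours divided by (n - 1)."""
    n = len(matrix)
    return [
        sum(1 for j, w in enumerate(row) if w > 0 and j != i) / (n - 1)
        for i, row in enumerate(matrix)
    ]

def eigenvector_centrality(matrix, iterations=500, tol=1e-10):
    """Power iteration on (A + I) for a symmetric, non-negative matrix A."""
    n = len(matrix)
    scores = [1.0] * n
    for _ in range(iterations):
        # Adding scores[i] applies the identity shift; the leading
        # eigenvector is unchanged, but convergence is more reliable.
        new = [
            scores[i] + sum(matrix[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
        norm = sum(v * v for v in new) ** 0.5
        new = [v / norm for v in new]
        if max(abs(a - b) for a, b in zip(new, scores)) < tol:
            return new
        scores = new
    return scores

# Hypothetical co-occurrence weights between four words.
words = ["service", "staff", "restaurant", "breakfast"]
matrix = [
    [0, 3, 2, 1],
    [3, 0, 0, 0],
    [2, 0, 0, 2],
    [1, 0, 2, 0],
]
degree = degree_centrality(matrix)
eigen = eigenvector_centrality(matrix)
for w, d, e in zip(words, degree, eigen):
    print(f"{w:12s} degree={d:.3f} eigenvector={e:.3f}")
```

In this toy network "service" is linked to every other word, so it has the highest score on both measures, while a word's eigenvector score also depends on how well connected its neighbours are.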
Finally, quantitative analysis was performed. By integrating the results of the CONCOR analysis and a comparison of word frequency and centrality, words were selected for further factor and linear regression analyses using dummy variables. Tao and Kim [43] investigated customer experience-related words with the highest explanatory power for customer satisfaction, and conducted a study on words with high frequency. These high-frequency words were ultimately judged to be appropriate because of their high Freeman centrality, which demonstrates word relevance, and because they correspond to key items in the communities revealed by the CONCOR results.
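One way to read "dummy variables" here is that each selected word becomes a 0/1 indicator of whether it appears in a review. The sketch below shows that encoding under this assumption; the word list, the sample reviews, and the `dummy_row` helper are hypothetical and are not taken from the study.

```python
import re

def dummy_row(review, words):
    """Return 1 for each selected word that occurs in the review, else 0."""
    tokens = set(re.findall(r"[a-z]+", review.lower()))
    return [1 if w in tokens else 0 for w in words]

# Hypothetical selected words and (review text, overall rating) pairs.
selected_words = ["service", "staff", "restaurant", "room"]
data = [
    ("Great service and a spotless room", 5),
    ("The restaurant was closed and staff seemed rushed", 3),
]

# Each row: the word indicators followed by the rating (dependent variable).
rows = [dummy_row(text, selected_words) + [rating] for text, rating in data]
for row in rows:
    print(row)
```

This simple matching treats "rooms" and "room" as different tokens; a real pipeline would normally lemmatize or stem words before encoding.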
Factor analysis was performed to retrieve the main factors affecting hotel customer satisfaction, using 55 out of the 90 top-frequency words. In addition, linear regression analysis, which consisted of four independent variables derived from the factor analysis and overall ratings as a dependent variable, was performed to verify the following hypothesis: The hotel experience shown in the online review can be used to explain customer satisfaction.
Notably, "service" had the highest frequency, which implies that a hotel's service is the aspect that hotel consumers mentioned most. Other service-related words, such as "restaurant", "experience", and "staff", also appeared with high frequency. In terms of the number of online hotel reviews, although the "Gran Hotel Ciudad de México" ranked 20th among the 25 hotels, it recorded the highest number of reviews, at 2064, which suggests that the two words "ciudad" and "mexico" have high web visibility. Words related to location or name of the hotel were also common, such as "ciudad", "ritz carlton", "chandy", "mandapa", "lisbon", and "france", as were words such as "experience", "trip", "adventure", "family", "security", and "price", which could reflect the purpose of the trip.

Semantic Network Analysis
Semantic network analysis identifies the relationship between words and expresses their connection in the network [44]. The centrality of the top 90 most frequent words was calculated, compared with word frequency, and synthesized in Table 3, in which the top 50 most frequent words are described. As a result, words such as "service", "hotel", "restaurant", "experience", and "breakfast" are shown to have both high frequency and high centrality. To a great extent, this indicated that these words were frequently used by online users and are closely associated with other nodes in the semantic network.
Words such as "place", "facility", and "security" had a relatively low centrality ranking relative to their high frequency, which suggests that although these words were frequently used by online users, their connection and influence relative to other words is low. Words such as "room", "adventure", and "dinner" showed the opposite pattern: they did not appear as frequently, but they related strongly to other terms, with relatively high centrality values. These nodes have great significance in the semantic network because of their strong connection to and influence over other nodes.
Figure 4 illustrates the result of the CONCOR analysis. Four sets generated from the semantic network analysis were named based on the notable words and related meaning within the online reviews. Table 4 shows the words grouped in each cluster and the notable words. The group names (destination, physical environment, service, and trip purpose) reflect the characteristics of the words included in each cluster. The destination cluster includes words related to hotel brands or tour spots (e.g., "France", "waldorf"), as well as general terms such as "locate" or "station". The physical environment cluster contains words related to facilities, such as "balcony" or "garden". The service cluster contains two concepts: food and beverage (F&B) and staff service. F&B contains words such as "buffet" and "restaurant", while staff service includes words such as "attention" and "hospitality", and it is important to note that consumers tend to consider staff service and quality of service when choosing a hotel. The final cluster, trip purpose, contains words relating to purpose such as "conference" and "adventure", as well as words related to sociality, such as "family" and "friend". After conducting CONCOR analysis, 55 words were used to determine the main factors affecting hotel customer satisfaction.
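The idea behind CONCOR can be sketched as follows: the columns of the word matrix are correlated with one another, the resulting correlation matrix is correlated again, and the process repeats until the entries approach +1 or −1, at which point the words fall into two blocks. The code below is a simplified illustration of a single split on a hypothetical six-word matrix, not the Ucinet implementation; obtaining four clusters would mean applying the split again within each block.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: correlation undefined, treated as 0 here
    return cov / (vx * vy) ** 0.5

def correlate_columns(m):
    """Matrix of correlations between every pair of columns of m."""
    cols = list(zip(*m))
    return [[pearson(ci, cj) for cj in cols] for ci in cols]

def concor_split(matrix, max_iter=100, tol=1e-9):
    """Iterate column correlations, then split nodes by sign of the first row."""
    current = correlate_columns(matrix)
    for _ in range(max_iter):
        nxt = correlate_columns(current)
        change = max(
            abs(a - b) for ra, rb in zip(nxt, current) for a, b in zip(ra, rb)
        )
        current = nxt
        if change < tol:
            break
    first = [i for i, v in enumerate(current[0]) if v > 0]
    second = [i for i, v in enumerate(current[0]) if v <= 0]
    return first, second

# Hypothetical matrix: words 0-2 co-occur strongly, as do words 3-5.
matrix = [
    [0, 5, 5, 1, 1, 1],
    [5, 0, 5, 1, 1, 1],
    [5, 5, 0, 1, 1, 1],
    [1, 1, 1, 0, 5, 5],
    [1, 1, 1, 5, 0, 5],
    [1, 1, 1, 5, 5, 0],
]
print(concor_split(matrix))
```

Words end up in the same block when their patterns of ties to all other words are similar, which is why CONCOR groups words by structural similarity rather than by raw frequency.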

Factor Analysis
Factor analysis with varimax rotation can reduce many measured variables to a smaller number of factors. This study adopted the common criteria for factor extraction: an eigenvalue of 1.0 or higher and a factor loading of 0.400 or higher. Variables loading onto two factors simultaneously were dropped. From the results, 10 key words within four factors explained 18.458% of all variance, and these were used as the independent variables to derive the key factors affecting customer satisfaction. According to Table 5, the KMO index was above 0.6, indicating a high correlation between related variables. In Bartlett's sphericity test, χ² was 84,453.902, with the overall significance of the correlation matrix p < 0.001. This means that these data are suitable for exploratory factor analysis. The four factors were named: Service (Factor 1), Physical Environment (Factor 2), Dining (Factor 3), and Location (Factor 4). Factor 1 contains "staff", "quality", and "service", which are related to the core services of the hotel. Factor 2 contains "facility", "environment", and "room", which are related to a tangible part of the hotel. Factor 3 contains "restaurant" and "brunch". Factor 4 consisted of concrete site names, such as "ciudad" and "mexico".
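The extraction criteria stated above (eigenvalue of at least 1.0, loading of at least 0.400, cross-loading variables dropped) amount to simple filtering rules once the eigenvalues and rotated loadings are available. The sketch below applies those rules to made-up numbers; the eigenvalues, loadings, and function names are hypothetical and do not reproduce Table 5.

```python
def retained_factors(eigenvalues, minimum=1.0):
    """Indices of factors whose eigenvalue meets the retention threshold."""
    return [i for i, value in enumerate(eigenvalues) if value >= minimum]

def select_variables(loadings, threshold=0.400):
    """Map each kept variable to the single factor it loads on."""
    kept = {}
    for word, row in loadings.items():
        high = [f for f, value in enumerate(row) if abs(value) >= threshold]
        if len(high) == 1:
            kept[word] = high[0]
        # Zero high loadings, or loadings on several factors: variable dropped.
    return kept

# Hypothetical eigenvalues and rotated loadings (columns are Factors 1-4).
eigenvalues = [2.31, 1.84, 1.27, 1.05, 0.92, 0.61]
loadings = {
    "staff":  [0.71, 0.05, 0.10, 0.02],
    "room":   [0.08, 0.66, 0.03, 0.01],
    "brunch": [0.02, 0.11, 0.58, 0.04],
    "price":  [0.45, 0.43, 0.06, 0.02],  # loads on two factors
    "trip":   [0.12, 0.20, 0.09, 0.15],  # no loading reaches 0.400
}

print(retained_factors(eigenvalues))
print(select_variables(loadings))
```

Here "price" is excluded for cross-loading and "trip" for weak loadings, which mirrors how a list of candidate words can shrink to a much smaller set of retained key words.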

Linear Regression Analysis
Regression analysis was performed to determine how the independent variables affect the dependent variable. The hotel customer satisfaction rating was used as the dependent variable, and the four hotel customer experience factors were used as the independent variables. Refining the variables through factor analysis yielded four independent variables, Service (S), Physical Environment (P), Dining (D), and Location (L), and one dependent variable, Customer Satisfaction (CS). As can be seen in Table 6, the overall variance explained by the four predictors was 10% (R² = 0.100), and the standard error of the estimated value was 0.70347. The correlation between the independent and dependent variables was relatively low, possibly because many online hotel reviews did not include any of the four factors affecting customer experience and satisfaction. It is impossible to include all relevant variables in regression analysis to estimate output variables, such as opinion, from text-mining data, so the R² value can be low. According to prior studies using regression and factor analyses on online reviews for washing machines and hotels, the R² values were 12.5% and 12%, respectively [30,44]. Here, Service (S, β = 0.027) and Dining (D, β = −0.021) are significant at the p < 0.05 and p < 0.1 level, respectively. Service (S) is positively related to customer satisfaction, and Dining (D) is negatively related to customer satisfaction. Based on these results, the regression equation can be expressed as follows:
CS = 4.733 + 0.019S** + 0.004P − 0.015D* − 0.005L (** p < 0.05; * p < 0.1)
The Service (S) factor holds the highest standardized coefficient, which means staff service is the essential factor associated with customer satisfaction. The Dining (D) factor was found to negatively influence customer satisfaction, indicating that customers have a negative view of the dining factor, which included words such as restaurant and brunch.
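As a worked illustration of the reported equation, the sketch below plugs values into CS = 4.733 + 0.019S + 0.004P − 0.015D − 0.005L. The coefficients are those given in the text; the input values and their scale are assumptions, since the text does not state how the factor variables were scored.

```python
# Unstandardized coefficients from the reported regression equation.
INTERCEPT = 4.733
COEFFICIENTS = {"S": 0.019, "P": 0.004, "D": -0.015, "L": -0.005}

def predict_cs(scores):
    """Predicted satisfaction rating for the given factor values.

    Factors missing from `scores` are treated as 0 (an assumption).
    """
    return INTERCEPT + sum(
        coef * scores.get(name, 0.0) for name, coef in COEFFICIENTS.items()
    )

# Hypothetical inputs: one unit of a single factor at a time.
print(round(predict_cs({}), 3))          # baseline (intercept only)
print(round(predict_cs({"S": 1.0}), 3))  # one unit of Service
print(round(predict_cs({"D": 1.0}), 3))  # one unit of Dining
```

Under these assumptions a one-unit change in Service or Dining moves the predicted rating by about two hundredths of a point around a baseline of 4.733, which is consistent with the low R² reported in the text.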

Discussion and Conclusions
This study was conducted to assess the customers' experience and satisfaction based on online reviews. Keywords were derived through text mining, and frequency analysis was then performed. The top 90 most frequent words were extracted, and degree and eigenvector centrality analyses were performed to determine the relationship between keywords. CONCOR analysis was then adopted to generate four clusters: destination, physical environment, service, and trip purpose.
These results are similar to those found in a previous study by Ban et al. [44], who named a similar grouping of four clusters: intangible service, physical environment, location, and purpose. Both that study and the present one found factors related to the hotel's essential attributes, which suggests that increasing satisfaction with these attributes is paramount to forming positive eWOM. Overall, this work established that dimensions related to various facilities, location, service, and food and beverage significantly contribute to enhancing hotel customer experience. These findings reaffirm the existing arguments about the complexity and diversity of the hotel experience, which spans various encounters with food and beverage prices, quality, safety, improved service, overall feeling, image, comfortable condition, and location [11]. This work adds to the understanding of customer hotel selection attributes that are a prerequisite for improving service quality and enhancing customer satisfaction to gain a competitive advantage. Additionally, the keywords were visualized by drawing networks and nodes using NetDraw in UCINET 6.0. Factor and linear regression analyses were performed to determine the relationships between extracted factors and customer satisfaction.
This study provides five academic and practical implications based on the research results. First, in terms of theoretical implications, this study demonstrates the significance of extending the application area of semantic network analysis. This work extends our knowledge and serves as a benchmark for researchers and stakeholders concerning the factors that provide pleasing outcomes within the hotel context. Given the importance of the hotel sector in the tourism industry, this study empirically explored the hotel experience and satisfaction through big data analysis. Understanding online reviews as an expression of customer experience can help the hotel industry identify key attributes needed to achieve positive repurchase intention and increase revenue. Online reviews provide an efficient way for the hotel industry to collect feedback from hotel customers and discover how to generate positive revisit intention after the experience.
Second, the use of a semantic network analysis provides a valuable tool for exploring customers' comments about their hotel experiences. This method reveals features that explain why customers evaluate their hotel experience positively or negatively. By analyzing customer comments, we can assess the power and meaning of words commonly used by customers when sharing their experiences and how those word choices can inform recommendations through online sources. The power expressed in words contains the customers' expressions and evaluations of their experiences, and serves as a basis for understanding the reality of their experience.
Third, in terms of practical implications, "service" is an important factor influencing customer satisfaction (as in previous studies [46,47]), and a more remarkable result is that customers expect to receive more attentive service during COVID-19. Thus, managers or operators in the hotel industry should pay more attention to maintaining the standard service quality and providing more proactive or extra service to customers during this pandemic. Offering service with warm and sincere hospitality can create positive eWOM and satisfaction, as shown in the following quotations from online reviews: "José Luis and Víctor were always greeting us with a big smile probably one of the best treatment I have had in a hotel" and "The staff is simply amazing! They make every effort to make you feel comfortable and welcome." Appropriate staff service at the service point leads to positive reviews, which then form a positive image of the hotel. There is a need for continuous and systematic education and training to motivate employees with customer-centered thinking.
Fourth, it is noteworthy for hotel operators and managers that dining negatively influences customer satisfaction. One customer mentioned the restaurant and brunch within their review and stated they were not pleased with the dining experience. This indicates that customers using the hotel are paying the same cost as before the pandemic, but they are restricted from using some facilities (such as restaurants) due to COVID-19 or cannot receive regular services. Due to COVID-19, restrictions on restaurants that bring many people together in a narrow space are unavoidable, but consideration from the hotel is necessary to ensure that customers do not feel uncomfortable. For example, the hotel could provide a service where customers can eat all the menu offerings from the restaurant in the room. Using a mobile app, the hotel could deliver the food ordered to the room without contact with others. If this exceptional dining service is implemented, it could affect customer satisfaction, revisit intention, and positive WOM. There is a need for a service that does not cause inconvenience to customers despite the limited availability of F&B services due to COVID-19. If the hotel minimizes contact with employees and provides a service that allows the customer to dine in a private space through a delivery service, or to order through a mobile app and watch the cooking process on the screen, the hygiene and cleanliness concerns that customers are particularly sensitive to during COVID-19 could be addressed.
Finally, online reviews are a source of information for potential customers to make decisions. Considering that tourism decision making through the internet is rapidly increasing, this study can guide hotel marketing strategy, facility operation, and complaint management through big data analysis of online reviews.
This study has some limitations, and results should be interpreted with caution; these limitations also provide suggestions for future research. First, online reviews were collected from Google, the world's largest search engine. However, there is a possibility that using a specific online channel may not capture all customer preferences, so for representativeness, future samples should be based on analysis of various websites and should use data from many years. Second, it is not easy to understand the additional meaning of words when analyzing their frequency. Future research should further analyze positive and negative expressions; sentiment analysis could better clarify customer experience and satisfaction. Finally, in this study, reviews of 25 hotels recommended by TripAdvisor were collected from Google Travel and analyzed. However, the quantity of review data collected for each hotel differed widely due to the different sizes, types, and average costs of the hotels considered. Future studies should collect and analyze review data from hotels that share the same characteristics to derive more meaningful research results.