Analysis of Destination Images in the Emerging Ski Market: The Case Study in the Host City of the 2022 Beijing Winter Olympic Games

Abstract: This study proposes a text mining framework for destination image (DI) research based on user-generated content (UGC). The framework combines the Latent Dirichlet Allocation (LDA) model with a sentiment analysis method based on custom rules and a lexicon to identify and analyze DI in the emerging ski market. The ski resorts in the host city of the 2022 Winter Olympic Games are selected as a case study. The findings reveal that (1) nine image attributes are identified, two of which, namely beginner suitability and ticketing service, have not been identified before in winter destination studies. (2) Over the past seven snow seasons, tourists' negative sentiment has shown a continuous downward trend, while positive sentiment has shown a slow upward trend. (3) For tourists from destination countries affected by the Winter Olympic Games, the destination image improves when the destination meets their expectations; when it does not, tourists still believe that hosting the Winter Olympic Games will improve the destination's situation. The theoretical and managerial implications of these findings are discussed.


Introduction
Destination image (DI) is generally regarded as a critical aspect of destination development, as it affects both the supply and demand sides of the market [1]. Identifying DI is considered an essential means of capturing the strengths and weaknesses of a destination in order to improve product development and marketing [2]. Although DI is a well-established field [3,4], with increasing global competition and changing tourist motivations, communicating a positive DI remains a top priority for successful tourism management and destination marketing [5,6]. Understanding tourists' DI is therefore vital for the development of destinations. With the pervasiveness of online social media and we-media, the large volume of data known as UGC has become an essential material for researchers to analyze and predict the tourism market [7,8]. How to develop a comprehensive and effective approach to using massive UGC, such as tourist reviews, is thus an urgent issue in tourism marketing and management research [9]. Nevertheless, present DI research is mainly restricted to questionnaires and scale measurements [10,11] and has not yet developed a competent approach that takes full advantage of UGC. Although DI research based on UGC has grown consistently, many studies still process the data manually [10], which calls into question the scalability of the method and the reliability of the results [12]. The introduction of NLP (Natural Language Processing) and data mining is an important way to address this problem [13]. The LDA model, a topic modeling technique from machine learning, can classify irregular UGC data by topic and uncover latent topics that might not be found manually [14,15]. Topic classification results represent the voice of customers rather than confirming an existing theory, and they may or may not cover all topics relevant to the contemporary context, thus opening a new dimension of the theory [16,17].
Topic modeling also improves the reliability of DI feature extraction based on UGC [12]. However, DI provides limited information about destinations, as images are generally stereotypical and represent a gross simplification of reality [18]. Therefore, identifying tourists' DI alone cannot reveal the impact of a particular image on tourists' sentiment. A limitation of topic modeling is its inability to determine the sentiment of a topic [16]. Sentiment analysis can not only identify the emotional tendency of a whole text but also classify the sentiment toward a specific DI attribute, which provides more valuable insights for DI research based on UGC. However, current DI research based on the LDA model mostly uses LDA alone and lacks an organic combination with sentiment analysis.
Ski tourism is practiced in over 100 countries and generates more than 400 million skier visits per year [4,19]. Although the ski industry faces tremendous challenges such as global warming, an aging population [20], and the COVID-19 epidemic, the economic benefits of ski tourism are still critical to many local and national economies [19,21]. This paper considers the ski resorts in China a particularly interesting research context for two reasons. First, in anticipation of the 2022 Beijing Winter Olympic Games, the number of skiers rose from 10.3 million in 2015 to 23.45 million in 2019, making China the only fast-growing emerging ski market globally. Ranking among the top three countries by number of skiers, after the United States and Germany, China has become one of the most promising target markets for many ski destinations [19,22]. Second, the development stage and pace of China's ski tourism industry are entirely different from those of western countries, and Chinese people's spending capacity and travel experience are changing at an accelerated pace [23][24][25]; whether research conclusions based on international tourists apply to China's skiing market remains to be discussed. A comprehensive review of DI research on winter tourism shows that most of it has been conducted in a western context, and DI research in emerging markets seems to be absent. Besides, DI has become essential for arranging mega-sport events, especially in emerging markets [26][27][28]. Accordingly, studies of DI in emerging ski markets would be helpful to both the academic community and the tourism industry.
Based on the above, this paper seeks to accomplish three goals. First, in terms of method, a UGC mining framework is proposed. The LDA model and sentiment analysis method based on custom lexicon and rules are combined to identify and analyze the tourists' destination image and satisfaction distribution of the ski tourists. Second, in theoretical terms, this paper selects the ski resorts, the main venue of the 2022 Winter Olympic Games, as a case study. It identifies the unique image perception attributes of Chinese skiers and reveals how China's ski tourists are dissimilar from international ski tourists using the lens of DI. Third, to explore the impact of mega-sport events on DI, this study analyzes the UGC mentioning "Winter Olympic Games" (OG) and tentatively explores the image perception of tourists from the perspective of the OG. The paper is organized into five sections. A literature review is presented in Section 2. Section 3 contains the method construction and data analysis, and the results are discussed in Section 4. The paper closes with a discussion of the implications and limitations.

Destination Image
Accurate identification of tourists' DI is very important for determining the strengths and weaknesses of a destination and for improving product development and marketing [29]. DI refers to an individual's beliefs, perceptions, and feelings towards a particular destination [30]. Before travel, it influences the individual's evaluation of attractiveness and choice of destination; after a trip, it is a critical index of tourism experience evaluation, human-land emotional connection, and even revisit intention and local loyalty [31]. Gartner is one of the earliest scholars to have proposed the "cognitive-affective" model. He argues that cognitive image and affective image are two distinct but interconnected attributes that jointly shape the image of a tourism destination [32]. Specifically, cognitive image refers to the result of an individual's information processing of objective attributes of the destination, such as its natural, cultural, and social environment, tourism products, services, and infrastructure. Affective image refers to the subjective emotional evaluation of the various attributes of the destination [33].
Over the past few decades, DI has been widely used in the study of tourism destinations (e.g., [34][35][36][37]). However, few studies have examined winter/skiing tourist destination choices, preferences, and images [38,39]. For example, Kim et al. (2011) took the major ski resorts in the western United States as an example to explore the effects of different cognitive and emotional images on destination attractiveness. The study found that an interesting and comfortable atmosphere and skiing quality were significantly related to destination attractiveness [40]. Anderson et al. compared the destination images of Norway as a winter tourist destination held by tourists from Sweden, Denmark, and Germany. The study found that the cognitive dimension of the destination image is particularly important to Swedes, while Germans have the strongest perception of the emotional dimension [5]. Kim discussed the influence of demographics, experience, and professional knowledge on the formation of the destination terrain image of ski resorts and found that skiers' affective comfort image is more affected by expertise level than their cognitive image [20]. Most studies on winter destination image are carried out in the context of western countries, and such research is largely absent for emerging winter destinations. However, there are great differences in behavioral preferences between Western and Asian tourists [7]. Therefore, exploring the multi-dimensional connotation of DI in the emerging ski market helps to better understand market preferences, which is necessary for more effective and targeted marketing and destination development [41].

Destination Image Measuring Based on UGC
With the development of the Internet era, in addition to traditional questionnaires and scale measurements, UGC data has become an important material for researchers to analyze the destination image [10,12]. To date, many studies still process the data manually [10,42], using methods such as content analysis and grounded theory [17]. Researchers who use content analysis generally rely on computer-aided text analysis tools, such as General Inquirer and Linguistic Inquiry and Word Count (LIWC), to improve scalability and systematicness. However, the same words may have different meanings in different contexts, and treating word categories as mutually exclusive rules out polysemy [43]. Thus, a common criticism of content analysis is that it reduces complex theoretical structures to overly general and simplistic indexes, resulting in out-of-context results [44]. Grounded theory is a highly contextual approach to developing a theory by gathering and intensively engaging with text and then using comparative coding to identify the higher-order structure [45]. Researchers are committed to exploring through direct contact with the social world studied while rejecting transcendental theorizing [46]. Therefore, grounded theory is not a method of measurement; its purpose is to identify the deep structure in the data and gain a rich understanding of the social process [46]. However, its theoretical flexibility also makes it the target of some critiques. At the beginning of this century, topic modeling was developed as an NLP method for information retrieval and classification of large amounts of text [47]. In the past decade, social scientists have increasingly used LDA topic modeling to analyze text data [16,17,48]. Topic modeling uses statistical associations of words in the text to generate latent topics, that is, clusters of co-occurring words that jointly represent higher-level concepts [17].
Compared with the above methods, topic modeling does not require researchers to impose interpretation rules on the data, and it can identify essential topics that human coders may miss. It allows polysemy because topics are not mutually exclusive [43], which partly improves the reliability of tourism destination image feature extraction based on UGC.
To date, few articles have used topic modeling to explore DI (e.g., [12,49]). Moreover, most current studies focus on destination images as a whole or only on cognitive images (e.g., [16]), and research focusing on the affective image is absent. The research methods applied to winter destination images are still limited to questionnaires and interviews [1,5,50]. In this study, the application of the method is theory-driven. The LDA model is used to identify and analyze the destination's cognitive image and affective image based on cognitive-affective theory, exploring tourists' unique image perception attributes in the emerging ski market against the background of previous research on winter destinations.

Sentiment Analysis
Sentiment analysis (SA), also known as opinion mining, has become one of the most active areas in natural language processing (NLP) and has received unprecedented attention [51,52]. Earlier research on tourism sentiment mainly relied on interviews and questionnaires. However, as research has deepened, more and more scholars have recognized the importance of big data as a source for tourist sentiment analysis [53]. At present, sentiment analysis mainly takes two forms: machine learning-based and lexicon-based [54]. The former is a supervised learning method that converts text into numerical features and uses a classifier to classify them. However, it requires massive domain-specific data to be annotated manually in advance, which costs a lot of time, human resources, and equipment, and the resulting model does not transfer readily to other fields. The latter is an unsupervised method that matches the emotional words in the text against a sentiment lexicon and calculates the emotional tendency value of the text according to grammatical rules. Compared with machine learning, sentiment lexicons need no data annotation, are low in cost, and can be easily extended to multiple research fields [53].
To date, the sentiment analysis research based on the sentiment lexicons is mostly based on the public lexicon, while the lexicon in ski tourism should be related to the field of sports and tourism, which has its professionalism and uniqueness. A general lexicon is not specific to tourism, which is difficult to accurately analyze the sentiment characteristics of ski tourists [7]. Therefore, a domain lexicon of ski tourism should be developed.
This study uses the LDA model to identify the topic of UGC. According to the theory connotation of the cognitive-affective theory, cognitive image and affective image are summarized. However, a limitation of topic modeling is the inability to determine the sentiment of the topic [16]. To make up for this limitation, this study combines the LDA model with the sentiment analysis method based on domain lexicon, and the satisfaction distribution is further explained by analyzing the emotional polarity through the domain lexicon, mining the focus of positive and negative evaluation, identifying the satisfaction changing trend. In addition, the UGC mentioning "Winter Olympic Games" is analyzed, and the image perception of tourists is discussed from the perspective of the Winter Olympic Games.

Data Collection and Pre-Processing
The Chongli ski resort cluster is located in Zhangjiakou city, in the northwest of Hebei Province, 220 km from Beijing. It is the largest ski cluster in China. In July 2015, Beijing and Zhangjiakou jointly won the right to host the 2022 Winter Olympic Games, and Chongli became the main venue for the snow events of the Games (http://global.chinadaily.com.cn/a/201912/25/WS5e030efba310cf3e355 8090a.html) (accessed on 20 December 2021). There are seven ski resorts in this region, including Wanlong, Thaiwoo, and Secret Garden, with a total slope area of 4.1 square kilometers and a total slope length of 129.6 km. The seven major ski resorts are all over 1500 m above sea level, with a total area of more than 12 million square meters and a total capacity of 54,515 people per hour [19]. In addition, the supporting service facilities of the Secret Garden, Wanlong, Thaiwoo, and Fulong ski resorts have been shortlisted among the "Top Ten Ski Resorts in China". The high-speed railway from Beijing to Chongli was officially opened in December 2019. Chongli has now become one of the most popular winter tourist destinations in China.
This study aims to establish an online review analysis framework based on machine learning methods. The LDA model is used to explore the cognitive image and affective image of tourists. To understand the factors on which tourists' satisfaction focuses, a sentiment analysis method based on custom rules and a lexicon is used to measure the sentiment polarity of tourists in each perception dimension. At the same time, to understand the impact of the Beijing Winter Olympic Games on tourists' image perception, this study makes a qualitative analysis of the reviews mentioning "Winter Olympic Games" and offers a preliminary discussion of tourists' image perception from the perspective of the Winter Olympic Games. This research paradigm may serve as a reference for other types of research. The overall analysis framework is shown in Figure 1.
In this study, the UGC data of the seven ski resorts are obtained using a Python web crawler. Based on a comprehensive measurement of each platform's popularity, number and flow of users, and reading visits [9], three well-known Chinese tourism communities with OTA (Online Tourism Agency) business are selected: Ctrip.com (accessed on 20 December 2021), Qunar.com (accessed on 20 December 2021), and Mafengwo.com (accessed on 20 December 2021). In total, 13,245 reviews are obtained. The data set covers the period from November 2014 to February 2021.
The data are preprocessed and cleaned in the following steps: (1) emoticons, English words, punctuation marks, special characters, URL links, and repeated UGC are removed from the text; (2) the Jieba system is used to segment the extracted text; (3) meaningless stop words are removed.
Finally, 8762 reviews collected are involved in the following LDA model calculation and sentiment analysis.
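The cleaning steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names `clean` and `preprocess` are hypothetical, keeping only CJK ideographs is an assumed shortcut that covers steps (1) in a single pass, duplicates are detected after cleaning, and the tokenizer is passed in as a parameter so that the sketch runs without third-party packages.

```python
import re

# Assumption: keeping only CJK ideographs removes emoticons, English words,
# digits, punctuation, special characters, and URL links in one pass.
NON_CHINESE = re.compile(r"[^\u4e00-\u9fff]+")

def clean(text):
    """Replace every run of non-Chinese characters with a single space."""
    return NON_CHINESE.sub(" ", text).strip()

def preprocess(reviews, tokenize, stop_words):
    """Return one token list per unique, non-empty review.

    tokenize   -- callable mapping a string to a list of tokens
    stop_words -- set of tokens to discard
    """
    seen = set()
    docs = []
    for raw in reviews:
        cleaned = clean(raw)
        if not cleaned or cleaned in seen:   # drop empty and repeated reviews
            continue
        seen.add(cleaned)
        tokens = [t for t in tokenize(cleaned)
                  if t.strip() and t not in stop_words]
        if tokens:
            docs.append(tokens)
    return docs

# Toy usage: whitespace splitting stands in for a real segmenter.
reviews = ["雪道 很棒!!! http://x.com :)", "雪道 很棒", "教练 的 态度 good"]
docs = preprocess(reviews, str.split, {"的"})
```

With the Jieba package installed, `jieba.lcut` can be passed as `tokenize` in place of `str.split` to obtain proper Chinese word segmentation.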

LDA Model
LDA model is an unsupervised machine learning technique that uses a three-layer Bayesian probability model to identify topic information hidden in large-scale documents. According to the results of the LDA calculation, two probability distributions of text-topic and topic-word can be obtained. The topic-word probability distribution is represented by a series of feature words and their probability values that appear in the topic. The greater the probability value of the feature words, the higher the contribution rate to the topic and the greater the correlation with the topic, thus reflecting each topic's internal structure [55]. This study uses Python's gensim toolkit to call the LDA model to realize topic analysis and provide structured thematic data for condensing DI.
The number of topics is chosen by repeatedly comparing how well the topics are differentiated under different settings and by filtering out meaningless words. For instance, ski resort names such as Wanlong and Thaiwoo appear very frequently in the UGC data; to avoid the impact of such words on the probability of other words, the names of the seven ski resorts are added to the stop word list. For evaluating LDA, Blei et al. [47] proposed perplexity as the criterion. The model parameters are adjusted according to the perplexity and an evaluation of the resulting topics. At the same time, the pyLDAvis package is used to visualize the distance between topics. By varying the number of topics and observing the perplexity and the clustering visualization for each setting, it is found that topic differentiation is clear when K = 9. Therefore, the optimal number of topics K of the LDA model in this study is set to 9; the topic distance diagram and feature words are shown in Figure 2. The original feature words are in Chinese, generated automatically from the original Chinese text data. To increase the readability of the figure, we replace the feature words with corresponding English explanations.
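For reference, the perplexity criterion of Blei et al. [47] for a held-out set of M documents is usually written as follows, where w_d denotes the word sequence of document d and N_d its length. The symbols M, w_d, and N_d are notation introduced here for illustration and do not appear elsewhere in this study; a lower perplexity indicates a better fit to the held-out data.

```latex
\mathrm{perplexity}(D_{\mathrm{test}})
  = \exp\left\{ -\,\frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d} \right\}
```

Because perplexity often keeps decreasing as K grows, it is typically read together with a qualitative check of topic separation, as is done here with the topic distance visualization.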

Sentiment Polarity Statistic Based on Domain Lexicon
The more comprehensive the set of sentiment words in the sentiment lexicon, the more accurate the judgment will be. An existing lexicon, the HowNet lexicon, contains 91,016 Chinese words of six basic types: positive sentiment, negative sentiment, positive evaluation, negative evaluation, magnitude of degree, and viewpoints.
As this study argued above, HowNet is not appropriate for analyzing tourist sentiments [7]. Therefore, this study develops a particular lexicon for skiing tourism based on the HowNet lexicon. We use a python web crawler to crawl online reviews from four major ski areas in China, collecting a total of 13,260 reviews. Through manual screening, a lexicon that belongs to the field of skiing tourism is extracted. A total of 262 positive words such as "addictive", "powder snow", and 111 negative words such as "overcrowding" and "frozen to the bone" are obtained.
In this study, the domain lexicon and the HowNet lexicon are combined to get the exclusive sentiment lexicon for ski tourism, and the repetitive sentiment words are eliminated. Finally, the complete exclusive sentiment lexicon contains 3670 positive words and 3265 negative words.
The semantic logic of the text is shaped by degree adverbs and negative adverbs. When these words appear in a sentence, sentiment scores are redistributed according to the combinations in which they occur. For instance, in the comment "the instructor is very patient", the degree adverb "very" increases the emotional intensity of "patient". In this study, a degree adverb lexicon is constructed according to the HowNet degree level words, and weights are assigned according to five levels. Negative words often reverse the sentiment polarity. When negative adverbs precede a sentiment word, the phrase expresses a negative meaning if the number of negative words is odd and a positive meaning if the number is even. Based on the corpus of this study and Chinese language habits, a total of 76 negative words are collected, and their weights are set to −1. The filtering rules of semantic logic and coefficients are shown in Table 1 below.

Table 1. Filtering rules of semantic logic and coefficients.

| Type | Number of words | Weight | Examples |
|---|---|---|---|
| … | … | … | "extremely", "unparalleled" |
| super | 30 | 2.5 | "too", "more than" |
| very | 42 | 2 | "greatly", "very much" |
| comparatively | 37 | 1.5 | "relatively", "rather" |
| a little/slightly | 29 | 0.75 | "a little" |
| negative adverb | 76 | −1 | "not", "never" |

Data source: collected by the authors.
Sentiment scoring mainly includes the following three steps.
(1) We use python to read the sentiment lexicon, negative adverb lists, and degree adverb lists.
(2) Traverse the negative words and degree adverbs that occur between sentiment words in each review, and calculate their corresponding weights. The equations for calculating the sentiment value of each sentiment word in the reviews are as follows: where l(w) represents the sentiment value of the sentiment word part; s(w) is the sentiment value of the sentiment word; d(w) denotes the weight value of the negative words; a(w) is the sum of the weight values of all degree adverbs before the sentiment word; and m(w) reflects the relative position of the negative word and the degree adverb before the sentiment word. If there is a negative word before the degree adverb, the value of m(w) is 0.5; otherwise, the value of m(w) is 1.
In Equation (2), n represents the number of negative words before the sentiment word; t is the number of degree adverbs before the sentiment word; and ag_i is the weight value of the i-th degree adverb.
(3) Each review contains multiple sentiment words, and the final sentiment value is calculated as follows.
In Equation (5), r is the set of sentiment words in each review; L(r) is the final sentiment value of each review. If L(r) ≥ 0 the sentiment of review is positive; otherwise, the sentiment is negative.
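The scoring rules described in steps (1) to (3) can be sketched as follows. The equations themselves are not reproduced in the text above, so this sketch assumes a product form consistent with the symbol definitions: l(w) = s(w) · d(w) · a(w) · m(w), with d(w) = (−1)^n, a(w) equal to the sum of the ag_i (taken as 1 when no degree adverb is present, which is an assumption), and L(r) equal to the sum of l(w) over the sentiment words of a review. The function names and the dictionary layout are hypothetical.

```python
def word_score(s, negations, degree_weights, negation_before_degree=False):
    """Sentiment value l(w) of one sentiment word (assumed product form).

    s                      -- lexicon polarity s(w), e.g. +1 or -1
    negations              -- n, number of negative adverbs before the word
    degree_weights         -- weights ag_i of the degree adverbs before the word
    negation_before_degree -- True if a negative adverb precedes the degree adverb
    """
    d = (-1) ** negations                                 # odd n flips the polarity
    a = sum(degree_weights) if degree_weights else 1.0    # assumption: 1 if no adverb
    m = 0.5 if (negation_before_degree and degree_weights) else 1.0
    return s * d * a * m

def review_score(words):
    """Final value L(r): sum of l(w) over the sentiment words of one review."""
    return sum(word_score(**w) for w in words)

def polarity(score):
    """Decision rule from the text: L(r) >= 0 is positive, otherwise negative."""
    return "positive" if score >= 0 else "negative"

# "very patient": weight 2 for the "very" level in Table 1
very_patient = word_score(s=1, negations=0, degree_weights=[2])
# "not very patient": one negation placed before the degree adverb
not_very_patient = word_score(s=1, negations=1, degree_weights=[2],
                              negation_before_degree=True)
review = [
    {"s": 1, "negations": 0, "degree_weights": [2]},      # "very patient"
    {"s": -1, "negations": 0, "degree_weights": [0.75]},  # "a little expensive"
]
total = review_score(review)
```

Under these assumptions, "not very patient" receives a weaker negative value than a plain reversal of "very patient" would give, which matches the damping role that m(w) = 0.5 plays in the description above.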

Image Perception Dimension Extraction
Although the LDA model can classify topics based on the relevance of the feature words in the text, there is no standard or unified method to concisely express the topic of each classification result. In existing research, researchers generally judge the topic semantics based on the research objective [56]. To address this problem, this study selects the words that appear most frequently in each topic but less frequently in other topics and regards them as the dominant elements of the topic [9]. The extraction is carried out independently by different authors, who then judge the relevance of the results. The leading elements are named based on a review of the existing literature to reduce the subjectivity of the condensed topic [14,57]. In addition, to better interpret the feature words, it is necessary to understand the content of the text itself and the actual situation of the destination. For example, the word "hour" appears in both the travel cost and transportation attributes. Screening the text shows that "hour" is mainly used in two ways: to mention self-driving time and to describe skiing time with its corresponding price. Therefore, we keep "hour" in both attributes. In the end, each topic keeps 10 related feature words, which are the representative elements of each attribute.
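The selection of dominant elements can be illustrated with a small sketch. The scoring rule used here, a word's weight in its own topic minus its mean weight in the other topics, is an assumption chosen to express "frequent in this topic, less frequent elsewhere"; it is not necessarily the criterion applied in this study. The function name `distinctive_words` is hypothetical, and the weights for "hour" and "parking lot" in the toy input are made up for illustration.

```python
def distinctive_words(topic_words, top_n=10):
    """Rank each topic's words by own weight minus mean weight in other topics.

    topic_words -- {topic name: {word: weight}}
    Returns {topic name: list of the top_n most distinctive words}.
    """
    result = {}
    for topic, weights in topic_words.items():
        others = [w for t, w in topic_words.items() if t != topic]

        def score(word):
            # Mean weight of the word across all other topics (0 if absent)
            elsewhere = sum(o.get(word, 0.0) for o in others) / max(len(others), 1)
            return weights[word] - elsewhere

        result[topic] = sorted(weights, key=score, reverse=True)[:top_n]
    return result

# Toy input: "expensive" and "price" use the weights reported in the text;
# the remaining weights are illustrative placeholders.
topics = {
    "travel cost": {"expensive": 0.056, "price": 0.036, "hour": 0.020},
    "transportation": {"hour": 0.030, "parking lot": 0.025},
}
dominant = distinctive_words(topics, top_n=2)
```

A purely automatic rule like this would demote a shared word such as "hour", which is why the manual screening described above is still needed to decide whether such a word should be kept in more than one attribute.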
Following the cognitive-affective theory, we summarize the first eight topics as the cognitive image and the last one as the affective image (Table 2). The second column lists the attributes that compose the tourists' perceived image, which are the result of condensing the topics. The attribute names come from a comprehensive review of the relevant literature, and the third column retains the high-frequency elements most related to each perception attribute. The number in parentheses is the weight of the element, indicating its importance in the attribute. The fourth column indicates whether the attribute has appeared in the field of ski tourism. The fifth column gives the references for the attribute names.
Most notably, among these nine attributes, beginner suitability and ticketing service appear for the first time in the ski tourism literature. The attribute of beginner suitability includes words such as "first time", "beginner", and "practice". "Beginner" has the highest weight, followed by "instructor". The large weight of instructors among beginners is consistent with a striking feature of China's ski market, namely that almost 80% of skiers are beginners. Many of them hire instructors and are more concerned about how well the facilities at the ski destination suit their own skills, which is a prominent perception feature of Chinese ski tourists.
Ticketing service has also never been studied in research on ski destination image. From the words "convenience", "staff", "queuing", and "window", it can be inferred that tourists pay close attention to the process and mode of purchasing tickets. From the words "ticket exchange" and "ID card", we can infer that, owing to the pervasiveness of the Internet and smartphones, online booking platforms have become an essential way for tourists to purchase tickets. Hence, tourists are also very concerned about the convenience of the process from online ticket purchase to offline ticket collection.
We summarize expressions such as the OG, international, and national level as the reputation of the destination. In this dimension, the weight of the OG is as high as 0.032, the highest among the representative elements, which reflects the impact of hosting the OG on tourists' image perception. In addition, because they lack skiing experience, beginner skiers are less able than veterans to distinguish the advantages and disadvantages of ski resort facilities, so they tend to associate the experience with the reputation. Among the elements of ski condition, ski slopes account for the highest proportion, followed by ski lifts. This attribute is the core concern of ski tourists. In the attribute of travel cost, the weight of "expensive" reaches 0.056, the highest element weight, and the weight of "price" reaches 0.036. High prices are thus also a concern of tourists. To understand how tourists attend to the price factor, we examine the reviews containing the word "expensive". We find that although price is mentioned in many reviews, high prices do not lead to strong dissatisfaction among tourists.
Typical reviews read: "The experience is very great, but it is a little expensive", "We will come again, if only the price is lower", and "the scenery is beautiful, the slopes are plenty, and the food is delicious, all is wonderful except the expensive". These comments suggest that low prices are not a driver of tourist satisfaction and that high prices do not by themselves lead to dissatisfaction; tourists are more concerned with a good skiing experience.
In addition, the accommodation, transportation, and atmosphere attributes receive considerable attention from tourists, and they often appear in studies of ski destination images (e.g., [5,60,61]). The elements "parking lot" and "high-speed train" indicate that tourists mostly travel by car or high-speed rail. The weight of "Beijing" is also relatively high, so it can be inferred that tourists travelling from Beijing to Chongli account for a certain proportion. In the atmosphere attribute, landscape is the most heavily weighted element, followed by night skiing.
Finally, we sum up the attribute of emotional experience as the affective image of tourists. From the words "good, great, deserve, happy, recommended", we can see that tourists' feelings towards the destination are mostly positive. We will make a further analysis of the satisfaction distribution of tourists in Section 4.2.

Satisfaction Distribution
Identifying tourists' DI alone cannot reveal the impact of a particular image on tourists' emotional characteristics. Therefore, having identified the multidimensional connotation of the destination image, this study uses sentiment analysis to further explore the satisfaction characteristics of each dimension and the extent to which each attribute affects tourists' image perception. The overall satisfaction and the satisfaction distribution of each perceptual attribute are calculated with the lexicon and custom rules described in Section 3.3. The satisfaction of each attribute is shown in Table 3. This research takes all the reviews as a whole and each review as an analysis unit. Through the sentiment analysis process of Section 3.3, a total of 5758 positive reviews, 2552 neutral reviews, and 452 negative reviews are obtained, with an overall satisfaction rate of 65.71%. The total number of reviews per attribute shows where tourists' attention is focused: ski condition is the most frequently mentioned attribute, followed by beginner suitability and travel cost, while accommodation and reputation are mentioned least. In terms of the proportion of satisfaction, atmosphere has the highest proportion of satisfied tourists, followed by emotional experience and reputation; travel cost and beginner suitability have the lowest. Because there are only 452 negative reviews, we read their text directly and find that they focus on the perception of service, price, and congestion. Examples include: "The price has gone up again this year, and it doesn't feel worth it". "I'm not afraid of steep slopes but afraid of a lot of people. I don't have any fun when there are a lot of people".
"The attitude of the service staff at the information desk is very terrible, and it is necessary to change a batch of service staff during the Winter Olympic Games".
To explore how tourists' satisfaction changes over time, this research divides the reviews into seven snow seasons. Following the winter operating period of the Chongli ski resorts, we treat reviews posted from October to April of the following year as one snow season and calculate, for each snow season, the proportion of reviews containing "Winter Olympic Games" (as shown in Figure 3).
Figure 3 shows the change in satisfaction over the seven snow seasons: the share of positive reviews increased from 64.01% in the 2014/15 snow season to 66.79% in 2020/21, while the share of negative reviews decreased from 7.48% to 3.26% over the same period. Positive reviews thus changed far less than negative reviews. Nevertheless, taking the overall trend into account, the 2020/21 snow season is the best of the seven: the share of positive reviews has risen, suggesting that the ski resorts have improved on previous years, and negative reviews account for only 3.26%.
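The season assignment described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `snow_season` and `keyword_share_by_season`, the use of the posting date, the skipping of May to September reviews, and the toy English reviews are all assumptions made for illustration (the study's reviews are in Chinese, so the keyword list would differ).

```python
from collections import Counter
from datetime import date
from typing import Optional


def snow_season(d: date) -> Optional[str]:
    """Label a date with its snow season, e.g. '2014/15'; None outside October-April."""
    if d.month >= 10:      # October-December open the season
        start = d.year
    elif d.month <= 4:     # January-April belong to the season that began the previous autumn
        start = d.year - 1
    else:                  # May-September: assumed off-season, review is skipped
        return None
    return f"{start}/{(start + 1) % 100:02d}"


def keyword_share_by_season(reviews, keywords):
    """Percentage of reviews per snow season whose text contains any keyword."""
    totals, hits = Counter(), Counter()
    for posted, text in reviews:
        season = snow_season(posted)
        if season is None:
            continue
        totals[season] += 1
        if any(k in text.lower() for k in keywords):
            hits[season] += 1
    return {s: 100 * hits[s] / totals[s] for s in sorted(totals)}


# Invented toy reviews, for illustration only.
toy_reviews = [
    (date(2014, 12, 20), "Worthy of the Winter Olympic Games venue."),
    (date(2015, 2, 3), "Great snow, long queues."),
    (date(2015, 7, 1), "Summer hiking trip."),  # outside the snow season
    (date(2020, 11, 15), "Came to feel the Olympic Games atmosphere."),
]
print(keyword_share_by_season(toy_reviews, ["olympic games"]))
```

The same grouping could be reused to compute the share of positive or negative reviews per season by replacing the keyword test with a sentiment label.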

Tourists' DI from the Perspective of Winter Olympic Games
Mega-events and tourist destinations are closely related, because the hosts of these events are themselves important tourist destinations. Since the 1990s, the rapid growth of mega-events has made them a strategic lever for the management of DI [3,62–65]. Many destinations and countries compete to host mega-sport events, and the impact of such events on DI has attracted much attention [65–67]. Although China did not win the bid for the 2022 Winter Olympic Games until July 2015, reviews referring to the OG already appeared in the 2014/15 snow season (see Figure 4). In the following six snow seasons, reviews mentioning the OG accounted for an average of 6.12% of all reviews each year. To further explore the impact of the OG image on the DI, this study analyzed the 526 reviews that included "winter Olympic games" or "Olympic games". Because this sample is small, we read and examined these reviews manually, one by one. The resulting summary of the impact of the OG on the DI is shown in Figure 4. We identified two motivations for skiing in Chongli. One is to feel the halo of the Olympic Games, as in the comment "this is the main venue of the Winter Olympic Games, to greet the OG, my family specially came here to ski this year". The other is the pursuit of high quality, since the OG represents high quality in tourists' minds, for example, "As the OG can be held, the hardware facilities must be impeccable, here really did not let me down". Within this subset, positive reviews account for 81%. The positive evaluations mostly mention the ski resorts' facilities, as in the comment "it is worthy of being the main venue of the Olympic Games, the slopes and lifts are very professional". This suggests that when tourists are satisfied with the destination, the DI is improved. We also examined the negative reviews in this subset. Even there, almost all tourists state that, despite some problems, they firmly believe or hope that the destination will improve under the influence of the OG, for example: "From a professional point of view, there is still a certain gap, with the influence of 2022 Winter Olympic Games, the construction here will be further strengthened".
"The scenic spot is not bad, the service is poor, and the price is too high. I hope the destination can be improved to add luster to the Winter Olympic Games".
The growing awareness of the Chongli ski resorts owes much to the anticipation of the OG, although we cannot estimate how many tourists chose Chongli because of the OG. The reviews referring to the OG nevertheless indicate that the OG image affects tourists' destination choice to a certain extent and that, when the destination meets tourists' expectations, mega-sport events can indeed improve the image of the destination. Rose and Spiegel [66] and Andersson et al. [67] came to the same conclusion when studying tourists' perceived image before and after visiting the event. The difference is that, before the mega-sport event takes place, tourists from the host country still believe that holding the event will improve the destination even when the destination does not meet their expectations.

Discussion
This study set out to propose a text mining framework suitable for DI research based on UGC, with the ski resorts in the host city of the 2022 Winter Olympic Games selected as a case study. The LDA model and a sentiment analysis method based on custom rules and a lexicon are combined to identify and analyze the destination's cognitive and affective images and to explore the satisfaction distribution of the DI. The results reveal the distinctive image perception attributes of Chinese skiers and show, through the lens of DI, how China's ski tourists differ from international ski tourists. In addition, to explore the impact of mega-sport events on DI, this study analyzes the UGC mentioning the OG and tentatively examines tourists' image perception from the perspective of the OG.

General Discussions
From a theoretical perspective, the study extends our understanding of DI in the context of emerging ski markets. It also extends the body of knowledge developed in previous studies by identifying perception attributes that are distinctive to China. Nine attributes of image perception in China's ski market are identified: reputation, transportation, accommodation, service quality, atmosphere, ski condition, travel cost, emotional experience, and beginner suitability. Two attributes, beginner suitability and ticketing service, are identified for the first time in the DI literature on ski destinations, which is one of the most noteworthy results of this study. A possible explanation for the first is that China's ski market is the largest beginner market globally and beginner skiers account for a high proportion of its skiers. Tourists are therefore highly concerned about whether ski conditions suit their skiing level. Matzler et al. [21] found that beginners have different expectations and choice criteria for ski resorts. It can be inferred that, in an emerging ski market, beginner suitability is one of the important criteria by which beginners choose their destination, and this can be taken as a point of difference between developed and emerging ski markets. As for ticketing service, the spread of intelligent systems at tourist destinations has made online booking a mainstream way for tourists to buy tickets. In addition, ski tourism differs from general tourism in that the preparation before actually skiing, such as changing clothes, putting on ski boots, and renting equipment, takes a relatively long time. Picking up tickets conveniently and quickly therefore matters, which may explain why tourists pay particular attention to ticketing service. However, this attribute has not been mined in previous DI constructs for ski destinations. Since online booking is now commonplace among tourists, it cannot be ruled out that ticketing service is also an attribute of concern to ski tourists in developed ski markets, which deserves attention in future research. This result also shows the power and potential of topic modeling techniques: in addition to supporting existing theories, they can provide additional insights into the image perception of destinations, highlighting the advantages of LDA over a deductive, theory-driven approach [16].
The other seven attributes support and verify the corresponding literature (as shown in Table 2). Among them, atmosphere is the attribute with the highest satisfaction. Besides activity-related elements such as night skiing and hot springs, fellow travelers, such as friends, children, and families, account for a large proportion of this attribute. This indicates that ski tourists pay great attention to interaction with fellow travelers, which is consistent with Vanat's observations on Chinese skiers [19]. He argued that in China skiing is treated as an entertainment activity rather than a leisure sport, and ski resorts are regarded as ski playgrounds rather than holiday destinations. Communication and interaction with 'people' is an essential part of tourists' perception process, and the emotional expressions of fellow visitors, and their interaction with them, affect the tourist experience to a great extent; as Chen et al. argued, visitors can feel happy from the pleasant expressions and states of others [68]. Travel cost is among the attributes that ski tourists mention most, yet it has the lowest satisfaction. Matzler et al. [69] compared the influence of the price factor on first-time and repeat visitors to a specific ski area and found that first-time visitors were more sensitive to price. In the emerging ski market, the proportion of first-time skiers is relatively high, so the strong concern with price in this study supports that conclusion. Ski condition is the attribute of greatest concern within the cognitive image, from which it can be inferred that skiing is still the primary experience of these trips. By contrast, an image perception study of German ski destinations showed that German ski tourists pay more attention to tourism activities than to skiing [70]. One possible explanation is that the tourism products of China's ski destinations are not yet as diversified as those of German ski destinations. Several studies have emphasized that, for tourists with past experience of a destination, the cognitive domain tends to dominate the destination image, whereas the affective domain is more prominent for those with no previous experience [33,71,72]. Given that beginners and first-time visitors account for a high proportion of tourists here, this may explain why emotional experience is the most frequently mentioned attribute in this study. We also discussed the evolution of satisfaction over seven consecutive snow seasons. Positive reviews show an increasing trend and negative reviews a downward trend, and negative reviews change faster than positive reviews, which suggests that the ski resorts in the Chongli area have been improving continuously.
A preliminary study of tourists' perceived image from the perspective of the OG found that when the destination meets tourists' expectations, the image of the destination is enhanced, and when it does not, tourists are still optimistic about the destination's future development. This indicates that holding a mega-sport event has a considerable impact on residents and tourists of the host country. Mega-events can improve the destination image, which is similar to the view of Rose and Spiegel [66] and Andersson et al. [67]. Unlike previous studies, this study finds that, before the mega-event, tourists from the host country still expect that holding the event will improve the destination even when it does not meet their expectations, and they expect the DI to match that of the event so that the destination contributes to the event itself. These findings enrich research on the impact of mega-events on DI. Although this is only a preliminary study, it can serve as a reference for further research on DI influenced by the upcoming Beijing 2022 Winter Olympics.
From a methodological point of view, first, a limitation of topic modeling is its inability to determine sentiment [16]. This study therefore combines image perception recognition based on the LDA model with a sentiment analysis method based on a domain lexicon, and analyzes the sentiment tendency of each perceptual attribute of the identified DI. To our knowledge, no comparable research has used such a text mining framework to analyze the image perception and satisfaction of tourism destinations. On the one hand, it provides a new research paradigm for studying DI based on UGC. On the other hand, it gives managers a novel method for understanding which DI attributes they need to emphasize to attract more tourists, depending on the destination type. Second, current research on lexicon-based sentiment analysis mainly uses public sentiment lexicons [42], which makes it difficult to analyze the sentiment of visitors' tourism experience accurately. This paper constructs a sentiment lexicon specific to ski tourism, which is practical for detecting Chinese visitors' sentiment tendencies.
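To make the combination of attribute identification and lexicon-based sentiment scoring concrete, the following Python sketch shows one simplified way the two steps could be chained. It is an illustration under stated assumptions, not the framework of Section 3.3: the English word lists, the single negation rule, and the keyword sets in `ATTRIBUTE_WORDS` (a stand-in for the topic words an LDA model would produce) are all hypothetical, and the study's actual lexicon and rules are built for Chinese ski-tourism reviews.

```python
from collections import defaultdict

# Toy English lexicon (hypothetical); the study uses a Chinese ski-tourism lexicon.
POSITIVE = {"good", "great", "happy", "professional", "recommended"}
NEGATIVE = {"terrible", "poor", "expensive", "crowded"}
NEGATORS = {"not", "never", "no"}

# Hypothetical keyword sets standing in for LDA topic-word lists.
ATTRIBUTE_WORDS = {
    "ski condition": {"slopes", "snow", "lifts"},
    "travel cost": {"price", "ticket", "expensive"},
    "service quality": {"staff", "service"},
}


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation removed."""
    return [w.strip(".,!?;:\"'").lower() for w in text.split()]


def review_sentiment(text):
    """Sum lexicon polarities; a negator directly before a word flips its sign."""
    tokens = tokenize(text)
    score = 0
    for i, tok in enumerate(tokens):
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    if score > 0:
        return "positive"
    return "negative" if score < 0 else "neutral"


def satisfaction_by_attribute(reviews):
    """Percentage of positive reviews among the reviews that mention each attribute."""
    counts = defaultdict(lambda: defaultdict(int))
    for text in reviews:
        words = set(tokenize(text))
        label = review_sentiment(text)
        for attribute, keys in ATTRIBUTE_WORDS.items():
            if words & keys:
                counts[attribute][label] += 1
    return {a: 100 * c["positive"] / sum(c.values()) for a, c in counts.items()}


# Invented sample reviews, for illustration only.
sample_reviews = [
    "The slopes are great and the lifts are professional.",
    "The price is not good.",
    "The staff is terrible.",
    "Snow was ok.",
]
print(satisfaction_by_attribute(sample_reviews))
```

Keyword matching is used here only to keep the sketch self-contained; in a topic-model pipeline, each review would instead be assigned to attributes according to its inferred topic distribution.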

Practical Implications
Based on the above findings and their theoretical and methodological significance, practitioners interested in attracting more visitors from China's ski market, such as destination managers, event organizers, and marketers, can benefit from this study.
This research identifies two cognitive image attributes, namely beginner suitability and ticketing service, which suggests that destination managers can carry out targeted marketing around these two attributes. Beginners account for a large proportion of the emerging ski market, so it is vital to meet their requirements. Destinations should pay more attention to the behavior and preferences of beginner skiers to enhance their attractiveness. Within the beginner suitability attribute, the weight of the word "instructor" is high, indicating that tourists pay attention to instructors, so destinations can launch packages that include ski lessons. Most beginners travel with other people and pay close attention to the entertainment attributes of the destination. Destination developers are therefore advised to develop more entertainment projects for consumers who do not intend to immerse themselves in winter sports, so that they can enjoy high-quality interaction with one another.
Online purchasing has become a typical way of buying tickets, so the procedures for checking or picking up tickets offline should be made as easy and convenient as possible to give tourists an efficient service experience. Mega-sport events have a considerable impact on tourists from the host country even before they are officially held. DMOs should therefore take advantage of these events, combining tourism products with the image of the mega-event to attract more consumers and using all available media channels for effective marketing. Additionally, ski destination managers can consider incorporating the method proposed in this study into their tourism evaluation systems, as part of a tourism informatization strategy, to monitor changes in perception and priorities by analyzing customers' latest UGC.

Limitations and Future Studies
This research also has some limitations, which will be addressed in our future research. First, some of the review texts are short and loosely worded, so there may be some deviations when the LDA model is used to extract feature words. Future research will therefore develop a scale based on the classified attributes of the TDI to further verify the conclusions of this research. In addition, when applying the LDA model, it is necessary to fully grasp the background of the research objects and understand the actual situation to avoid over-interpreting or misinterpreting feature words. Second, only Chinese data are used to identify the destination image of the emerging ski tourism market, and whether the results apply to other emerging ski markets remains to be verified. The difference in image perception between Chinese and Western ski tourists, examined with the text mining framework proposed in this study, will be the focus of our next research. Third, the preliminary and tentative analysis of the impact of the OG on DI is limited to the reviews that refer to the OG. The summary of the effects of OG image perception therefore does not cover all tourists, which restricts the generalizability of the results to some degree. In the future, the UGC of tourists in the post-Olympic era can become the next object of DI research, and its results can be compared with those obtained in this paper.

Data Availability Statement:
The data presented in this study are available on request from the first author. The data are not publicly available because they also form part of an ongoing study.