A Thousand Words Express a Common Idea? Understanding International Tourists’ Reviews of Mt. Huangshan, China, through a Deep Learning Approach

Tourists' experiential perceptions and specific behaviors are important for facilitating geographers' and planners' understanding of landscape surroundings. In addition, the potentially significant role of online user-generated content (UGC) in tourism landscape research has received only limited attention, especially in the era of artificial intelligence. The motivation of the present study is to understand international tourists' online reviews of Mt. Huangshan in China. The contribution is two-fold. First, by analyzing reviews posted by international tourists with a state-of-the-art natural language processing network (BERT), our results facilitate relevant landscape development and design decisions. Second, the proposed analytic method can serve as an exemplary model to inspire relevant landscape planners and decision-makers to conduct future research. The clustering results reveal several key topics, including international tourists' perceptual image of Mt. Huangshan, tour route planning, and negative experiences of staying.


Introduction
Tourists' experiential perceptions and specific behaviors are important for facilitating geographers' and planners' understanding of landscape surroundings. Many studies have sought to uncover hidden information from communications with tourists. Coeterier [1], for instance, identifies the theoretical scope of landscape perception and evaluation in planning and design to bring planners and tourists closer. Chang [2] proposed managerial strategies by interrogating the interaction between planners and users of the urban landscape in Singapore. Tourists' psychological perceptions (e.g., perceived authenticity) could assist in the marketing of secondary explorations to destinations [3]. Other scholars [4] contend that a perceived authentic experience that sparks tourists' imagination drives consumption in landscape tourism.
With the increasing demand for aesthetic values and spiritual enrichment, mountain regions with an appealing landscape and a high level of near-natural habitats are of increasing importance for providing cultural ecosystem services that fulfill tourists' psychological and psychical experience [5]. More specifically, the tourist experience refers to an embodied experience, which can be achieved through active participation in the tourism environment and through the emotional senses of touch and affect [6][7][8], while a tourist landscape is a landform area that differs from other landscape types. It is recognized and accepted by users as meeting their travel and leisure needs and expectations [9,10]. Accordingly, the assessment of tourist needs (e.g., willingness to undertake an activity) reveals substantial differences in the spatial diversity of intrinsic and service potentials [...] networks, in a manner of computer-assisted automated textual content analysis [16]. Relevant studies on product development have shown that the machine-learning approach identifies more unique and helpful categories than manual review [28,29]. Through machine-learning-based automated textual analysis, researchers are poised to interpret massive raw material into valuable insights.
Deep learning, or more specifically natural language processing (NLP), is an effective method for classifying and processing data. Compared with traditional data processing methods, it can learn and analyze data more thoroughly. By processing the review information on a travel platform, NLP can analyze the information characteristics of all reviews to find their deeper meaning. This allows deep learning to better meet the needs of personalized recommendations for tourist attractions [30].
The purpose of this study is to use a state-of-the-art deep learning algorithm to process and analyze online reviews written by international tourists about a Chinese landscape. Chinese landscapes with unique cultural meanings and forms increasingly attract international academic attention [10]. Among those, Mt. Huangshan (黄山, also known as Yellow Mountain), inscribed as a UNESCO (Paris, France) World Heritage Site in 1990, is one of the major natural landscapes attracting international tourists [31]. It shows the Eastern sentiments of "shan shui" (literally, "mountain water" in Chinese), which had a profound influence on the aesthetics of natural landscapes by imbuing the value of the indivisibility of man and nature [32]. The landscape is well known for its scenery, sunsets, peculiarly shaped granite peaks, pine trees, hot springs, winter snow, and views of the clouds from above. Mt. Huangshan is a frequent subject of traditional Chinese paintings and literature, as well as modern photography, as shown in Figures 1 and 2. In 2019, tourists made 74,022,100 trips to Mt. Huangshan, of which international visitors made 1,637,200 trips [33]. They mainly came from Singapore, Indonesia, Malaysia, France, the United States, South Korea, and Japan [33].
Accordingly, the significance of the current study is two-fold. First, though Mt. Huangshan is a famous natural landscape with a global reputation, few prior studies address the questions of how international tourists perceive, care about, and evaluate the tourism landscape of Mt. Huangshan. Through a machine-learning-based analysis of reviews posted by international tourists, our results facilitate relevant landscape development and design decisions. Second, the proposed analytic method, BERT (bidirectional encoder representations from transformers) clustering, can serve as an exemplary state-of-the-art (SOTA) model to summarize user needs more effectively and to inspire relevant landscape planners and decision-makers to conduct future research.

Traditional Methods to Explore Tourist Needs
Tourism planning teams use various methods, such as focus groups, to gain insights into tourist needs or related experiences, with a focus on tourist behavior [37]. For example, Jin and Wang [38] reviewed 161 academic publications that described, analyzed, and summarized Chinese tourist behaviors. Kim and McKercher [39] indicated that tourist behavior might be influenced by both national culture and tourist culture. Tourist behavior could also be used to explore attributes for conjoint analysis. Nuraeni and colleagues [40] suggested a conjoint analysis framework to understand young people's tourist attributes and the underlying reasons for these preferences.
In addition, other disciplines, such as psychology and sociology, have also employed various methods to identify tourist behavior from direct communication with tourists. In-depth interviews, focus groups, participant observation, and ethnographic research are common methods that are widely used in academic research. Scholars collect qualitative data to form a corpus, review the data, remove redundancy, and manually structure various constructs of tourist behavior [41]. A few scholars further refined the interview method into structured, semi-structured, and unstructured interviews [42].
More specifically, tourist behavior exploration is typically based on around 25 in-depth interviews. Skilled analysts then review the qualitative data and remove irrelevant transcript material to form a corpus of 80-120 statements on consumer behavior. These statements are clustered to indicate various constructs (primary, secondary, and tertiary) of consumer behavior in a hierarchy [43]. This process is then followed by a method that aims at identifying and extracting tourist needs. The latest qualitative methods try to collect tourist needs in a more direct way. For example, Schaffhausen and colleagues [44] relied on a crowdsourcing method to conduct need assessment.

UGC Text in Tourist Research
Tourist research has long focused on different approaches to evaluating disorganized qualitative data in order to solve tourism-related issues. For example, Burgess et al. [45] have suggested using word frequency and occurrence percentage to map the grouping of needs in tourist reviews. The latest research tries to analyze the relationship between tourist opinions on landscapes and hotels and the related sentiment score, sales volume, and rating [45][46][47]. For example, González-Rodríguez et al. [48] analyzed how the sentiment score of online reviews influenced the credibility scores of electronic word of mouth in the context of the city of Barcelona. Indeed, previous research has encouraged the adoption of a standard approach to evaluating tourist experience by identifying qualitative data from a representative sample of tourists [49].
In the field of marketing research, studies on the elicitation of consumer needs also potentially contribute to the aim of this paper, though they tend to focus on consumers' physical needs in different contexts rather than on the more abstract experience of tourism. For instance, Timoshenko and Hauser [28] have proposed a systematic method to extract consumers' needs in the context of oral care products. Specifically, their approach relies on a machine learning technique, word-embedding-based sentence embedding, to automatically cluster the needs related to different oral care products, providing suggestions for further product development. In addition, a supervised machine learning approach could even explore both abstract experience and physical needs at the same time [50]. The current study contributes to both tourism and marketing research by conducting a state-of-the-art, context-related semantic clustering analysis to explore tourist experience.
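The word-frequency idea mentioned above can be illustrated with a minimal sketch. It is not the procedure of Burgess et al. [45]; the function name `word_stats`, the simple regular-expression tokenizer, and the reading of "occurrence percentage" as the share of reviews containing a word are all assumptions made for illustration.

```python
import re
from collections import Counter


def word_stats(reviews):
    """Return {word: (total count, percentage of reviews containing the word)}.

    Assumption: "occurrence percentage" is read here as the share of reviews
    in which a word appears at least once.
    """
    total_counts = Counter()
    review_counts = Counter()
    for review in reviews:
        tokens = re.findall(r"[a-z']+", review.lower())
        total_counts.update(tokens)
        review_counts.update(set(tokens))  # count each word once per review
    n_reviews = len(reviews)
    return {
        word: (total_counts[word], 100.0 * review_counts[word] / n_reviews)
        for word in total_counts
    }


# Hypothetical toy reviews, not taken from the dataset.
toy_reviews = [
    "Crowded but beautiful",
    "Beautiful sunrise, beautiful clouds",
]
stats = word_stats(toy_reviews)
```

In this toy input, "beautiful" occurs three times and appears in both reviews, whereas "crowded" occurs once and appears in half of them.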

Natural Language Processing (NLP)
Two fields of study in natural language processing (NLP) are related to the current work: dense word and sentence embedding, and bidirectional encoder representations from transformers (BERT). In natural language processing, words or sentences are vectorized as real-valued mappings (around 20-400 dimensions) in which semantically similar words or sentences are close to each other in the vector space [51]. This process reflects an assumption in the word2vec embedding: words or sentences that co-occur in the same context share similar linguistic connotations [51]. In addition, after training on a large corpus, the word2vec model can reflect not only the similarity between words or sentences, but also their semantic relationships [52]. Trained on the Google News corpus, for example, word2vec embeddings have been shown to reflect such relationships.
In 2018, Jacob Devlin and colleagues [53] at Google introduced BERT as a state-of-the-art (SOTA) technique for NLP. BERT is a pre-trained transformer NLP network [53], which achieved many state-of-the-art results on different downstream tasks, such as question answering, named entity recognition (NER), and sentence pair classification [54]. Unlike prior natural language models, BERT is pre-trained bidirectionally on a plain corpus to learn an unsupervised representation that takes into account the context in which a given word occurs [53]. For example, whereas a static word vector for "get" would have the same representation for both of its occurrences in the sentences "She gets an apple" and "She gets alone", BERT would give a different embedding in each contextualized sentence.
The main innovations of this model lie in the pre-training method. That is to say, the masked language model (MLM) and next sentence prediction (NSP) objectives are used to capture token-level and sentence-level representations, respectively. Compared with traditional word and sentence embedding, BERT has two significant advantages:

Static to Dynamic: The Word Polysemy Problem
Word2vec starts from the distributional hypothesis of word meaning (the meaning of a word is given by the words that frequently appear in its context), and the end result is a look-up table in which each word is associated with a unique dense vector [51]. This is obviously not a perfect solution since it cannot deal with the problem of polysemy: each word in natural language may have different meanings [55]. Expressing a word's meaning with numerical values therefore requires that it not be a fixed vector. Furthermore, the word representation generated by word2vec is static, regardless of context. Solving the problem of polysemy is inseparable from context. We need not only a single injective mapping of a word into a vector, but also a function (model) that takes context into consideration. Thus, Peters et al. [56] introduced ELMo (embeddings from language models), in which the representation of each word is a function of the entire text sequence. Its embeddings are context-specific, providing distinct representations for words that share the same spelling but differ in meaning, for example, "right" in "right answer" and "on the right". This idea was naturally followed by the subsequent BERT [53]. BERT uses a transformer (encoder) as a feature extractor. This method naturally makes good use of the context and does not require bidirectional stacking [57]. Combined with denoising objectives on large-scale corpora, the generated representations are useful for downstream tasks, such as classification. Therefore, compared with the word embedding method represented by word2vec, BERT offers a noticeable improvement: it is more dynamic and can model the phenomenon of polysemy.
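The contrast between a static look-up table and a context-dependent function can be made concrete with a toy sketch. This is not how word2vec, ELMo, or BERT compute representations: the two-dimensional vectors are hand-made, and the hypothetical `contextual_embed` simply mixes a word's vector with the mean of its neighbours, only to show that a function of the whole sequence can separate the two uses of "right".

```python
# Toy illustration only: hand-made 2-d vectors, not trained embeddings.
STATIC_TABLE = {
    "right": [0.5, 0.5],
    "answer": [1.0, 0.0],
    "on": [0.0, 1.0],
    "the": [0.0, 0.2],
}


def static_embed(tokens, index):
    """Look-up-table embedding: depends only on tokens[index], never on context."""
    return list(STATIC_TABLE[tokens[index]])


def contextual_embed(tokens, index):
    """Hypothetical context-aware embedding.

    Averages the target word's vector with the mean vector of the other
    tokens in the sequence, so the result is a function of the whole input.
    """
    target = STATIC_TABLE[tokens[index]]
    others = [STATIC_TABLE[t] for i, t in enumerate(tokens) if i != index]
    if not others:
        return list(target)
    dim = len(target)
    context_mean = [sum(v[d] for v in others) / len(others) for d in range(dim)]
    return [(target[d] + context_mean[d]) / 2 for d in range(dim)]


phrase_a = ["right", "answer"]
phrase_b = ["on", "the", "right"]

static_a = static_embed(phrase_a, 0)
static_b = static_embed(phrase_b, 2)
context_a = contextual_embed(phrase_a, 0)
context_b = contextual_embed(phrase_b, 2)
```

Here `static_a` and `static_b` are identical, while `context_a` and `context_b` differ because the surrounding tokens differ.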

Simple to Rich: Multi-Layered Features of Words
A good language model should not only model the polysemy of words but also be able to reflect the complex characteristics of words, including syntax, semantics, etc. [53]. Word embedding methods such as word2vec do not have this advantage by themselves because they are too simple [56]. Pre-training methods such as ELMo and BERT can reflect different levels of features on different network layers because they learn in a "deep" network [58]. Generally speaking, the features generated at higher levels reflect more abstract and context-dependent elements, while the features generated at lower levels focus more on the grammatical level [56]. There are many benefits to modeling multilayer features since language representations all need to serve downstream tasks [55]. On the one hand, look-up-styled word2vec embeddings are difficult to adapt to, and perform well on, all downstream tasks; thus, a variety of adapted models have been introduced for different tasks, which are basically generated by adding their own inductive biases for each task [59]. On the other hand, it is more ideal to solve the problem in upstream tasks, though it is necessary to acknowledge the importance of word2vec in many downstream tasks [28]. Accordingly, pre-trained models are designed to include different levels of language features at different network layers, since different tasks rely on different levels of features differently [53]: some tasks might rely on more abstract information, while others focus more on grammatical information. In this way, BERT can selectively use the information at all levels, which is naturally a more ideal solution than the word2vec embedding method.

Materials and Methods
A hybrid method of BERT and deep learning is accordingly introduced to screen UGC on TripAdvisor for sentences containing various context-related tourist experiences. Previous research has shown that machine-learning-based methods can accomplish various kinds of tasks. For instance, a hybrid method, merging machine learning and subjective evaluation, was used to resolve the problem of author disambiguation [60]. Specifically, supervised machine learning was first utilized to explore the possible clusters of related papers; experts then manually identified the associations between authors and papers. Another example comes from the exploration of oral care product needs. Timoshenko and Hauser [28] initially clustered Amazon reviews and then manually extracted product needs from each cluster for further product development.
Figure 3 shows the whole process, which consists of three main steps in this study. This architecture can accomplish the same task as the voice-of-the-consumer method discussed in Section 2. The UGC collection is analogous to the transcripts of interviews or focus groups. The clustering of sentence embeddings is similar to extracting different consumer needs. The approach to retrieving a hierarchical architecture of consumer needs is, to some extent, equivalent to the generation of needs from UGC or interviews.

Preprocess UGC
As one of the largest social travel websites in the world, TripAdvisor contains over 300 million reviewers and around 750 million reviews of landscapes, hotels, restaurants, and tourism-related information [61], making it one of the ideal sources of UGC and tourist information in tourism research [19,61]. Thus, we used a Python web crawler to collect tourism reviews of Mt. Huangshan on 20 Dec 2019, restricted to reviews written in English. In total, 983 reviews were collected along with their review titles, links, origin places, contributions, dates, and votes.
Previous qualitative research has suggested that each sentence in an interview corpus is a natural unit that could potentially reflect consumer opinion or experience [41]. Thus, the crawled reviews from TripAdvisor were split into a set of sentences via an unsupervised sentence splitting toolkit [62]. We then cleaned them by removing HTML, converting all letters to lowercase, replacing numbers with number signs, removing punctuation, accent marks, and other diacritics, removing extra white space, and expanding abbreviations [28].
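The cleaning steps listed above can be sketched with the Python standard library as follows. This is an illustrative approximation, not the pipeline used in the study: the naive regular-expression splitter stands in for the unsupervised toolkit [62] (it would wrongly split after abbreviations such as "Mt."), and the `ABBREVIATIONS` map, the function names, and the order of the steps are assumptions.

```python
import html
import re
import unicodedata

# Hypothetical abbreviation map; the source does not list the expansions used.
ABBREVIATIONS = {"mt": "mount", "hr": "hour", "hrs": "hours"}


def split_sentences(review):
    """Naive stand-in for the sentence splitter: break after ., ! or ?."""
    return [s for s in re.split(r"(?<=[.!?])\s+", review.strip()) if s]


def clean_sentence(sentence):
    text = re.sub(r"<[^>]+>", " ", sentence)       # remove HTML tags
    text = html.unescape(text).lower()             # decode entities, lowercase
    text = unicodedata.normalize("NFKD", text)     # split letters from accents
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"\d+(?:[.,]\d+)*", "#", text)   # numbers -> number sign
    text = re.sub(r"[^\w\s#]", " ", text)          # remove punctuation
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)                        # also collapses white space


def preprocess(review):
    """Split a review into sentences and clean each one, dropping empty results."""
    cleaned = (clean_sentence(s) for s in split_sentences(review))
    return [s for s in cleaned if s]


example = clean_sentence("<p>Mt. Huangshan took 3 hrs, très beau!</p>")
sentences = preprocess("Great views. We loved it!")
```

Replacing numbers before removing punctuation keeps the number sign from being stripped; a different ordering would need the punctuation pattern adjusted accordingly.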

BERT Sentence Embeddings Clustering
Regarding the clustering task, a commonly used method is to map each sentence into a vector space in which semantically similar sentences lie close to each other. Previous research has fed sentences into BERT to retrieve fixed-sized BERT embeddings. For example, Reimers and Gurevych [54] fine-tuned BERT with siamese and triplet structures to develop more semantically meaningful BERT embeddings that can be compared using cosine similarity. Because Siamese BERT-networks have shown their efficacy in computing sentence similarity [63], we also adopted the same structure to fine-tune BERT in this application.
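As a minimal sketch of how fixed-sized sentence embeddings are compared, the snippet below computes cosine similarity in plain Python. The two-dimensional vectors and example sentences are hypothetical placeholders; in the study, the vectors would come from the fine-tuned Siamese BERT network rather than being written by hand.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, embeddings):
    """Return the other sentences sorted from most to least similar to `query`."""
    others = [s for s in embeddings if s != query]
    return sorted(
        others,
        key=lambda s: cosine_similarity(embeddings[query], embeddings[s]),
        reverse=True,
    )


# Hypothetical 2-d "embeddings"; real sentence embeddings have hundreds of dimensions.
toy_embeddings = {
    "the sea of clouds was magical": [0.9, 0.1],
    "clouds between the peaks looked magical": [0.8, 0.2],
    "the cable car queue was long": [0.1, 0.9],
}
ranking = rank_by_similarity("the sea of clouds was magical", toy_embeddings)
```

With these toy vectors, the second cloud-related sentence ranks above the cable car sentence, which is the behavior a semantically meaningful embedding space is expected to show.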
Considering that similar sentences should lie close together in the BERT embedding vector space, the set of sentences was then grouped into clusters via the K-means clustering algorithm, a method of vector quantization commonly used in text clustering studies [64,65] and tourism research [66]. Specifically, K-means clustering divides the n observations $(x_1, x_2, \ldots, x_n)$ into k (≤ n) sets $S = \{S_1, S_2, \ldots, S_k\}$ so as to minimize the within-cluster sum of squares:
$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2$$
where $\mu_i$ is the mean of points in $S_i$.
To identify an optimal number of clusters, the widely used elbow method was applied [67]. This method calculates the sum of squared distances as the number of clusters k increases and chooses the value of k beyond which the sum of squared distances is reduced only marginally. As shown in Figure 4, k = 15 might be an appropriate number of clusters for this dataset [68,69].
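The K-means objective and the elbow curve can be sketched in plain Python as follows. This is an illustrative implementation of Lloyd's algorithm on hypothetical two-dimensional points, not the code used in the study; the function names, the random initialization from data points, and the toy data are assumptions, and in practice a library implementation would normally be run on the BERT sentence embeddings.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    return [sum(coords) / len(points) for coords in zip(*points)]


def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and mean update."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]  # init from data points
    labels = [0] * len(points)
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        for i in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == i]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[i] = mean_point(members)
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return labels, centroids


def within_cluster_sum_of_squares(points, labels, centroids):
    return sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))


# Hypothetical toy data: two well-separated groups of three points each.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Elbow curve: within-cluster sum of squares for each candidate k.
elbow_curve = {}
for k in range(1, 5):
    labels, centroids = kmeans(toy_points, k)
    elbow_curve[k] = within_cluster_sum_of_squares(toy_points, labels, centroids)
```

For this toy data the sum of squares drops sharply from k = 1 to k = 2 and only marginally afterwards, which is the "elbow" the method looks for; the choice of k in the study was read from Figure 4 rather than computed automatically.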

Manually Retrieve Tourist Experience
To gain insight into the abstract opinions on tourist experience, we invited three experienced qualitative tourism researchers to retrieve the relevant intuitions from the clusters. Ward's hierarchical clustering method was implemented since it is suitable for exploring the voice of the customer [70]. Specifically, one sentence from each cluster was randomly sampled and reviewed by two researchers individually. The third researcher then checked their summarized topics for each cluster. The three researchers discussed any cluster on which they held different opinions until they reached a consensus. A detailed evaluation of each cluster is discussed in Section 4.
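The sampling step described above, drawing one sentence at random from each cluster for the reviewers, might look like the following sketch. The function name, the fixed seed, and the toy sentences and labels are hypothetical; the sketch covers only the sampling, not Ward's hierarchical clustering or the manual review itself.

```python
import random
from collections import defaultdict


def sample_one_per_cluster(sentences, labels, seed=0):
    """Randomly pick one sentence from each cluster for manual review."""
    by_cluster = defaultdict(list)
    for sentence, label in zip(sentences, labels):
        by_cluster[label].append(sentence)
    rng = random.Random(seed)  # fixed seed so both reviewers see the same sample
    return {
        label: rng.choice(members)
        for label, members in sorted(by_cluster.items())
    }


# Hypothetical cleaned sentences and cluster labels.
toy_sentences = [
    "we loved our time here",
    "the scenery was magical",
    "take the cable car up",
    "the cable car stops early",
]
toy_labels = [0, 0, 1, 1]
review_sample = sample_one_per_cluster(toy_sentences, toy_labels)
```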

Results
In this section, we first summarize the demographic information of the UGC reviews. As shown in Figure [...]
Next, we present and discuss the results of the clustering, namely the categorization of the clusters. Table 1 shows examples of reviews grouped into clusters.

| Cluster | Sub-Cluster | Examples of Reviews |
| --- | --- | --- |
| Landscape image | Chinese art | "It looks like all the Chinese paintings in front of you." |
| Landscape image | Aesthetics of nature | "Nature leads us on beautiful mountain paths in Huangshan to view breathtaking scenery explaining the stories behind wonderful rock." |
| Landscape image | Visual uniqueness | "The landscape is dramatic and clouds descending between the peaks can be magical." |
| Landscape image | Pleasure | "We loved our time here." |
| Tour route designing | Hotel | " . . . also advise that it is better to stay in hotels in the mountain so that you need not rush as cable car stops operation at pm." |
| Tour route designing | Hiking | "It turned out to be quite simple as there are many clear signages along the route." |
| Tour route designing | Scenery | "When we visited we got into rain but with the low hanging clouds and the peaks, reaching out the scenery was so beautiful I perfect with blue skies and mid weather" |
| Tour route designing | Path plan | "My opinion is to take the cable car to save 2 h hiking up as the main attractions are another 3 h hikes up." |
| Tour route designing | Transportation | "Bus took about 10 min to hongchun and another 5 min walk to the village itself." |
| Tour route designing | Tour guide | "I booked my hotels but first time we engaged a guide but we found nature adviser whom I found through good reviews at tours by locals very knowledgeable and we really enjoyed the history and culture." |
| Tour route designing | Cable car | "Take the cable car to get up there will be enough steps you have to climb." |
| Tour route designing | General suggestions | "Mt Huangshan is spread over quite a large area with many peaks and viewpoints so it's worthwhile to do some planning as to which routes to walk as it's not possible to cover everything within a day or even two." |
| Negative experience | Overcrowding | "I didn't miss the crowds again but once you leave the tram area the crowds will." |
| Negative experience | Physical fatigue | "You have to take tons of steps though so you will be very thirsty." |
| Negative experience | Unpredictable weather | "Unfortunately, the weather was not kind to us and it was very cloudy when we went." |

Landscape Image of Mt. Huangshan
Some of the clustered categories focus on descriptions of the image of Mt. Huangshan. They capture international tourists' comprehensive understanding of the scenery and atmosphere.
Chinese art: The collected tourist reviews show a shared impression of a visual relationship between Mt. Huangshan and classical Chinese paintings. International tourists view Mt. Huangshan as a physical representation of Chinese paintings they have seen before. Many previous descriptions of Mt. Huangshan center on its significance in Chinese arts and literature [71]. It has been a frequent subject of poetry and artwork, typically Chinese ink painting. Most international tourists perceive the artistic visions of Mt. Huangshan and visually connect its scenes with Chinese art.
Aesthetics of nature: Many reviews provide insights into the aesthetic relationship between "nature and man". The natural environment of Mt. Huangshan enhances international tourists' aesthetic experience, embodying the indivisibility of nature and man. Chinese aesthetics is anthropomorphic (attributing human characteristics to non-human features, animals, plants, etc.). In short, all modalities of being are organically connected [31,72]. In their written reviews, international tourists described a natural force driving them to immerse themselves in the natural scenery of Mt. Huangshan. Under Confucian and Daoist values, all human beings were exhorted to pursue ultimate wisdom in nature, such as in mountains. Though previous scholars contend that international visitors hold a different view, namely that nature is ideally free from human intervention [32,73], the natural environment of Mt. Huangshan helps international tourists atmospherically construct an Eastern aesthetics of nature.
Visual uniqueness: Corresponding to the above two clusters, international tourists expressed a picturesque vision when encountering the scenery of Mt. Huangshan. They described the sea of clouds (云海), huge mountains and rocks, etc., which were regarded as "out of this world" and "magical". According to Chinese literature from the sixteenth century, early travelers to Mt. Huangshan were awestruck at the sight and lost all consciousness of the existence of the human world [74]. Because of this sacred atmosphere, Mt. Huangshan has been considered a place that shaped Buddhist and Taoist beliefs.
Pleasure: This cluster of reviews centers on the affective experience of staying in Mt. Huangshan. International tourists described the enjoyment of encountering good services, such as clear direction signs and friendly local guides. "Fantastic", "loved", and "excellent" were mentioned with high frequency. Scholars propose that affective factors are fundamental to building the overall image of a destination [75]. However, few prior studies reveal the specific affective response to Mt. Huangshan, especially from the perspective of international tourists.

Tour Route Designing
Many clusters extracted from international tourist reviews present the design of the tour route.
Hotel: International tourists frequently mention the attributes of hotels. For instance, some reviews describe the room types they stayed in and mention the specific names of hotels. The main recommendation was to choose a premium location to see some famous scenes, such as staying in a hotel located on the top of the mountain to see the sunrise.
Hiking: Hiking is regarded as a slow-paced, simple form of mobility characterized by intermittent tangible relationships with the surroundings, such as other people, places, and events [76]. As the main tourist activity in a mountain landscape, hiking has recently attracted growing scholarly attention [77]. It has been the most relevant leisure activity for the international population in countries like the UK and Germany, given tourists' increasing pursuit of a higher quality of life with nature [78]. Accordingly, when clustering the reviews on TripAdvisor, a clear category emerges that describes the experience of hiking at Mt. Huangshan. For instance, some reviews describe how long it takes to climb a specific peak, and others describe the surrounding environment along the hike.
Scenery: One category of reviews is centered on strategies for viewing up-to-date points of interest when visiting Mt. Huangshan. International tourists offered specific suggestions on how and when to capture the best view.
Path plan: This cluster of reviews is centered on suggestions for planning a specific path when visiting Mt. Huangshan. Because of the language issue, which is frequently mentioned in their reviews, many international tourists encountered difficulties in planning a reasonable and time-efficient route.
Transportation: Among those categories, transportation issues play an important role in the collected reviews. International tourists devoted many words to planning and integrating different transportation options to move from one point to the next.
Tour guide: As a prevalent option for international tourists at Mt. Huangshan, tour guides were discussed with high frequency. The comments range from the personalities to the capabilities of the tour guides. Some previous studies have also investigated the role that tour guides play in outbound tourism. For instance, international visitors tend to learn the culture of the region or country through communicating with their tour guides [79]. In the case of Mt. Huangshan, international tourists' reviews show a preference for tour guide services, and they like to share their experiences with these services.
Cable car: Another frequently discussed topic is suggestions for taking cable cars. Cable cars have been one of the main ways to facilitate mountain tourism [80]. From their written reviews, it can be inferred that many international tourists chose to take cable cars to save energy and enjoy the scenery of Mt. Huangshan.
General suggestions: Some reviews are related to general information about planning a tour route before visiting Mt. Huangshan.

Negative Experience
Some categories of clustered reviews are related to negative experiences of staying at Mt. Huangshan.
Overcrowding: Many negative reviews expressed disappointment caused by a high volume of tourists visiting at the same time, which made Mt. Huangshan unmanageable.
Physical fatigue: Another main negative topic is the exhaustion caused by the unexpected amount of hiking and step climbing at Mt. Huangshan.
Unpredictable weather: The last topic is about the negative experience of encountering changing weather. Prior scholars have also argued that weather exerts a level of influence on individuals' attitudes toward a destination, especially in mountain tourism [81].

Discussion
Many previous descriptions of Mt. Huangshan are centered on its significance in Chinese arts and literature [71]. It has been a frequent subject of poetry and artwork, typically Chinese ink painting, as shown in Figure 1. Our data reinforce the artistic attributes in international tourists' landscape image of Mt. Huangshan.
Our findings revealed the aesthetics of Mt. Huangshan from international tourists' point of view. Chinese aesthetics is anthropomorphic (attributing human characteristics to non-human features, animals, plants, etc.). In short, all modalities of being are organically connected [31,72]. In their written reviews, the international tourists described a natural force driving them to immerse themselves in the natural scenery of Mt. Huangshan. Under Confucian and Daoist values, all human beings were exhorted to pursue a pearl of ultimate wisdom in nature, such as in mountains. Although previous scholars construe that foreigners hold a different view, namely that nature is ideally free from human intervention [32,73], the natural environment of Mt. Huangshan helps international tourists atmospherically construct an Eastern aesthetics of nature. For example, previous scholars studied Mt. Huangshan as a predominantly natural attraction for international visitors (879 respondents came from 41 different countries) [82]. That work mainly focuses on the impact of World Heritage List status on the marketing promotion of Mt. Huangshan [82].
Mt. Huangshan was described as visually unique scenery in international tourists' eyes. According to Chinese literature from the sixteenth century, early travelers to Mt. Huangshan were awestruck at the sight and lost all consciousness of the existence of the human world [74]. Because of such a sacred atmosphere, Mt. Huangshan has been considered a place that shapes Buddhist and Taoist beliefs. Consistent with this view, we found that international tourists perceived Mt. Huangshan as unique imagery that differs from other international landscapes [74].
Scholars propose that affective factors are fundamental to building the overall image of a destination [75]. However, few prior studies reveal the specific affective response to Mt. Huangshan, especially from the perspective of international tourists. Our findings revealed the affective factor of pleasantness in international tourists' eyes.
Our findings also point to international tourists' specific concerns about tour route design when visiting Mt. Huangshan, such as transportation. In addition, the negative factors identified in our findings enrich the current literature on international tourists' experience of Mt. Huangshan [31].

Conclusions
The motivation of this study is to understand international tourists' online reviews of Mt. Huangshan. Although Mt. Huangshan enjoys high visibility and reputation abroad, few prior studies have investigated how international tourists perceive and evaluate the experience of visiting it, or what concerns them. Through a state-of-the-art machine-learning-based analysis of reviews posted by international tourists, several key topics are revealed, including international tourists' perceptual image of Mt. Huangshan, tour route planning, and negative experience of staying.
Our clustering results generate helpful insights for relevant landscape development and design decisions. First, the landscape image of Mt. Huangshan revealed in our study suggests that relevant marketers and managers should promote Mt. Huangshan as typical scenery of Chinese painting arts and Eastern philosophy, which supports destination marketing. Second, with regard to tour route design, international tourists showed concern about hotels, transportation, cable cars, etc. Thus, relevant services can contribute to international tourists' tour planning. For instance, because international tourists showed a need for tour guiding, more pre-visit suggestions should be considered in the future. Last, negative experiences should gain more attention when serving international tourists. Relevant departments should develop strategies for controlling the passenger flow to Mt. Huangshan. Our data suggest that international tourists feel exhausted by the unexpected amount of hiking and step climbing at Mt. Huangshan. Destination planners can develop rest facilities along the way. In addition, weather issues should be considered when designing such facilities, for example by providing convenient shelter from the rain.
Also, the proposed analytic method can serve as an exemplary state-of-the-art (SOTA) model to summarize user needs more effectively and to inspire relevant landscape planners and decision-makers to conduct future research.
This study has some limitations. First, this study only uses data from one online platform. Future studies can integrate more online platforms and conduct in-depth interviews for a more comprehensive understanding of international tourists and to further validate the findings of this study. Second, future studies can consider comparing the differences and similarities between international and domestic tourists' reviews of Mt. Huangshan, which could potentially reveal the role of cultural differences in the tourist experience.
Author Contributions: Conceptualization, Z.Q., C.C., and Y.S.; methodology, C.C. and Y.S.; software, Y.S.; validation, C.C.; formal analysis, Z.Q. All authors have read and agreed to the published version of the manuscript.