Geospatial Semantics Analysis of the Qinghai–Tibetan Plateau Based on Microblog Short Texts

Abstract: Place descriptions record qualitative information related to places and their spatial relationships; thus, the geospatial semantics of a place can be extracted from place descriptions. In this study, geotagged microblog short texts recorded in 2017 from the Tibetan Autonomous Region and Qinghai Province were used to extract the place semantics of the Qinghai–Tibetan Plateau (QTP). ERNIE, a language representation model enhanced by knowledge, was employed to extract thematic topics from the microblog short texts, which were then geolocated and used to analyze the place semantics of the QTP. Considering the large number of microblogs published by tourists in both Qinghai and Tibet, we separated the texts into four datasets according to the user, i.e., local users in Tibet, tourists in Tibet, local users in Qinghai, and tourists in Qinghai, to explore the place semantics of the QTP from different perspectives. The results revealed clear spatial variability in the thematic topics. Tibet is characterized by travel- and scenery-related language, whereas Qinghai is characterized by emotion-, work-, and beauty salon-related language. The human cognition of place semantics differs between local residents and tourists, with a greater difference between the two in Tibet than in Qinghai. Weibo texts also indicate that local residents and tourists are concerned with different aspects of the same thematic topics. The cities on the QTP can be classified into three groups according to their geospatial semantic components, i.e., tourism-focused, life-focused, and religion-focused cities.


Introduction
Semantics refers to the meaning of expressions in a language, and includes realistic semantics and cognitive semantics. In cognitive semantics, the meanings of language expressions are related to human cognitive ability [1]. When referring to space, semantics deals with the meaning of spatial language [2], which is an interdisciplinary research area combining Geographic Information Science (GIScience), cognitive science, artificial intelligence (AI), and the Semantic Web [3,4]. Spatial semantics in the linguistics domain typically involves how languages structure space and schematize spatial relations from perceptual representations and world knowledge, which is the result of spatial cognition [5,6]. In the field of geography, geospatial semantics analyzes the meaning of digital referents at the geographic scale and involves the concepts of geographical entities and ontology; its purpose is to deal with the semantic interoperability of geo-referenced information [3,4].
Place semantics is endowed with the natural attributes of geographical entities and the human activities surrounding them, and can be captured through human descriptions and

Geospatial Semantics
Kuhn defined geospatial semantics as "understanding GIS contents and capturing this understanding in formal theories" [4]. This definition expresses the fact that geospatial semantics involves the human cognition and formal modeling of geographic concepts. There is a large body of literature on geospatial cognition, ranging from behavior geography and mental representation to language descriptions of geospaces [28][29][30][31][32]. Research on the formal modeling of geographic concepts includes geographical ontology, digital gazetteers, geographical information retrieval and linked data [33][34][35][36][37][38][39].
Recent work on geospatial semantics has focused on eliciting semantic information from semi-structured and unstructured resources [17,34,39]. The semantics of a place or a geospatial entity originate not only from its natural attributes, but also from human activities in the place. Cai et al. [40] used geospatial semantics to represent the meaning of a place, which is related to the functions provided by the place as well as human activities within the place. A place may provide many functions where people can engage in various activities; thus, it can have multiple meanings, which can be inferred from human mobility and activities [41]. Increasing amounts of crowdsourced big data, such as mobile data, smart car data, social media data, and points of interest (POIs), reveal patterns of human mobility and activities. These data can therefore be used to extract the multiple activity-related semantics of a place. For example, Gao et al. [42] identified urban functional regions using POIs and user check-in data on social media, Wang et al. [43] detected the geospatial semantics of urban regions based on POI categories, and Tu et al. [44] and Cai et al. [40] interpreted dynamic urban functions and spatial semantics through human activities via mobile phone and positioning data.
Place descriptions depict the qualitative characteristics of geographic locations from multiple perspectives, and can be a rich source of geospatial semantics. For example, Hu et al. [26] extracted the place semantics of cities from news articles and Huang [27] categorized geographic features using text documents. Geographically referenced social media texts provide a vast and valuable source of place descriptions; thus, they have previously been used to extract the semantic information of places. Steiger et al. [45] analyzed the spatiotemporal and semantic characteristics of georeferenced Tweets and found that the extracted spatiotemporal and semantic clusters of Tweets indicated the human activity patterns and urban structure. Moreover, Chen et al. [46] extracted and analyzed the hidden semantics of regions from georeferenced social media data using the Latent Semantic Analysis method. Furthermore, Lansley and Longley extracted topics of geo-tagged Tweets posted in London, UK, and found clear spatial and temporal variations in topics and attitudes [47]. Georeferenced social media data are increasingly used to study spatial regions and human activities, as well as geospatial semantics, due to their characteristics of large volume, easy acquisition and timeliness [41,46,47].

Natural Language Processing Model
Latent Dirichlet allocation (LDA) is the most widely used method for extracting topics from corpora [42,45,48]. LDA is a document generative model, which assumes that documents exhibit a joint probability distribution for thematic topics and words [49]. LDA is appropriate for longer texts but challenging with short texts such as Weibos and Tweets due to the sparse data and less focused topics. To overcome this problem, short texts may be grouped into long corpora according to individual users or locations [46,50]. However, grouping is not always suitable as it may be important to determine the thematic topic of each piece of short text. Some scholars have solved the problem of data sparseness by gaining external knowledge or combining other models, such as word2vec [51]. Although these methods somewhat improve the accuracy of the results, static word embedding does not consider either polysemy or context.
Recent attempts to solve this problem have included neural networks and a series of natural language processing models, such as ELMo (Embeddings from Language Models), GPT (Generative Pre-Training) and BERT (Bidirectional Encoder Representations from Transformers) [52][53][54]. These models produce context-aware representations, and the transformer-based models among them employ attention mechanisms, which has greatly improved the results of natural language processing. For example, BERT uses a "masked language model" which masks a certain percentage of words in the sentences and learns to predict those masked words [52]. On the basis of these models, the knowledge-enhanced language representation model ERNIE (Enhanced Representation through kNowledge IntEgration) was proposed [55]. The ERNIE model is a deep learning method for constructing language representations. The model architecture uses a bidirectional multilayer transformer as the basic encoder, whose self-attention mechanism captures the contextual information of each word. Compared to BERT, which randomly masks some words from the input sentences, ERNIE adopts knowledge-masking strategies at the phrase and entity levels, and learns the prior knowledge of phrases and entities during the training stage. Thus, knowledge and long semantic information can be learned, such as the relationship between entities, the property of an entity, and the type of an event. In this way, ERNIE can learn the semantic relationship between entities and concepts, thereby greatly enhancing the ability of general semantic representation. ERNIE was produced by the Chinese company Baidu, and has exhibited better performance than previous state-of-the-art models in Chinese language processing tasks [55]. As such, ERNIE has been employed to complete various Chinese tasks such as language inference, semantic similarity calculation, named entity recognition, emotion analysis, and question answering.
Considering its ability to process Chinese language, ERNIE was adopted in this study to process Weibo social media texts.

Study Area and Data
The QTP is a unique part of the world that has an average elevation of over 4500 m. It is the highest plateau in the world and is known as both "the roof of the world" and "the third pole". The uplift of the terrain and surrounding ranges has formed a geographically isolated region whose unique climate and physical environment boast not only splendid natural landscapes, but also distinct national cultures. Therefore, studying the semantic characteristics of the QTP can help us better understand the culture and sentiments of indigenous people. Tibet and Qinghai are two provincial administrative units of China that are completely contained within the QTP and occupy the majority of the plateau; therefore, these two provincial administrative units were chosen as the study area (Figure 1). Furthermore, tourists attracted by the beautiful scenery and exotic customs generate numerous geotagged Weibos, from which their feelings and perceptions of the QTP can be extracted and compared with those of indigenous peoples.
In this study, we obtained 1,279,455 geotagged Weibo posts, of which 419,157 were posted in Tibet and 860,298 were posted in Qinghai. After data cleaning, including tokenization and the removal of emojis, duplicate posts, and stop words, 333,420 and 710,475 posts remained from Tibet and Qinghai, respectively (Table 1). Figure 1 displays the spatial distribution of these Weibo posts.
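The cleaning steps named above (emoji removal, de-duplication, and stop-word filtering) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `clean_posts`, the emoji code-point ranges, and the assumption that tokens are already separated by spaces are all hypothetical choices made for illustration. Real Weibo text would first need a Chinese word segmenter.

```python
import re

# Rough emoji ranges (an assumption, not an exhaustive list):
# pictographs and emoticons, plus miscellaneous symbols and dingbats.
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def clean_posts(posts, stop_words):
    """Remove emojis, filter stop words, and drop duplicate posts.

    `posts` is a list of strings whose tokens are already separated by
    spaces; this stands in for a proper tokenization step.
    Returns a list of token lists, one per retained post.
    """
    seen = set()
    cleaned = []
    for text in posts:
        text = EMOJI_PATTERN.sub("", text)
        tokens = [tok for tok in text.split() if tok not in stop_words]
        if not tokens:
            continue  # nothing left after cleaning
        key = " ".join(tokens)
        if key in seen:
            continue  # duplicate of an earlier post
        seen.add(key)
        cleaned.append(tokens)
    return cleaned


# Toy, made-up posts for demonstration only.
posts = ["lhasa sky is blue \U0001F600", "lhasa sky is blue", "work is busy"]
print(clean_posts(posts, stop_words={"is"}))
```

In this sketch, duplicates are detected after emoji and stop-word removal, so two posts that differ only by an emoji count as the same post. Whether the original cleaning treated such cases as duplicates is not stated in the text.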
As Qinghai and Tibet differ in both their physical and social environments, we separated the Weibo texts into four datasets according to the type and origin of the user: local users in Tibet, tourists in Tibet, local users in Qinghai and tourists in Qinghai. This classification allowed us to compare the semantic differences between the two provinces and the cognitive differences between indigenous people and tourists.

Research Framework and Methodology
Each geotagged Weibo post has a thematic topic that refers to some semantic aspect of a place. The thematic topics of multiple Weibos in a specific region therefore represent the semantic structure of the region and reflect people's knowledge and perception of the region. First, we extracted the topic of each Weibo, which was considered to represent a semantic description of the place where the Weibo was located. Then, we analyzed the spatial distributions of different semantic descriptions. Finally, the geospatial semantics were compared at the city level, and cities were grouped according to their semantic similarity (Figure 2).

Weibo Thematic Topic Extraction
The natural language processing model ERNIE 1.0, proposed by Baidu, was used for thematic topic extraction. The experiment was based on Python 3.7, the Baidu AI Studio platform, and ArcGIS 10.2. The model takes Weibo embeddings as input and outputs the probability distribution over topics.
ERNIE is a pre-trained deep learning natural language model that can fulfill many natural language tasks including topic prediction and text classification. However, to improve the accuracy of topic prediction for our datasets, we employed a small number of annotated Weibos to fine-tune the ERNIE model. Three steps are required to extract the Weibo thematic topics (Figure 2). First, we chose and annotated a training set to fine-tune the model by selecting 1200 Weibos from Tibet and labeling their topics manually. According to the hot topic tags of Sina Weibo, 39 thematic topics were finally determined and used to label the 1200 Weibos. Then, the training set was used to fine-tune the model parameters, and cross entropy was used as the loss function to evaluate the result of the model:
$$H(p, q) = -\sum_{x} p(x) \log q(x)$$
where $p$ is the probability distribution of the expected output of topics, and $q$ is the probability distribution of the actual output of topics. The smaller the cross entropy, the closer the two probability distributions. Due to the limited amount of labeled data, 10-fold cross-validation was adopted to train and test the model. During fine-tuning, the weight decay was set to 0.1, the learning rate was set to $5 \times 10^{-5}$, and the batch size was 64. The loss function converged after approximately 50 epochs, and the overall accuracy of the model reached a maximum of approximately 78%. Finally, the fine-tuned model was used to classify the Weibo texts and identify the thematic topic of each Weibo.
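To make the role of the cross-entropy loss concrete, the following minimal Python sketch compares an expected (one-hot) topic distribution with two predicted distributions. It uses the standard definition, the negative sum over topics of the expected probability times the logarithm of the predicted probability. The three-topic setup, the logits, and the helper names `softmax` and `cross_entropy` are illustrative assumptions, not part of the original experiment.

```python
import math


def softmax(logits):
    """Convert raw model scores into a probability distribution."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(expected, predicted, eps=1e-12):
    """Cross entropy between an expected and a predicted distribution.

    `eps` guards against log(0) and is an implementation convenience.
    """
    return -sum(p * math.log(q + eps) for p, q in zip(expected, predicted))


# Hypothetical example with three topics; the true topic is the second one.
expected = [0.0, 1.0, 0.0]
good_prediction = softmax([0.1, 3.0, 0.2])  # most mass on the true topic
poor_prediction = softmax([2.0, 0.1, 0.3])  # most mass on a wrong topic

print(cross_entropy(expected, good_prediction))
print(cross_entropy(expected, poor_prediction))
```

The first value is smaller than the second, which matches the statement that a smaller cross entropy means the predicted distribution is closer to the expected one.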

Spatial Distribution of Place Semantics
Due to the large area of the QTP and the uneven distribution of Weibos, the distribution of different thematic topics was concentrated in specific areas. It is therefore difficult to determine the importance of semantic descriptions in a certain region, or to compare the spatial distributions of different semantic descriptions, from the raw number of Weibos containing certain topics. However, the proportion of Weibos containing a certain topic within a given spatial extent indicates the strength of that topic within that space. If there are $n$ topics and $m$ regions, the proportion of Weibos containing topic $t$ in region $r$ can be represented as follows:
$$p_t^r = \frac{N_t}{\sum_{i=1}^{n} N_i}$$
where $N_t$ is the number of Weibos with topic $t$ in region $r$. If topic $t$ is more significant in region $r$ than in any other region, then $p_t^r > p_t^j$ ($1 \le j \le m$ and $j \ne r$). Thus, we employed the distribution of a topic's proportion to analyze the distribution of semantic descriptions.
Specifically, we divided the space into 20 × 20 km grids and then calculated the proportion of each topic in each grid. Then, spatial interpolation was conducted according to the topic proportion to identify and compare the continuous spatial distribution of different thematic topics.
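The gridding step can be sketched in Python as follows. This is an illustrative sketch rather than the authors' ArcGIS workflow: it assumes each post is given as projected coordinates in metres plus a topic label, and the function name `topic_proportions_by_cell` is hypothetical. The spatial interpolation step is not covered here, since the text does not say which interpolation method was used.

```python
from collections import Counter, defaultdict

CELL_SIZE_M = 20_000  # 20 km grid cells, as described in the text


def topic_proportions_by_cell(posts, cell_size=CELL_SIZE_M):
    """Compute the share of each topic within every grid cell.

    `posts` is an iterable of (x, y, topic) tuples, where x and y are
    projected coordinates in metres (an assumption of this sketch).
    Returns {cell_index: {topic: proportion}}.
    """
    counts = defaultdict(Counter)
    for x, y, topic in posts:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell][topic] += 1

    proportions = {}
    for cell, topic_counts in counts.items():
        total = sum(topic_counts.values())
        proportions[cell] = {t: n / total for t, n in topic_counts.items()}
    return proportions


# Toy, made-up posts: three fall in cell (0, 0) and one in cell (1, 0).
posts = [
    (1_000, 2_000, "travel"),
    (15_000, 5_000, "travel"),
    (19_000, 19_000, "food"),
    (25_000, 1_000, "work"),
]
for cell, shares in sorted(topic_proportions_by_cell(posts).items()):
    print(cell, shares)
```

Using proportions rather than counts means a sparsely populated cell with a single post still receives a proportion of 1.0 for that post's topic, so a real analysis might want a minimum post count per cell before interpolating.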

Geospatial Semantic Differences and Clustering
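According to the proportion of each thematic topic in region $r$, the geospatial semantics of the region can be represented by a vector composed of the proportion of each topic:
$$\mathbf{S}_r = \left(p_1^r, p_2^r, \ldots, p_n^r\right)$$
Then, the semantic similarity between two regions, $r_1$ and $r_2$, can be revealed by the angle between the two vectors, calculated with cosine similarity:
$$\mathrm{sim}(r_1, r_2) = \cos\theta = \frac{\mathbf{S}_{r_1} \cdot \mathbf{S}_{r_2}}{\lVert \mathbf{S}_{r_1} \rVert \, \lVert \mathbf{S}_{r_2} \rVert}$$
where $\lVert \cdot \rVert$ is the modulus of the vector.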
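As a worked illustration of the cosine similarity between two regional topic-proportion vectors, consider the short Python sketch below. The three-topic vectors are made-up values chosen only to show the computation, and the function name `cosine_similarity` is an assumption of this sketch.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical topic proportions (travel, emotion, religion) for three regions.
region_a = [0.60, 0.30, 0.10]
region_b = [0.55, 0.35, 0.10]
region_c = [0.10, 0.20, 0.70]

print(cosine_similarity(region_a, region_b))  # close to 1: similar semantics
print(cosine_similarity(region_a, region_c))  # much lower: different semantics
```

Because topic proportions are non-negative, the similarity between any two regions falls between 0 and 1, with 1 indicating identical topic composition.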
Taking prefecture-level cities as the unit of analysis, we represented the geospatial semantics of a city as a vector composed of the proportion of each thematic topic. Chi-square statistics were used to test for significant differences between the geospatial semantics of different cities (Figure 3). Cosine similarity was used to estimate the geospatial semantic similarity between cities; those with similar semantic structures were grouped using the hierarchical clustering method based on the cosine similarity. Hierarchical clustering is an unsupervised classification method that groups similar cities into clusters according to the similarity between cities and generates a tree indicating the hierarchy of the clusters. The semantic patterns in the QTP were then displayed.
ISPRS Int. J. Geo-Inf. 2021, 10, 682
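The grouping of cities by semantic similarity can be sketched with a small agglomerative clustering routine in plain Python. The text does not state which linkage criterion was used, so average linkage on cosine distance (one minus cosine similarity) is an assumption here, and the city names and vectors are hypothetical.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """One minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def agglomerative_clusters(vectors, n_clusters):
    """Average-linkage agglomerative clustering on cosine distance.

    `vectors` maps a city name to its topic-proportion vector.
    Returns a sorted list of clusters, each a sorted list of names.
    """
    clusters = [[name] for name in vectors]

    def linkage(a, b):
        # Average pairwise distance between the members of two clusters.
        dists = [cosine_distance(vectors[i], vectors[j]) for i in a for j in b]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        # Merge the closest pair of clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return sorted(sorted(c) for c in clusters)


# Made-up topic proportions (travel, emotion, religion) for four cities.
cities = {
    "city_a": [0.60, 0.30, 0.10],
    "city_b": [0.55, 0.35, 0.10],
    "city_c": [0.10, 0.20, 0.70],
    "city_d": [0.15, 0.15, 0.70],
}
print(agglomerative_clusters(cities, n_clusters=2))
```

Stopping at a fixed number of clusters is a simplification. A full hierarchical analysis would record every merge and its distance so that the resulting tree can be cut at different levels.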

Weibo Thematic Topics in the QTP
The topic of each Weibo in the four datasets was extracted using ERNIE. The results show that Weibos related to life and emotional expression represent the majority of all four datasets. According to the statistics of Weibo topics in all four datasets, the 20 topics mentioned in the largest number of Weibos were selected for further analysis. Figure 3 shows the distribution and standardized residuals of the top 20 topics in the four datasets. The Chi-square statistics imply significant differences in the distribution of topics between the four datasets. From the perspective of local users, thematic topics related to travel, scenery, food, religion, and photography are more common in Tibet than in Qinghai, whereas topics related to beauty salons, work, and emotional expression are significantly more common in Qinghai than in Tibet. From the perspective of tourists, thematic topics related to life and emotional expression are more common in Qinghai, whereas travel, scenery, religion and food topics are more common in Tibet, indicating that tourism resources are more attractive to tourists in Tibet than in Qinghai. Spatially, there is a significant difference between the topics of concern for tourists and residents in Tibet. In Tibet, the Weibos of tourists are more related to travel, beauty salons, religion and toponyms but less related to life, emotional expression, food, festivals, and photography than the Weibos of residents. In Qinghai, the Weibos of tourists are more related to life, emotional expression, music, film and TV but less related to beauty salons and work than the Weibos of residents. However, there is little difference in the proportion of travel, scenery, and religion topics between tourists and residents, which again indicates less interest in tourism resources in Qinghai.
Thus, travel, scenery, religion and food are more important in Tibet than in Qinghai, whereas emotional expression, work and beauty salons are less important topics, which reflects the fact that Tibet has stronger tourism characteristics than Qinghai. The relatively low modernization level in Tibet, in addition to the influence of cultural, historical and geographical factors, leads to greater lifestyle differences between Tibet and other provinces, and therefore a greater perception of tourism among Weibo users in Tibet. The geographical location, culture, and lifestyle of Qinghai are very close to those of other inland provinces in China, which may explain the greater perception of lifestyle-related topics among people and the lower emphasis on travel-related topics in Qinghai.
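The chi-square comparison of topic distributions across datasets can be illustrated with a short Python sketch. The text does not specify which form of standardized residual was used, so Pearson residuals, (observed - expected) / sqrt(expected), are assumed here. The 2 × 2 table of counts is made up for demonstration and does not come from the study.

```python
import math


def chi_square_residuals(table):
    """Chi-square statistic and Pearson residuals for a contingency table.

    `table` is a list of rows of observed counts, for example
    rows = datasets and columns = topics. All marginal totals are
    assumed to be positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    residuals = []
    for i, row in enumerate(table):
        res_row = []
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            res = (observed - expected) / math.sqrt(expected)
            chi2 += res ** 2
            res_row.append(res)
        residuals.append(res_row)
    return chi2, residuals


# Hypothetical counts: rows = (dataset 1, dataset 2), columns = (travel, work).
observed = [[30, 10], [10, 30]]
chi2, residuals = chi_square_residuals(observed)
print(chi2)
print(residuals)
```

A positive residual marks a topic that appears more often in a dataset than independence would predict, and a negative residual marks one that appears less often, which is how over- and under-represented topics can be read from a residual plot.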

Spatial Distribution of Place Semantics
Travel, scenery, religion, food, emotions, and work are the main topics extracted from Weibos in the QTP. Figures 4 and 5 show the spatial distributions of these topics for residents and tourists, respectively. Each topic represents a form of geospatial semantics; therefore, the spatial distribution of different topics reflects the distribution of geospatial semantics. Figure 6 displays the continuous spatial distribution of these topics after interpolation, exhibiting the spatial variations in the perception strength of different topics for residents and tourists (left and middle columns, respectively). The right column indicates the difference between the two, reflecting the difference in semantic cognition between residents and tourists. The green color indicates that residents feel more strongly than tourists about a topic, whereas the red color indicates that tourists feel more strongly than residents about a topic.
Figure 6. Spatial distributions of the perception strength of different semantics for residents (left column) and tourists (middle column). Right column shows the difference between the two, where the green color indicates that residents feel more strongly than tourists about a topic, and the red color indicates that tourists feel more strongly than residents about a topic.
According to the spatial distributions of travel-related semantics, tourists have a greater perception of travel than residents in the QTP region, especially along the road network. The spatial distribution of travel-related semantics is more scattered for residents, except for an obvious hot spot at the northern margin of the Tsaidam Basin in Qinghai (Figure 6a). Indeed, most Weibos in this region are travel advertisements, indicating a strong desire to develop tourism. Regarding scenery-related semantics, residents' perception of scenery is greatest close to large residential areas, whereas that of tourists is greatest along the road network, especially along the Qinghai-Tibet Railway and the national highway between Shigatse and Ali (Figure 6b). Regarding food-related semantics, the hot spots for residents are predominantly distributed in Qinghai and the eastern part of Tibet, with most Weibos related to special local products, such as wolfberry, Cordyceps sinensis and dried yak meat. Food-related hot spots are more scattered for tourists (Figure 6c). Weibo texts indicate that the food-related topics of most concern to tourists in Qinghai and eastern Tibet are special local products, such as Cordyceps sinensis and dried yak meat, whereas those in western Tibet are more general food topics due to the lack of restaurants on the road. The hot spots of emotion-related semantics for both residents and tourists are typically in densely populated areas such as eastern Qinghai and southern Tibet. However, residents have a much stronger perception of emotion-related semantics than tourists (Figure 6d). Hot spots of work-related semantics are found in urban areas for residents; however, these hot spots are only found in Naqu and Haixi for tourists. Indeed, the Weibo texts reveal that the publishers are migrant workers from other provinces (Figure 6e).
Regarding religion-related semantics, obvious hot spots occur for both residents and tourists in southern Yushu and northern Changdu, with stronger perception among local residents, indicating a strong religious environment in these areas (Figure 6f).
In general, place semantics exhibit clear spatial variations, as do tourists' and residents' perceptions of these semantics. Tourists perceive travel- and scenery-related semantics more strongly, whereas residents perceive emotion- and religion-related semantics more strongly. The spatial variation of semantics can reveal important regional characteristics. For example, people in northern Qinghai are attempting to develop tourism by attracting more tourists on social media platforms; western Tibet attracts tourists but does not provide good food services; and Naqu in Tibet is home to many migrant workers.

Geospatial Semantic Differences among Cities
Due to the different concerns of Weibo users, the distribution of Weibo topics is uneven in the QTP. For example, travel accounts for the majority of topics in various regions. However, the proportion of travel-related topics differs among regions, which reflects differences in regional semantic structure. Figure 7 shows the distribution and standardized Chi-square residuals for the top 20 topics extracted from the Tibetan resident dataset in each Tibetan city. Chi-square statistics show significant differences in the topic distributions among cities, which implies that the cities have different semantic structures. Travel-related semantics is strongest in Nyingchi, followed by Ngari, Lhasa, and Shannan, which reflects the substantial tourism attraction in these areas. Conversely, the level of tourism interest in Naqu is low for Tibetan residents due to the overall high altitude and tough natural conditions. The level of tourism interest for Tibetan residents is lowest in Changdu due to its location in the border region between Han culture and Tibetan culture, and its similarity to cities in other provinces of China owing to its crowded buildings and people; therefore, Changdu holds little attraction for Tibetan people. In addition, Changdu is characterized by a high proportion of beauty salon-related semantics; the Weibo texts reveal many micro businesses in Changdu selling cosmetic products via social media. The abundance of temples in Lhasa and Shigatse explains the high levels of religious semantics in these two cities. Nyingchi exhibits high levels of travel-, scenery-, and food-related semantics, and Shannan also exhibits high levels of travel-related semantics; however, the place semantics for both cities reveal low interest in religion. These two cities are located in southeastern Tibet, which boasts a good natural environment and rich natural landscape resources; thus, these cities are a notable attraction for Tibetan people.
Moreover, Ngari is the birthplace of Tibetan culture and the original Bon religion, containing many famous mountains and holy lakes with religious significance, which are places of pilgrimage for Tibetans and Buddhists; this explains the high levels of emotion-related semantics. As for work-related semantics, no obvious difference is observed among cities.
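The city-by-topic comparison described above can be illustrated with a minimal sketch. It assumes that "standardized residuals" means Pearson residuals, (observed - expected) / sqrt(expected), which the text does not confirm; an adjusted standardized residual would be an equally plausible reading. The function name `chi_square_residuals`, the city and topic labels, and the counts are all hypothetical and are not taken from the study.

```python
from math import sqrt

def chi_square_residuals(counts):
    """Return (chi2, dof, residuals) for a cities x topics count table.

    Assumption: residuals are Pearson residuals, (O - E) / sqrt(E).
    Every row and column total must be positive so that E > 0.
    """
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    n = sum(row_totals)
    chi2 = 0.0
    residuals = []
    for i, row in enumerate(counts):
        res_row = []
        for j, observed in enumerate(row):
            # Expected count under independence of city and topic
            expected = row_totals[i] * col_totals[j] / n
            r = (observed - expected) / sqrt(expected)
            res_row.append(r)
            chi2 += r * r  # Pearson chi-square is the sum of squared residuals
        residuals.append(res_row)
    dof = (len(counts) - 1) * (len(counts[0]) - 1)
    return chi2, dof, residuals

# Hypothetical counts (not from the paper): rows = cities, columns = topics
cities = ["CityA", "CityB"]
topics = ["travel", "religion"]
counts = [[30, 10], [20, 40]]

chi2, dof, residuals = chi_square_residuals(counts)
for city, row in zip(cities, residuals):
    print(city, [round(r, 2) for r in row])
```

A positive residual marks a topic that is over-represented in a city relative to independence, and a negative residual marks an under-represented one; this is the sense in which a city can be said to have, for example, strong travel-related semantics.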

Figure 8 shows the distribution and standardized residuals of the top 20 topics extracted from the Tibetan tourist dataset in each Tibetan city. Chi-square statistics also show significant differences among cities. Compared to residents, tourists in Tibet exhibit different cognition from a geospatial perspective. The level of travel-related semantics is highest in Changdu, which is the opposite result to that of Tibetan residents. This is because Changdu lies at the border of the QTP and is the place where most tourists enter Tibet. Moreover, the altitude is relatively low, which makes it a more attractive place for tourists. Religion-related semantics is most common in Lhasa, Ngari, and Shigatse for tourists; however, tourists do not have strong cognition of emotional semantics. This contrasts with Tibetan residents, who have strong cognition of emotion-related semantics in Shigatse and Ngari. This indicates that tourists pay more attention to the world outside, whereas residents pay more attention to their inner emotions. However, tourists have strong cognition of life- and emotion-related semantics in Naqu. Food-related semantics is high in Lhasa, whereas health-related semantics is relatively high in Shigatse. As the average altitude of Shigatse is rather high, at greater than 4000 m, many tourists feel uncomfortable coming to this area. Similar to residents, tourists have a strong perception of travel and scenery in Nyingchi and a strong perception of scenery in Shannan and Ngari. This implies that the rich tourism resources in Nyingchi are highly attractive to tourists.
Figure 9 shows the distribution and standardized Chi-square residuals for the top 20 topics extracted from the Qinghai resident dataset in each Qinghai city. Chi-square statistics show significantly different distributions of semantic structures among Qinghai cities. Haibei, Hainan, and Haixi are regions with strong cognition of travel and scenery for Qinghai residents. There are many natural scenic spots in these three cities, which are clearly preferred by residents. Huangnan, Golog, and Yushu, which contain many temples, are areas where residents have strong cognition of religion. Xining, the capital city of Qinghai, is dominated by work- and beauty salon-related semantics for residents, reflecting the everyday concerns of residents in large cities. Food-related semantics is unusually high among residents in Yushu and Haixi. Weibo texts show that many Weibos promote wolfberry and cordyceps sinensis in Haixi, and a "love lunch" activity was launched in Yushu in 2017. Moreover, Huangnan is characterized by more charity-related semantics among residents.

Geospatial Semantic Clustering
Considering the semantic structure of a city as a vector, we calculated the semantic similarity among cities using the cosine similarity and then grouped the cities in the QTP using the hierarchical clustering method. The cities on the QTP can be clustered into three categories according to the geospatial semantic cognition of residents (Figure 11a). Yushu, Golog, and Huangnan in Qinghai exhibit high levels of religion-related place semantics but low levels of other semantics. Changdu exhibits high levels of beauty salon-related place semantics and so is placed in a separate category. Other cities exhibit particularly high levels of travel- and scenery-related place semantics. Additionally, three categories of cities are identified according to the geospatial semantic cognition of tourists (Figure 11b). Xining represents a category by itself due to strong work-, emotion-, and life-related place semantics but low levels of other semantics. Yushu, Golog, Huangnan, Haidong, Lhasa, and Naqu also show strong work-, emotion-, and life-related place semantics; however, these cities also exhibit strong religion-related place semantics, so they are clustered into a separate category. Other cities exhibit prominent travel- and scenery-related place semantics but weak life- and emotion-related place semantics. Generally, most cities on the QTP exhibit strong travel- and scenery-related place semantics, especially for tourists. Yushu, Golog, and Huangnan exhibit strong religion-related place semantics.
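The clustering procedure described above (cosine similarity between city vectors, followed by hierarchical clustering) can be sketched as follows. The text does not state the linkage criterion, so average linkage on cosine distance (1 minus cosine similarity) is an assumption here. The function names, city labels, and three-component topic-share vectors are hypothetical and chosen only for illustration.

```python
from math import sqrt

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def agglomerative_clusters(vectors, n_clusters):
    """Agglomerative clustering on cosine distance.

    Assumption: average linkage; the paper does not name its linkage.
    `vectors` maps a city name to its semantic-structure vector.
    """
    names = list(vectors)
    dist = {
        (a, b): 1.0 - cosine_similarity(vectors[a], vectors[b])
        for a in names for b in names if a != b
    }
    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest mean pairwise distance
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [dist[(a, b)] for a in clusters[i] for b in clusters[j]]
                avg = sum(pairs) / len(pairs)
                if best is None or avg < best[0]:
                    best = (avg, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical topic-share vectors (travel, life, religion); not from the paper
city_vectors = {
    "CityA": [0.8, 0.1, 0.1],
    "CityB": [0.7, 0.2, 0.1],
    "CityC": [0.1, 0.8, 0.1],
    "CityD": [0.1, 0.1, 0.8],
}
groups = agglomerative_clusters(city_vectors, n_clusters=3)
print(groups)
```

Cosine similarity compares the shape of two topic distributions rather than their magnitude, so a city with few Weibos and a city with many can still be grouped together when their topic proportions are alike.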

Figure 11. Classification of cities in the Qinghai-Tibetan Plateau according to their semantic similarity for (a) residents and (b) tourists.

Conclusions
In this study, topics extracted from geotagged Sina Weibos posted in the QTP were used to analyze geospatial semantics in the QTP. By determining the spatial distribution of Weibo topics, the characteristics and regional differences of cognition in the QTP were analyzed from a geospatial semantic perspective.
First, residents' cognition is more focused on life, emotional expression, food, and festivals, whereas tourists' cognition is more focused on travel, scenery, and religion. The difference between the two is greater in Tibet than in Qinghai, reflecting the greater tourist appeal of Tibet.
Second, the spatial distribution of place semantics exhibits clear variability. Tourism- and scenery-related place semantics are widely distributed, whereas religion-related place semantics are mainly distributed in Golog, Yushu, and Changdu. Hot spots of work-related place semantics are predominantly close to residential areas. Compared to Qinghai, Tibet exhibits stronger cognition of travel- and scenery-related semantics, whereas Qinghai exhibits stronger cognition of life-related semantics, such as emotion, work, and beauty salons, which indicates that Tibet has stronger tourism characteristics than Qinghai. The spatial variation of place semantics can reveal important regional characteristics such as the number of migrant workers or the availability of food services.
Third, the cognition of geospatial semantics differs substantially between residents and tourists. For example, residents have greater cognition of travel-related semantics in Qinghai than in Tibet, whereas tourists have greater cognition of travel-related semantics in Tibet than in Qinghai. Tibet exhibits a greater difference between residents and tourists. Residents have the lowest cognition of travel-related semantics in Changdu, whereas tourists have the highest cognition of travel-related semantics in Changdu. Residents have strong cognition of emotion-related semantics in Ngari and Shigatse; however, tourists do not exhibit the same level of cognition. Instead, tourists have greater cognition of health-related semantics in Shigatse. Moreover, there is less difference between the perspectives of residents and tourists in Qinghai.
Fourth, Weibo texts indicate that residents and tourists have different concerns about place semantics. Regarding travel, tourists enjoy trips, whereas residents are concerned with improving local tourism attractions. As for food, tourists care about dining on their trips, whereas residents are concerned with selling special local products.
Fifth, clustering results based on semantic similarity show that the cities of the QTP can be divided into approximately three types: tourism-focused cities, life-focused cities, and religion-focused cities; however, the categories of some cities differ according to the cognition strength of residents and tourists. Generally, most cities in the QTP have a strong focus on tourism and scenery, especially for tourists, whereas Yushu, Golog, and Huangnan have a strong focus on religion for both residents and tourists.
This research can improve our understanding of the regional characteristics of the QTP. Furthermore, a better understanding of the geospatial semantic cognition differences between different groups of people can be used to improve publicity related to tourism and enhance the regional attractiveness of the QTP.