1. Introduction
Since the late 20th century, China’s urbanization rate has risen steadily, from 18% in 1978 to over 67% in 2024 [1]. However, as globalization continues to spread, urban development has long centered on construction while neglecting human emotions, severely impacting human well-being [2]. Accurately assessing the spatial mechanisms through which the urban environment influences human emotions is therefore a core issue for sustainable urban development [3,4]. Yet existing research predominantly evaluates urban well-being through spatial parameters of the physical environment and lacks in-depth exploration of people-centered public perception [5]. Deepening the transition from the material dimension to the perceptual dimension is key to understanding the Man–Land Relationship and thereby enhancing human well-being [6]. Adopting “emotion” as a lens for understanding human well-being, and emphasizing the exploration of public emotion and its generation mechanisms, is thus key to comprehending and optimizing design decisions while enhancing the quality and image of cities.
Tourist emotion serves as an effective perspective for measuring human spatial well-being [7]. Its advantage stems from the distinctive interaction between tourists and urban space, manifested primarily in two aspects. First, the relatively dispersed distribution of tourist trajectories amounts to a systematic sampling of urban space [8], revealing patterns of human perception across different urban environments. Second, tourists do not reside in the city long-term, so their emotional responses to urban spaces tend to be more immediate and intense, making them easier to perceive, collect, and analyze [9]. Although scholars have explored the mechanisms linking emotions and urban spaces from residents’ perspectives [5,10,11], research on mapping and interpreting urban emotion maps from tourists’ viewpoints remains relatively scarce [7]. This study therefore adopts tourist emotion as its entry point to reveal the mechanisms through which urban spaces influence human well-being.
The acquisition of emotion data has shifted from traditional methods to online data. Traditional research relied mainly on questionnaires to construct subjective indicators such as cognition [12], or inferred public emotion by analyzing travelogs, interviews, and other texts on tourism websites [13].
The proliferation of user-generated content (UGC) calls for more efficient methods of quantifying emotion because of its massive scale [7,14,15]. Early research relied primarily on lexicon-based methods (such as SentiWordNet and HowNet) and traditional machine learning approaches (such as SVM classifiers with TF-IDF features), which exhibited limited generalization capabilities. The rise of deep learning technologies (such as CNNs and LSTMs) enabled end-to-end automatic learning of textual features, significantly enhancing models’ ability to understand deep semantic meaning and establishing them as the mainstream approach in emotion analysis. In recent years, pre-trained models such as BERT and ERNIE have achieved breakthrough progress in tasks such as emotion analysis by acquiring powerful general semantic representations through self-supervised learning on ultra-large-scale corpora.
However, existing research focuses primarily on comparing model performance in general domains and lacks systematic evaluation across different spatial features and emotion polarities. This limitation hinders the precise application and optimized selection of these advanced technologies in tourism emotion analysis.
This study aims to establish a deep learning model evaluation and selection framework for urban spatial emotion analysis. By systematically comparing recognition performance, efficiency, and robustness across model architectures, the framework identifies an optimized technical approach tailored to the characteristics of urban text data. Using Harbin as a case study, the empirical analysis reveals the spatial distribution characteristics of tourism emotion in the city and explores the potential connections between tourism emotion and urban space, providing data insights to enhance urban well-being.
Theoretically, this study establishes a model evaluation and selection methodology applicable to urban emotion assessment, bridging the methodological gap between existing spatial parameter research and big data emotion analysis. In practice, research findings can provide data support for urban design and planning, helping to identify key spatial elements that influence emotional well-being. This offers a scientific basis for optimizing public spaces, enhancing the allocation of service facilities, and improving the urban living environment, thereby promoting the realization of urban well-being among city users. This study demonstrates that emotion analysis technology can serve as an effective tool for bridging spatial design decisions with human subjective well-being, thereby advancing people-centered urban development. This study aims to address the following key issues:
How can we evaluate and select multiple deep learning models to establish a novel emotion analysis method suitable for urban well-being assessment?
What are the spatial differentiation patterns of urban emotion maps and the underlying mechanisms linking them to the built environment?
5. Discussion and Conclusions
5.1. Discussion
This study established a multidimensional evaluation framework that encompasses overall performance, category-level performance, robustness, and efficiency. It systematically compared the performance of seven technical approaches—ranging from traditional machine learning to pre-trained models—in the task of emotion analysis of city tourism reviews, and conducted an empirical analysis and visualization of the city’s emotion map based on the emotion prediction results. The study further analyzed the mechanisms linking built environment factors to emotions. The research confirmed the following.
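The way a multidimensional evaluation can be reduced to a single overall score is sketched below. This is a minimal illustration only: the weights, the `ModelScores` fields, and the assumption that each dimension is already normalized to [0, 1] are hypothetical and are not taken from the paper's evaluation procedure.

```python
from dataclasses import dataclass


@dataclass
class ModelScores:
    """Per-dimension scores, each assumed to be normalized to [0, 1]."""
    overall: float      # e.g. accuracy or macro-F1 on the test set
    category: float     # e.g. mean per-class F1
    robustness: float   # e.g. share of performance retained on noisy input
    efficiency: float   # e.g. normalized inverse inference time


# Hypothetical weights; the values used in the study are not reproduced here.
WEIGHTS = {"overall": 0.4, "category": 0.3, "robustness": 0.2, "efficiency": 0.1}


def composite_score(scores: ModelScores, weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted mean of the four dimension scores."""
    total_weight = sum(weights.values())
    weighted = sum(getattr(scores, name) * w for name, w in weights.items())
    return weighted / total_weight


def rank_models(all_scores: dict[str, ModelScores]) -> list[tuple[str, float]]:
    """Return (model name, composite score) pairs, best model first."""
    ranked = [(name, composite_score(s)) for name, s in all_scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_models` on a dictionary of per-model `ModelScores` yields an ordering comparable in spirit to the overall ranking reported in this section; changing the weights shifts the emphasis between accuracy, robustness, and efficiency.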
First, the ERNIE model, leveraging its knowledge-enhanced pre-training strategy, demonstrated the best performance in tasks involving noise interference and domain adaptation, achieving an overall score of 0.7612, which makes it suitable for city emotion mapping in the tourism sector. This finding is consistent with the conclusions of Zhang B et al., who found that ERNIE performed best on Chinese emotion analysis tasks [37]. However, the findings differ from those of Rehman A U et al. [48], who argued that the CNN-RNN hybrid architecture performs strongly in text emotion classification, whereas this study found that traditional deep learning models such as R-CNN perform poorly in tourism emotion analysis. The primary reason for this discrepancy lies in the characteristics of the data sources. The CNN-RNN hybrid architecture performs exceptionally well on general-purpose text datasets with relatively balanced class distributions, whereas travel review datasets typically exhibit significant class imbalance, colloquial expressions, and a mix of dialects and internet slang. These characteristics place higher demands on a model’s robustness and its ability to learn from a small number of samples. The methodological framework of this study is therefore better suited to emotion analysis in the specific field of travel reviews, enabling a more accurate capture of the emotional characteristics of urban tourism spaces.
Second, Harbin’s emotion map reveals a distinct spatial differentiation pattern characterized by a “core–periphery, east–west alternation” structure: positive emotions are highly concentrated in the Sun Island Natural Recreation Core and the Central Street historical and cultural core along the Songhua River, while negative emotions are scattered across the modern commercial and entertainment districts to the east and west, the themed park extension area, and the vicinity of urban landmarks. This finding is consistent with the results of Huang Shan et al. on the perception of urban spatial emotions [29], as well as the conclusions of Wang Meng’s study on the emotions of tourists in historic and cultural districts. The accurate identification of these characteristics of emotion space differentiation confirms that the emotion analysis model developed in this paper is highly suited to tourism review data.
Third, this study confirmed that emotion scores are significantly positively correlated with road density, park density, and accommodation facility density, and weakly negatively correlated with POI diversity. In contrast, neither building density nor floor area ratio shows a significant correlation with emotion scores. These findings are consistent with those of Mouratidis K on residents’ emotions, which indicate that access to green spaces is consistently linked to higher levels of subjective well-being among urban residents [49]. However, in contrast to this study’s conclusion that building density has no significant effect on visitors’ emotions, Frey, V.N. found that residents of areas with high residential density were significantly more likely to suffer from poor mental health [50]. This difference stems primarily from differences in the study subjects and the measurement scales used. Residents’ perceptions of building density are more closely tied to the convenience of daily life, whereas tourists place greater emphasis on the cultural atmosphere and visual experience of a space. Consequently, the mechanisms through which building density influences the emotions of these two groups differ. Furthermore, a study by LR Larson et al. found that park quantity and quality were positively associated with well-being [51]. Sahar Samavati et al. also demonstrated that the more satisfied people are with green/natural visibility and traffic connectivity, the happier they are, and that the closer they are to POI-intensive places such as shops and the city center, the less happy they are [52]. Together, these results support our finding that park density and road density are positively correlated with emotion scores while POI diversity is negatively correlated with them. However, research by Bina Ram et al. suggested that park accessibility and transportation convenience have no overall impact on mental health and well-being, indicating that changes to the built environment alone are insufficient to improve mental health and well-being [53].
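The kind of association test discussed above can be illustrated with a short sketch. It assumes a Pearson correlation computed over spatial units (for example, grid cells); the paper does not state here which coefficient was used, and the function names and sample values are hypothetical and made up purely for illustration.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


def correlate_factors(
    emotion: list[float], factors: dict[str, list[float]]
) -> dict[str, float]:
    """Correlate each built-environment factor with the emotion scores."""
    return {name: pearson_r(values, emotion) for name, values in factors.items()}


# Made-up values for five spatial units, for illustration only.
emotion_score = [0.2, 0.4, 0.5, 0.7, 0.9]
factors = {
    "park_density": [1.0, 2.0, 2.5, 3.5, 4.5],
    "poi_diversity": [0.9, 0.8, 0.85, 0.7, 0.6],
}
result = correlate_factors(emotion_score, factors)
```

A full analysis would also report significance levels and account for spatial autocorrelation between neighboring units, which this sketch omits.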
Finally, this study provides a decision-making basis for planning management. Positive sentiments are concentrated in natural recreation cores and historical–cultural hubs. Planning efforts should prioritize the maintenance of high-quality facilities, effective crowd management, and the preservation of authentic cultural atmospheres in these areas to sustain visitor satisfaction. Negative sentiments are scattered around peripheral commercial entertainment zones and iconic landmarks. Recommendations include improving price transparency, reducing queuing times, enhancing wayfinding systems, and diversifying on-site experiences to mitigate visitor disappointment. Furthermore, the multi-model evaluation framework developed in this study can be embedded into a real-time or near-real-time dashboard. By periodically collecting and analyzing visitor reviews, destination managers can identify emerging negative hotspots and evaluate the effectiveness of implemented interventions, enabling a shift from reactive problem-solving to proactive, evidence-based planning.
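The periodic hotspot check described above could be prototyped roughly as follows. This is a sketch under stated assumptions: the `Review` record, the square grid in degrees, and the thresholds `min_reviews` and `min_negative_share` are all hypothetical choices, and the labels are assumed to come from whichever classifier the evaluation framework selects.

```python
import math
from collections import defaultdict
from typing import Iterable, NamedTuple


class Review(NamedTuple):
    lon: float
    lat: float
    label: str  # predicted polarity; only "negative" is counted specially


def grid_cell(lon: float, lat: float, size_deg: float) -> tuple[int, int]:
    """Index of the square grid cell (size_deg degrees wide) containing a point."""
    return (math.floor(lon / size_deg), math.floor(lat / size_deg))


def negative_hotspots(
    reviews: Iterable[Review],
    size_deg: float = 0.01,
    min_reviews: int = 20,
    min_negative_share: float = 0.4,
) -> dict[tuple[int, int], float]:
    """Return cells whose share of negative reviews meets the threshold."""
    counts: dict[tuple[int, int], list[int]] = defaultdict(lambda: [0, 0])
    for review in reviews:
        cell = grid_cell(review.lon, review.lat, size_deg)
        counts[cell][1] += 1              # total reviews in the cell
        if review.label == "negative":
            counts[cell][0] += 1          # negative reviews in the cell
    hotspots = {}
    for cell, (negative, total) in counts.items():
        share = negative / total
        # Skip sparsely reviewed cells so a few reviews cannot trigger an alert.
        if total >= min_reviews and share >= min_negative_share:
            hotspots[cell] = share
    return hotspots
```

Running this over successive collection windows and comparing the returned cells would show where negative shares are rising or falling after an intervention.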
5.2. Conclusions
This study systematically compares the overall performance of various emotion analysis models on tourism review data by constructing a multi-level, multidimensional model evaluation framework. The main findings are as follows:
(1) Comparison of Multi-Model Performance
The seven emotion analysis models differ significantly in performance, with ERNIE demonstrating the best overall performance and performing strongly across the evaluation metrics. However, each model exhibits distinct performance characteristics: RoBERTa and BERT demonstrate a clear advantage in negative emotion recognition, capturing negative emotional expressions with greater accuracy; SnowNLP plays a significant complementary role in evaluating positive emotion in city centers and is particularly sensitive to positive emotional responses in areas such as cultural districts; RoBERTa performs more accurately in outlying areas and can effectively identify distinct emotional patterns in entertainment districts. This suggests that a single model is unlikely to fully address the complex task of evaluating the urban emotion map. To achieve a high-precision depiction of urban emotion spaces, it is necessary to develop a multi-model collaborative strategy that accounts for differences in spatial location and emotion polarity.
(2) Distinctive Features of Emotion Maps
The study found that the predictions from pre-trained models such as ERNIE and BERT corroborate those from BiLSTM, revealing the typical spatial patterns of tourists’ emotion in Harbin. All the clusters are located on the north bank of the Songhua River, alternating between east and west along the Sun Island area. Positive emotion is highly concentrated in the two core areas: the Sun Island Natural Recreation Core and the Central Street historical and cultural core; negative emotion, on the other hand, is scattered across the modern commercial and entertainment zone on the west side (Sunac Cultural Tourism City, Ice and Snow World), the extended experience zone surrounding Sun Island (Russian-style town, Snow Expo, sightseeing cable car), and the Longta City Landmark Zone on the east side. In addition, two areas with relatively low negative emotion have emerged in distinctive commercial districts such as Gogol Street and Shanhetun, as well as in the vicinity of the Northeast Tiger Park. This “core–periphery, east–west alternating” distribution pattern reveals the intrinsic connection between tourists’ emotional responses and the type, location, and quality of the tourism resources, thereby confirming the critical role of model selection in ensuring the reliability of findings in the field of emotional geography.
(3) The Mechanism of Association with “Elements of the Built Environment”
Road density, park density, and accommodation facility density show a significant positive correlation with emotion scores, while POI diversity shows a weak negative correlation; in contrast, building density and floor area ratio show no significant correlation with emotion scores. These findings reveal the mechanisms through which elements of the built environment influence the differentiation of emotional spaces in cities, providing empirical evidence for understanding the varying emotional responses of different spatial actors, and offering valuable insights for enhancing urban well-being.
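The multi-model collaborative strategy called for in finding (1) was not implemented in this study. One possible form, sketched here purely as an illustration, is a zone-dependent weighted vote over per-model probabilities. The zone names, model keys, and weights below are hypothetical and only loosely follow the model strengths reported above.

```python
# Hypothetical per-zone weights; not values estimated in the study.
ZONE_WEIGHTS = {
    "center": {"ernie": 0.5, "snownlp": 0.3, "roberta": 0.2},
    "outlying": {"ernie": 0.4, "snownlp": 0.1, "roberta": 0.5},
}


def combine(prob_positive: dict[str, float], zone: str) -> float:
    """Zone-weighted average of each model's probability that a review is positive."""
    weights = ZONE_WEIGHTS[zone]
    # Use only the models that actually produced a prediction for this review.
    used = {model: w for model, w in weights.items() if model in prob_positive}
    if not used:
        raise ValueError("no weighted model produced a prediction")
    total_weight = sum(used.values())
    return sum(prob_positive[m] * w for m, w in used.items()) / total_weight


def classify(prob_positive: dict[str, float], zone: str, threshold: float = 0.5) -> str:
    """Label a review from the combined probability."""
    return "positive" if combine(prob_positive, zone) >= threshold else "negative"
```

With the same per-model probabilities, a review can receive different labels in different zones, which is the intended effect of weighting models by where they were found to be most reliable. In practice the weights would need to be fitted on held-out, location-tagged data.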
5.3. Theoretical Contributions, Practical Implications and Limitations
This study makes theoretical contributions in three main aspects. First, it constructs a multidimensional model evaluation and selection framework for analyzing urban affective spaces. This provides a systematic methodological tool for evaluating and selecting emotion analysis models suitable for urban research scenarios. Second, integrating emotion analysis technology with spatial visualization bridges the methodological gap between spatial parameter research and big data emotion analysis, advancing urban studies from an “object-centered” to a “people-centered” paradigm. Third, through the fine-grained evaluation, this study validates the superiority of pre-trained models in urban vertical applications, revealing the underlying mechanisms of their semantic prior and noise resilience. This provides a theoretical foundation for their subsequent deep application in fields such as urban computing.
The practical significance of this study lies in its exploration of how emotion analysis techniques—integrating sentiment classification, spatial aggregation, and correlation analysis—can be applied to people-centered urban planning and tourism management. First, by generating emotional spatial distribution maps, the study provides a visual reference for identifying potential emotional hotspots and problem areas within cities. Second, attribution analysis based on emotional patterns offers insights for spatial quality optimization, shifting from holistic enhancement to targeted interventions. Finally, the study proposes a preliminary framework for routine emotion monitoring, which could contribute to a “perception–evaluation–optimization” loop, thereby supporting efforts to enhance the urban well-being of both residents and tourists.
This study has several limitations. First, it compared only standalone models and did not explore ensemble models, which may offer further improvement in classification performance and robustness. Second, the data are limited to travel reviews of Harbin. Although the results are reliable for this particular context, caution should be exercised when extrapolating the findings to other cities or cultural backgrounds, as differences in local characteristics may influence emotional patterns and model performance. Third, the study does not decompose emotions into more granular categories. These limitations point to directions for future research, including developing lightweight fusion models tailored to vertical domains, expanding validation across multiple cities and scenarios, and advancing fine-grained emotion spatial analysis frameworks.