Tourism Sentiment Chain Representation Model and Construction from Tourist Reviews

Li, Bosen; Li, Rui; Wang, Junhao; Song, Aihong

doi:10.3390/fi17070276

¹ State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China

² Collaborative Innovation Center of Geospatial Technology, Wuhan University, Wuhan 430079, China

* Authors to whom correspondence should be addressed.

Future Internet 2025, 17(7), 276; https://doi.org/10.3390/fi17070276

Submission received: 14 May 2025 / Revised: 11 June 2025 / Accepted: 18 June 2025 / Published: 23 June 2025

Abstract

Current tourism route recommendation systems often overemphasize popular destinations, thereby overlooking geographical accessibility between attractions and the experiential coherence of the journey. Leveraging multidimensional attribute perceptions derived from tourist reviews, this study proposes a Spatial–Semantic Integrated Model for Tourist Attraction Representation (SSIM-TAR), which holistically encodes the composite attributes and multifaceted evaluations of attractions. Integrating these multidimensional features with inter-attraction relationships, three relational metrics are defined and fused: spatial proximity, resonance correlation, and thematic-sentiment similarity, forming a Tourist Attraction Multidimensional Association Network (MAN-SRT). This network enables precise characterization of complex inter-attraction dependencies. Building upon MAN-SRT, the Tourism Sentiment Chain (TSC) model is proposed that incorporates geographical accessibility, associative resonance, and thematic-sentiment synergy to optimize the selection and sequential arrangement of attractions in personalized route planning. Results demonstrate that SSIM-TAR effectively captures the integrated attributes and experiential quality of tourist attractions, while MAN-SRT reveals distinct multidimensional association patterns. Compared with popular platforms such as “Qunar” and “Mafengwo”, the TSC approach yields routes with enhanced spatial efficiency and thematic-sentiment coherence. This study advances tourism route modeling by jointly analyzing multidimensional experiential quality through spatial–semantic feature fusion and by achieving an integrated optimization of geographical accessibility and experiential coherence in route design.

Keywords: theme classification; viewpoint extraction; multidimensional correlation relationships; attraction association network; tourism route recommendation

1. Introduction

The rapid growth of tourism demand and the digital transformation of tourism social platforms have led to an exponential increase in user-generated content [1,2,3]. Tourist reviews not only reflect individual assessments of attraction services and experiences [4] but also reveal latent thematic similarity, sentiment coherence, and other implicit semantic associations through cross-scenic mentions [5,6,7]. In addition, attractions also exhibit spatial correlations based on their geographic proximity and transit cost [8]. Collectively, these spatial and semantic relationships shape tourists’ route planning behaviors [9]. Despite this, existing tourism recommendation systems predominantly rely on single-dimension optimization strategies, often resulting in suboptimal routes characterized by high transit cost or low thematic relevance. Such approaches fail to address the dual requirements of spatial efficiency and experiential coherence, underscoring the necessity for a unified framework that synergistically integrates spatial and semantic associations in route recommendation.

Tourism review texts have emerged as a critical resource for understanding tourist behavior, sentiment orientation, and regional tourism dynamics [10,11]. Nevertheless, their unstructured nature and fragmented content pose substantial challenges for semantic extraction and analysis [12]. While advances have been made in topic classification and sentiment polarity analysis, prevailing methodologies often suffer from misaligned analytical granularity and a lack of integration between thematic and emotional dimensions. Furthermore, current tourism route recommendation systems—whether attribute-driven or behavior-driven—are limited in their capacity to systematically model spatial and semantic correlations among attractions.

In summary, while recent studies have made significant progress in analyzing the semantic features of tourist attractions and improving route recommendation systems, existing approaches often neglect the interplay between thematic and sentiment features. Furthermore, there is a lack of systematic modeling of spatial–semantic associations among attractions. To achieve a balanced trade-off among transit cost, user behavior, and experiential quality, route recommendations should simultaneously consider spatial proximity, associative co-occurrence, and thematic-sentiment similarity among attractions. This study proposes the Tourism Sentiment Chain (TSC) model that integrates spatial and semantic information to quantify multidimensional relationships between attractions. Based on graph propagation and random walk strategies, the model generates tourism routes that jointly optimize geographical accessibility and experiential coherence, offering travelers cost-effective and coherent travel experiences.

2. Related Works

2.1. Text Analysis of Tourism Reviews

Current research primarily focuses on two analytical dimensions. (i) Textual theme classification, which leverages topic models or clustering techniques to uncover tourists’ dominant interests [13,14]. Wang et al. applied latent Dirichlet allocation (LDA) to evaluate themes in multilingual reviews of the Chengdu Panda Base [15], while Gour et al. used K-means clustering on TripAdvisor data to reveal satisfaction patterns across destinations [16]. (ii) Sentiment polarity analysis, which quantifies overall or attribute-specific sentiment using lexicon-based or deep learning (DL) models [17,18,19]. Fu et al. enhanced the LSTM architecture for sentiment feature extraction [20], and Mou et al. integrated attention mechanisms with bidirectional RNNs to classify sentiment in Ctrip reviews [21].

Despite methodological advances, key limitations remain: (i) Coarse analytical resolution. Individual reviews often encompass multi-aspect evaluations (e.g., facilities, services, environment), yet prevailing approaches target sentence-level classification [22,23], overlooking fine-grained aspect-level insights (e.g., sentiments toward ticket prices or guided tours). (ii) Decoupled topic–sentiment modeling. Topic distributions and associated sentiments are frequently treated independently, lacking unified frameworks that jointly model thematic salience and affective orientation [24,25]. Consequently, the interplay between topic prominence and its sentiment valence is insufficiently captured. These shortcomings limit the availability of fine-grained data necessary for evaluating thematic-sentiment continuity across attractions, thereby constraining the development of semantic networks capable of informing optimized travel routes.

2.2. Methods for Recommending Tourist Routes

Tourism route recommendation remains a central research focus in tourism behavior studies [26,27]. Methodologically, existing approaches fall into two major categories: attraction attribute-driven [28,29] and tourist behavior-driven [30,31,32] strategies. (i) Attraction attribute-driven methods focus exclusively on transit costs, such as road distances and transit times between attractions [33]. Daniel C.B. et al. applied impedance models to compute shortest-path routes based on geographic coordinates [34]. While effective in reducing spatial inefficiencies, these methods often neglect thematic dissimilarities that influence experiential coherence. (ii) Tourist behavior-driven methods leverage visitation frequency and sequential patterns to inform recommendations [35]. Chen et al. introduced a travel sequence similarity measure by modeling tourist movement as event sequences, enabling the generation of novel itineraries through cross-city comparisons [36]. Hua et al. enhanced ant colony optimization algorithms by incorporating user sentiment and sequence similarity into heuristic functions, thereby recommending highly visited routes [37]. However, these popularity-centric strategies tend to overemphasize mainstream attractions, overlooking both spatial efficiency and the latent experiential value of niche destinations.

Recent efforts in tourism route recommendation have explored the integration of multi-modal information through techniques such as graph modeling, path search algorithms, and intelligent optimization strategies. For instance, Liang et al. proposed a weighted scoring framework that incorporates user-specific preferences and attraction attributes to generate personalized itineraries [30]. Fan et al. further developed a tourism route graph model integrated with a random walk mechanism to simulate transitions between attractions, employing an enhanced ant colony optimization algorithm for route planning [38]. Additionally, Simulated Annealing (SA), due to its robust global search properties, has been applied to combinatorial optimization in tourism routing; for example, Jewpanya et al. utilized an Adaptive Neighborhood Simulated Annealing (ANSA) algorithm to recommend sightseeing locations and activity suggestions tailored to individual tourist preferences [39].

Despite their contributions to multi-criteria modeling, these approaches suffer from three major limitations. First, they lack effective quantification of semantic–emotional associations—such as thematic coherence and affective alignment—between attractions. Second, their weighting schemes tend to be static and insufficiently responsive to the heterogeneous and evolving preferences of diverse tourists. Third, most optimization strategies are confined to local search paradigms, limiting their capacity to identify globally optimal paths within the solution space.

3. Materials and Methods

This study proposes a method for constructing a TSC based on the integration of spatial and semantic information. A Spatial–Semantic Integrated Model for Tourist Attraction Representation (SSIM-TAR) is first proposed. This model captures the comprehensive attributes of attractions through the joint representation of spatial and semantic features and facilitates the analysis of tourist multidimensional evaluations and sentiment tendencies. Subsequently, spatial, associative, and thematic-sentiment relationships among attractions are defined and quantified. A Multidimensional Association Network (MAN-SRT) is then constructed to reveal the varying strengths of inter-attraction linkages across multiple dimensions. Finally, a graph-based propagation algorithm is employed to generate an adaptive-weighted TSC, serving as a personalized travel route recommendation. The framework systematically transitions from “attribute representation of attractions” to “construction of the association network” and ultimately to “generation of the TSC”. The implementation workflow is summarized in Figure 1 and will be elaborated in detail in Section 3.1, Section 3.2 and Section 3.3.

3.1. Spatial–Semantic Integrated Model for Tourist Attraction Representation

During the tourism decision-making process, tourists’ choice of attractions is a complex process involving the integration of multiple dimensions and heterogeneous data. It encompasses not only objective attributes like service amenities and geographical accessibility but also subjective elements such as thematic interests and emotional inclinations. Despite this complexity, traditional studies predominantly focus on single-dimensional analyses, neglecting a systematic characterization of attractions’ comprehensive attributes. To bridge this gap, we introduce the Spatial–Semantic Integrated Model for Tourist Attraction Representation (SSIM-TAR). This model consolidates multidimensional information—including foundational services, geographical positioning, thematic categories, and sentiment polarities—into structured feature vectors. By doing so, SSIM-TAR facilitates refined modeling for tourism planning, supporting more nuanced itinerary development.

3.1.1. Spatial and Semantic Feature Representation of Tourist Attraction Entity

When planning a trip, tourists typically consider a range of factors such as the unique offerings, optimal visiting time, geographical accessibility, and peer-generated reviews of attractions. To holistically model the multifaceted influences in tourism decision-making, this study formulates the feature representation of tourist attractions as follows:

$$\mathit{Attr} = \{\mathit{BasServ}, \mathit{SptLoc}, \mathit{Theme}, \mathit{Emotion}\}, \tag{1}$$

where $\mathit{Attr}$ is the attraction entity. $\mathit{BasServ}$ is the basic information of the attraction, such as its name, opening time, official phone number, preferential policies, etc. $\mathit{SptLoc}$ is the spatial features of the attraction. $\mathit{Theme}$ is the thematic features of the attraction. $\mathit{Emotion}$ is the sentiment features of the attraction.

(1): Tourist attraction spatial feature representation

The spatial location of an attraction critically influences its accessibility and the rationality of a tourist’s itinerary, ultimately shaping the overall experiential quality. The spatial features characterize an attraction’s geographic location and surrounding environment. Formally, the spatial feature $\mathit{SptLoc}$ is defined as follows:

$$\mathit{SptLoc} = \{\mathit{SptCoor}, \mathit{SptRange}, \mathit{SptAdr}, \mathit{TopoCon}, \mathit{TopoSla}\}, \tag{2}$$

where $\mathit{SptCoor}$ is the spatial coordinate, which indicates the center point position of the attraction. $\mathit{SptRange}$ delineates the geographical boundaries of the attraction using a sequence of coordinates. $\mathit{SptAdr}$ is the detailed address of the attraction, which describes the attraction’s position in natural language. $\mathit{TopoCon}$ and $\mathit{TopoSla}$ denote the subordination or inclusion of the attraction in terms of topology or management pattern.

(2): Tourist attraction thematic feature representation

Tourist choice behavior is shaped not only by objective factors such as geographical accessibility and service offerings but also by deeper subjective considerations including cultural significance and thematic identity. Conventional approaches, however, often reduce thematic attributes to single labels (e.g., “natural landscape” or “historical site”), overlooking the nuanced interplay of multi-layered cultural contexts and thematic diversity. To comprehensively characterize the thematic profile of an attraction and reflect the multidimensional quality of visitor experience, we model thematic features as a distribution over theme categories along with associated weight vectors. Formally, the thematic feature $\mathit{Theme}$ is defined as follows:

$$\mathit{Theme} = \{\mathit{Topic}, \mathit{dist}\}, \quad \mathit{Topic} = [T^{(1)}, T^{(2)}, \dots, T^{(x)}, \dots, T^{(n_t)}], \quad \mathit{dist} = [d^{(1)}, d^{(2)}, \dots, d^{(x)}, \dots, d^{(n_t)}], \tag{3}$$

where $n_t$ is the number of topic types of the viewpoint phrase. $\mathit{Topic}$ is the theme category vector, and each element $T^{(x)}$ represents a specific theme category. $\mathit{dist}$ is the Thematic Distribution Vector (TDV), and each element $d^{(x)}$ is the relative distribution weight of the theme corresponding to the theme category $T^{(x)}$.

To enable characterization and comparative analysis of core attraction attributes and corresponding tourist feedback patterns, the thematic distribution vector is computed as follows:

$$\mathit{dist}_i = \frac{\mathit{DP}_i}{\|\mathit{DP}_i\|_1}, \quad \mathit{DP}_i = [\mathit{dp}_i^{(1)}, \mathit{dp}_i^{(2)}, \dots, \mathit{dp}_i^{(x)}, \dots, \mathit{dp}_i^{(n_t)}], \quad \mathit{dp}_i^{(x)} = \frac{\mathit{vn}_i^{(x)} / \mathit{vn}_i}{\mathit{VN}^{(x)} / \mathit{VN}}, \tag{4}$$

where $i$ is the attraction identifier. $\mathit{DP}_i$ denotes the relative frequency vector of themes, with each element $\mathit{dp}_i^{(x)}$ representing the proportion of theme $T^{(x)}$ within the overall thematic profile. $\mathit{vn}_i^{(x)}$ refers to the count of viewpoint-bearing phrases associated with theme $T^{(x)}$ in the reviews of $\mathit{Attr}_i$.

This process systematically constructs a TDV for each tourist site, facilitating quantitative comparison of intrinsic features and experiential feedback across attractions. The method effectively eliminates bias arising from variations in review volume and establishes a standardized analytical framework for cross-attraction comparative studies.
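The computation in Equation (4) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes that $\mathit{vn}_i$ is the total number of viewpoint phrases for attraction $i$, and that $\mathit{VN}^{(x)}$ and $\mathit{VN}$ are the corresponding corpus-wide counts (the text does not define them explicitly). The function name `thematic_distribution` and the phrase counts are hypothetical.

```python
from typing import Dict, List


def thematic_distribution(counts: Dict[str, List[int]]) -> Dict[str, List[float]]:
    """Return an L1-normalised thematic distribution vector per attraction.

    counts[name][x] is the number of viewpoint phrases on theme x for that site.
    """
    n_themes = len(next(iter(counts.values())))
    # Assumed meaning: VN^(x) is the corpus-wide count for theme x, VN the grand total.
    theme_totals = [sum(row[x] for row in counts.values()) for x in range(n_themes)]
    grand_total = sum(theme_totals)

    result: Dict[str, List[float]] = {}
    for name, row in counts.items():
        site_total = sum(row)  # assumed meaning of vn_i
        dp = []
        for x in range(n_themes):
            if site_total == 0 or theme_totals[x] == 0:
                dp.append(0.0)  # theme never mentioned: no relative weight
                continue
            local_share = row[x] / site_total
            global_share = theme_totals[x] / grand_total
            dp.append(local_share / global_share)
        norm = sum(dp)  # L1 norm, since all entries are non-negative
        result[name] = [v / norm if norm else 0.0 for v in dp]
    return result


# Illustrative counts over three hypothetical themes.
counts = {
    "Yellow Crane Tower": [30, 10, 10],
    "Hankou Riverside": [10, 30, 10],
}
tdv = thematic_distribution(counts)
```

Dividing the local share by the corpus-wide share is what removes the effect of review volume: a theme scores above 1 before normalisation only when it is mentioned more often at this site than across all sites.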

(3): Tourist attraction sentiment feature representation

Emotional feedback plays a pivotal role in tourist decision-making. Nevertheless, conventional sentence-level sentiment analysis overlooks the complexity of sentiment targets and contextual variations. To address this, a thematic-aware sentiment modeling framework is proposed that maps viewpoint phrases from tourist reviews into multidimensional sentiment vectors, aligned with the attraction’s thematic profile. Formally, the sentiment feature $\mathit{Emotion}$ is defined as follows:

$$\mathit{Emotion} = \{\mathit{Topic}, \mathit{weight}\}, \quad \mathit{weight} = [w^{(1)}, w^{(2)}, \dots, w^{(x)}, \dots, w^{(n_t)}], \tag{5}$$

where $\mathit{weight}$ is the Sentiment Weight Vector (SWV), and each element $w^{(x)}$ is the sentiment weight of the corresponding theme category $T^{(x)}$.

To amplify the discriminative power of positive sentiment in evaluating service performance, we adopt an asymmetric weighting scheme: positive = 2, neutral = 1, and negative = 0. For a given theme, the sentiment weight is derived as follows via weighted averaging over all relevant opinion phrases:

$$w_i^{(x)} = \frac{1}{\mathit{vn}_i^{(x)}} \sum_{k=1}^{\mathit{vn}_i^{(x)}} \mathit{VSenti}_k, \tag{6}$$

where $i$ is the attraction identifier. $\mathit{VSenti}_k$ indicates the sentiment weight of each opinion phrase. Consequently, $w_i^{(x)} \in [0, 2]$, with higher values reflecting stronger positive emotional resonance in that thematic domain.

The method constructs a structured sentiment profile across themes, enabling precise quantification of affective user experiences. It mitigates context bias caused by neglecting thematic relevance in traditional sentiment analysis. The asymmetric weighting mechanism further emphasizes the strategic importance of positive sentiment, thereby improving the fidelity of emotional signals in capturing experiential quality. Additionally, the normalized sentiment scale enables cross-attraction benchmarking, empowering destination managers to pinpoint high-performing themes and prioritize improvements in underperforming areas.
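As a small illustration of Equation (6), the sketch below averages the polarity scores (positive = 2, neutral = 1, negative = 0) of the viewpoint phrases attached to each theme. The function name `sentiment_weights` is hypothetical, and the neutral fallback of 1.0 for themes with no phrases is an assumption; the text does not say how empty themes are handled.

```python
from typing import List, Tuple

# Polarity scores as stated in the text.
POLARITY_SCORE = {"positive": 2, "neutral": 1, "negative": 0}


def sentiment_weights(
    phrases: List[Tuple[int, str]],
    n_themes: int,
    empty_default: float = 1.0,  # assumption: treat unmentioned themes as neutral
) -> List[float]:
    """Return w^(x) for x = 0..n_themes-1 from (theme_index, polarity) pairs."""
    buckets: List[List[int]] = [[] for _ in range(n_themes)]
    for theme_index, polarity in phrases:
        buckets[theme_index].append(POLARITY_SCORE[polarity])
    return [
        sum(scores) / len(scores) if scores else empty_default
        for scores in buckets
    ]


# Two phrases on theme 0 (one positive, one neutral) and one negative on theme 1.
weights = sentiment_weights(
    [(0, "positive"), (0, "neutral"), (1, "negative")], n_themes=3
)
```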

3.1.2. Granular Semantic Parsing Model for Tourism Review Textual Entity

Semantic features of tourist attractions (service quality and pricing, etc.) constitute fundamental decision-making factors in itinerary planning and exhibit synergistic interactions with spatial characteristics like proximity and accessibility. Tourist reviews represent a rich source of such semantic insights. However, their inherently unstructured and fragmented nature requires advanced text analytics to extract granular thematic and affective signals. Conventional topic modeling and sentiment classification approaches predominantly operate at the review level, failing to capture nuanced opinion-level semantics (such as sentiment toward ticket price, tour guide performance, or recreational offerings). This results in an incomplete understanding of how thematic relevance interacts with emotional valence. To bridge this gap, we propose the Granular Semantic Parsing Model for Tourism Review Textual Entities (GSPM-TRTE). Leveraging the morphological and syntactic structures of viewpoint phrases, this model facilitates fine-grained characterization of both thematic and affective attributes of attractions. By formalizing the syntactic patterns in tourist reviews, we define three core textual entities that convert raw review content into a structured, hierarchical representation, enabling systematic semantic parsing and downstream analysis.

Characteristic Words are key lexical terms that describe attraction attributes or tourist experiences, including service offerings, tourist behaviors, and intensity of emotional responses. Specifically, the characteristic word entity is defined as follows:

$$\mathit{Word} = \{\mathit{WName}, \mathit{WPos}, \mathit{WType}\}, \tag{7}$$

where $\mathit{Word}$ is the characteristic word entity. $\mathit{WName}$ is the content of the characteristic word. $\mathit{WPos}$ is the part of speech of the characteristic word. $\mathit{WType}$ is the type of the characteristic word.

Viewpoint phrases refer to syntactically structured combinations of characteristic words that express tourists’ evaluations and sentiment orientations toward specific aspects of attractions. These phrases encapsulate both evaluative content and associated affective strength. The viewpoint phrase entity is defined as follows:

$$\mathit{View} = \{\mathit{VPhrase}, \mathit{VWords}, \mathit{VTopic}, \mathit{VSentiment}\}, \tag{8}$$

where $\mathit{View}$ is the viewpoint phrase entity. $\mathit{VPhrase}$ is the content of the viewpoint phrase. $\mathit{VWords}$ is the set of characteristic word entities that compose the viewpoint phrase. $\mathit{VTopic}$ is the topic type of the viewpoint phrase. $\mathit{VSentiment}$ is the sentiment polarity of the viewpoint phrase.

Comment texts represent full-sentence natural language reviews posted by users on travel platforms, encompassing one or more viewpoint phrases describing holistic travel experiences. The comment text entity is defined as follows:

$$\mathit{Comment} = \{\mathit{Text}, \mathit{TextTime}, \mathit{TextWords}, \mathit{TextViews}\}, \tag{9}$$

where $\mathit{Comment}$ is the comment text entity. $\mathit{Text}$ is the content of the comment text. $\mathit{TextTime}$ is the posting time of the comment. $\mathit{TextWords}$ is the set of characteristic word entities contained in the comment. $\mathit{TextViews}$ is the set of viewpoint phrase entities contained in the comment.
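The three textual entities in Equations (7)–(9) nest naturally: a comment holds viewpoint phrases, which in turn hold characteristic words. One possible way to hold them in memory is sketched below with Python dataclasses. The field names, the tag values ("n", "aspect", "price", and so on), and the posting date are hypothetical placeholders, not labels taken from the study.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Word:
    name: str       # WName: content of the characteristic word
    pos: str        # WPos: part-of-speech tag
    word_type: str  # WType: type of the characteristic word


@dataclass
class View:
    phrase: str        # VPhrase: content of the viewpoint phrase
    words: List[Word]  # VWords: characteristic words composing the phrase
    topic: str         # VTopic: topic type
    sentiment: str     # VSentiment: sentiment polarity


@dataclass
class Comment:
    text: str                                        # Text
    text_time: str                                   # TextTime
    words: List[Word] = field(default_factory=list)  # TextWords
    views: List[View] = field(default_factory=list)  # TextViews


# A fragment of a review parsed into the hierarchy (labels are illustrative).
price = Word("ticket price", "n", "aspect")
worth = Word("not worth it", "a", "opinion")
view = View("ticket price is not worth it", [price, worth], "price", "negative")
comment = Comment(
    text="I personally think the ticket price is not worth it",
    text_time="2024-01-01",  # placeholder date
    words=[price, worth],
    views=[view],
)
```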

This study treats the viewpoint phrase as the minimal unit for semantic feature representation. Considering the syntactic structure and grammatical characteristics of tourism reviews, a BERT-BiLSTM-CRF model is employed to extract characteristic words from raw text. Subsequently, the HanLP toolkit is used to perform syntactic parsing, and characteristic words are combined according to predefined grammatical rules to construct structured viewpoint phrases. Because viewpoint phrases are short texts, each phrase is first transformed into a distributed vector representation via Word2Vec embedding. A PyTorch (version 2.5.1)-based text classification model is then applied to identify its thematic category, and the SnowNLP pre-trained model is finally used to determine its sentiment polarity. Through this multi-stage feature extraction and fusion, fine-grained semantic parsing results are generated from unstructured review texts, as shown in Figure 2. This provides the data foundation and technical support for the structured representation of attraction evaluations and the quantitative modeling of tourist sentiment.

By defining and extracting attraction attributes, a tourist attraction entity representation is constructed that integrates spatial and semantic information. Spatial features are derived from location data obtained via the Amap API, along with topological relationship analysis. Meanwhile, semantic features (including thematic and sentiment dimensions) are captured through structured textual entities to investigate multidimensional evaluation differences across attractions. These two types of features are then systematically integrated, which lays the groundwork for subsequent analysis of multi-dimensional associations among attractions and facilitates the construction of an attraction association network.

3.2. Representation and Construction of Tourist Attraction Multidimensional Association Network

Existing studies on tourist attraction associations predominantly focus on singular dimensions, resulting in a disconnection between spatial and semantic data. This fragmentation limits the ability to fully capture the intricate patterns driving tourist decision-making processes, thereby constraining insights into the dynamic and multifaceted nature of tourist behavior. To bridge this gap, we propose and construct the Multidimensional Association Network for Tourist Attractions (MAN-SRT). By integrating spatial adjacency, explicit associative co-occurrences, and implicit thematic-sentiment similarities, MAN-SRT constructs a three-dimensional relational network that encompasses spatial, associative, and thematic-sentiment dimensions. This framework enables the quantitative characterization of comprehensive inter-attraction relationships, offering a solid theoretical basis for optimizing travel itineraries and predicting tourist behaviors. The formal definition of MAN-SRT is as follows:

$$G = (V, E_S, E_R, E_T, W), \quad V = \{\mathit{Attr}_1, \mathit{Attr}_2, \dots, \mathit{Attr}_i, \dots, \mathit{Attr}_n\}, \quad W = \{e_{ij}\}, \; i, j = 1, 2, 3, \dots, n, \quad e_{ij} = [\mathit{SP}_{ij}, \mathit{RC}_{ij}, \mathit{TS}_{ij}], \tag{10}$$

where $V$ is the set of tourist attractions. $E_S$, $E_R$, and $E_T$ are the sets of spatial, associative, and thematic-sentiment relationships, respectively. $W$ is the weight matrix, whose element $e_{ij}$ represents the weights of attractions $i$ and $j$ in terms of the three-dimensional associative relationship.

To address the limitations of uni-dimensional association analysis, we propose distinct strength quantification models for spatial, associative, and thematic-sentiment relationships to facilitate the synergistic computation of multidimensional associations. Specifically, spatial relationships reflect the spatial correlations involved when tourists move between different attractions, including road distance and transit time. Drawing on gravity models [40] and the core tenets of spatial econometrics, the spatial proximity (SP) quantifies the geographical adjacency of attractions by integrating transit costs with a decay function:

$$\mathit{SP}_{ij} = \begin{cases} \mathrm{NORM}\left(\dfrac{1}{\mathit{Cost}_{ij}}\right), & \mathit{Cost}_{ij} < \theta_{sp} \\ 0, & \mathit{Cost}_{ij} \geq \theta_{sp} \end{cases}, \quad \mathit{Cost}_{ij} = \min_{m \in \{\mathit{walk}, \mathit{bus}, \mathit{drive}\}} \left(D_{ij}^{(m)} \times T_{ij}^{(m)}\right), \tag{11}$$

where $i$ and $j$ are attraction identifiers. $\mathit{Cost}_{ij}$ is the minimum transit cost between attractions. $\theta_{sp}$ is the transit cost threshold. $m$ is the travel mode, including three types: $\mathit{walk}$, $\mathit{bus}$, and $\mathit{drive}$. $D_{ij}^{(m)}$ is the road distance between attractions under $m$-mode. $T_{ij}^{(m)}$ is the travel time between attractions under $m$-mode.
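A minimal sketch of Equation (11) follows. The form of $\mathrm{NORM}$ is not specified in the text, so the example assumes scaling by the largest retained reciprocal cost; the attraction labels A, B, C, the distances (km), the times (min), and the threshold are all hypothetical.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]


def transit_cost(modes: Dict[str, Tuple[float, float]]) -> float:
    """Minimum of distance x time over the available travel modes."""
    return min(distance * time for distance, time in modes.values())


def spatial_proximity(costs: Dict[Pair, float], theta_sp: float) -> Dict[Pair, float]:
    """SP per pair: normalised reciprocal cost below the threshold, else 0."""
    inverse = {p: 1.0 / c for p, c in costs.items() if 0 < c < theta_sp}
    # Assumption: NORM divides by the largest reciprocal cost, so SP lies in (0, 1].
    peak = max(inverse.values(), default=0.0)
    return {p: (inverse[p] / peak if p in inverse else 0.0) for p in costs}


# (distance, time) per mode for each pair of attractions.
costs = {
    ("A", "B"): transit_cost({"walk": (1.0, 15.0), "drive": (1.5, 5.0)}),
    ("B", "C"): transit_cost({"drive": (3.0, 10.0)}),
    ("A", "C"): transit_cost({"bus": (12.0, 50.0), "drive": (10.0, 30.0)}),
}
sp = spatial_proximity(costs, theta_sp=100.0)
```

Here the pair (A, C) exceeds the threshold and receives zero proximity, while (A, B), the cheapest pair, is scaled to 1.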

Associative relationships represent explicit semantic linkages formed when tourists mention, compare, or recommend additional attractions within online reviews. For example, in a review of Hankou Riverside that states, “It is just sightseeing with lights; all the mentioned attractions were visited, but without any explanation, I only recognized Yellow Crane Tower and Yangtze River Bridge myself, and I personally think the ticket price is not worth it,” two attractions—Yellow Crane Tower and Yangtze River Bridge—are mentioned. Therefore, an associative relationship is considered to exist among these three attractions (Hankou Riverside, Yellow Crane Tower, and Yangtze River Bridge). Rooted in the fundamental concepts of Apriori association rule mining [41,42], the resonance correlation (RC) quantifies the strength of semantic associations formed through co-occurrence or linkage in tourist reviews by calculating support and itemset frequency:

$$\mathit{RC}_{ij} = \frac{S_{IJ}}{S_I}, \quad S_I = \frac{\mathit{NA}_I}{\mathit{NA}}, \tag{12}$$

where $I$ and $J$ represent the collections of association itemsets containing attractions $i$ and $j$, respectively. For instance, (Hankou Riverside, Yellow Crane Tower, Yangtze River Bridge) constitutes a single association itemset. $S_I$ is the support degree of $I$, reflecting its frequency of occurrence across all reviews. $\mathit{NA}$ is the number of all association itemsets, and $\mathit{NA}_I$ is the number of itemsets containing attraction $i$.
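Equation (12) has the shape of an association-rule confidence. The sketch below assumes that $S_{IJ}$ is the support of itemsets containing both attractions, which the text does not state explicitly; under that reading the total number of itemsets cancels and RC reduces to a ratio of counts. The function name and the second and third itemsets are hypothetical.

```python
from typing import List, Set


def resonance_correlation(itemsets: List[Set[str]], i: str, j: str) -> float:
    """Share of itemsets mentioning attraction i that also mention attraction j."""
    with_i = [s for s in itemsets if i in s]
    if not with_i:
        return 0.0  # attraction i never appears, so no association can be measured
    with_both = sum(1 for s in with_i if j in s)
    # S_IJ / S_I = (n_both / N) / (n_i / N) = n_both / n_i
    return with_both / len(with_i)


itemsets = [
    {"Hankou Riverside", "Yellow Crane Tower", "Yangtze River Bridge"},
    {"Yellow Crane Tower", "Yangtze River Bridge"},
    {"Yellow Crane Tower", "East Lake"},
]
rc_tower_bridge = resonance_correlation(itemsets, "Yellow Crane Tower", "Yangtze River Bridge")
rc_bridge_tower = resonance_correlation(itemsets, "Yangtze River Bridge", "Yellow Crane Tower")
```

Under this reading the measure is directional: the two values above differ because the tower appears in more itemsets than the bridge.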

The thematic-sentiment relationship encapsulates the similarity in emotional responses evoked by the core characteristics of an attraction, constituting an implicit semantic association. This association is quantified by constructing a sentiment-weighted thematic feature vector (TSV), which integrates topic distribution vectors with sentiment weight vectors, and assessing thematic-sentiment similarity (TS) via the JS divergence [43], reflecting the degree of alignment in affective experiences across attractions:

$$\mathit{TS}_{ij} = \mathrm{ReLU}\left(1 - \frac{\mathrm{JS}(\mathit{TSV}_i \,\|\, \mathit{TSV}_j)}{\theta_{ts}}\right), \quad \mathit{TSV}_i = \frac{\mathit{dist}_i \odot \mathit{weight}_i}{\|\mathit{dist}_i \odot \mathit{weight}_i\|_1}, \tag{13}$$

where $\mathit{TSV}_i$ is the sentiment-weighted thematic feature vector of attraction $i$. $\mathit{dist}_i$ is the Thematic Distribution Vector (TDV) of attraction $i$. $\mathit{weight}_i$ is the Sentiment Weight Vector (SWV) of attraction $i$. $\theta_{ts}$ is the thematic-sentiment distance threshold.
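Equation (13) can be illustrated with a short sketch. It assumes the Jensen–Shannon divergence is taken with base-2 logarithms, so that it lies in [0, 1]; the text does not state the base, and the threshold value used here is arbitrary. All function names are hypothetical.

```python
import math
from typing import List


def l1_normalise(vec: List[float]) -> List[float]:
    total = sum(vec)
    if total <= 0:
        raise ValueError("vector must have a positive sum")
    return [v / total for v in vec]


def build_tsv(dist: List[float], weight: List[float]) -> List[float]:
    """Element-wise product of TDV and SWV, rescaled to sum to 1."""
    return l1_normalise([d * w for d, w in zip(dist, weight)])


def js_divergence(p: List[float], q: List[float]) -> float:
    """Jensen-Shannon divergence with base-2 logarithms (an assumption)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a: List[float], b: List[float]) -> float:
        # Terms with a zero numerator contribute nothing to the sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def thematic_sentiment_similarity(
    tsv_i: List[float], tsv_j: List[float], theta_ts: float
) -> float:
    """ReLU(1 - JS / theta): zero once the divergence reaches the threshold."""
    return max(0.0, 1.0 - js_divergence(tsv_i, tsv_j) / theta_ts)


tsv_a = build_tsv([0.5, 0.3, 0.2], [2.0, 1.0, 1.0])
same = thematic_sentiment_similarity(tsv_a, tsv_a, theta_ts=0.5)
disjoint = thematic_sentiment_similarity([1.0, 0.0], [0.0, 1.0], theta_ts=0.5)
```

Identical profiles give a similarity of 1, while profiles with no shared themes reach the maximum divergence and are clipped to 0 by the ReLU.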

A multidimensional association quantification model is developed by integrating SP, RC, and TS, effectively transcending the constraints of traditional single-dimension approaches. The static spatial relationships are complemented by dynamic semantic associations derived from tourists’ cognitive and affective behaviors, enabling joint computation of spatial and semantic linkages among attractions and improving the model’s adaptability to behavioral dynamics. Additionally, an explicit weighting strategy is employed to construct the MAN-SRT framework, which lays the structural groundwork for subsequent travel route recommendation systems.

3.3. Hybrid Random Walk-Optimized Simulated Annealing for Tourism Sentiment Chain Modeling

Conventional travel route planning methodologies primarily rely on static associations derived from singular dimensions, such as spatial distances or textual co-occurrences, thus failing to capture the dynamic and multifaceted nature of tourist behavior. To address this gap, we propose the Hybrid Random Walk-Optimized Simulated Annealing for Tourism Sentiment Chain Modeling (RW-OSA-TSCM). This framework employs a dynamic weight adjustment mechanism [44] to facilitate the integrated modeling of SP, RC, and TS. By overcoming the static limitations inherent in traditional path planning techniques, RW-OSA-TSCM offers robust theoretical underpinnings for the design of personalized tourism experiences.

3.3.1. Tourism Sentiment Chain Representation

The Tourism Sentiment Chain (TSC) is a structured representation of tourists’ dynamic decision paths between attractions and is defined as follows:

$$\mathit{Chain} = \{\mathit{Path}, \mathit{CAIs}, \mathit{Force}\}, \quad \mathit{Path} = [\mathit{Attr}_0, \mathit{Attr}_1, \dots, \mathit{Attr}_i, \dots, \mathit{Attr}_{n_c}], \quad \mathit{CAIs} = [\mathit{CAI}_1, \mathit{CAI}_2, \dots, \mathit{CAI}_i, \dots, \mathit{CAI}_{n_c}], \quad \mathit{Force} = [\mathit{force}_1, \mathit{force}_2, \dots, \mathit{force}_i, \dots, \mathit{force}_{n_c}], \tag{14}$$

where $i$ is the attraction identifier. $n_c$ is the length of the TSC, that is, the number of attraction transitions. $\mathit{Path}$ is the attraction sequence of the chain, which represents the tour sequence from $\mathit{Attr}_0$ to $\mathit{Attr}_{n_c}$. $\mathit{CAIs}$ is the sequence of Comprehensive Attractiveness Index (CAI) values. $\mathit{Force}$ is the sequence of driving factors, in which $\mathit{force}_i$ is the dominant correlation type of the transition $\mathit{Attr}_{i-1} \to \mathit{Attr}_i$.

The proposed model facilitates the multidimensional dynamic feature modeling of tourist behavior by explicitly representing the spatial trajectories, transition probabilities, and underlying driving mechanisms of travel paths.

3.3.2. Tourism Sentiment Chain Generation Method

The RW-OSA-TSCM approach enables multidimensional association-driven travel route recommendations by integrating a dynamic weight adjustment mechanism with a global optimization strategy. The methodology is executed in two steps: generating random walk paths and optimizing these paths via an optimized simulated annealing algorithm, as shown in Figure 3. In path generation, Attraction Transition probabilities are dynamically computed using weighted SP, RC, and TS metrics, simulating tourist decision-making processes and forming a synthetic path pool. The Comprehensive Attractiveness Index (CAI), derived from selection frequencies, quantifies the likelihood of preference across attractions. During optimization, the objective function maximizes the cumulative CAI along a route. Leveraging a multi-neighborhood perturbation scheme and an adaptive cooling schedule, the algorithm performs a global search to identify the optimal Tourism Sentiment Chain (TSC).

(1): Path Generation via Random Walk

An initial travel itinerary is generated using a random walk (RW) process initiated from a selected attraction. At each step, the transition probability between nodes is calculated by integrating three interrelated dimensions weighted by adaptive coefficients:

$$
\mathit{prob}_{ij}^{(t)} = \alpha^{(t)} \cdot \mathit{SP}_{ij} + \beta^{(t)} \cdot \mathit{RC}_{ij} + \gamma^{(t)} \cdot \mathit{TS}_{ij},
\tag{15}
$$

where $\alpha^{(t)}$, $\beta^{(t)}$, and $\gamma^{(t)}$ are, respectively, the weights of the three correlation types when choosing the $t$-th attraction during a random walk. Their initial values are derived from tourist preference profiles or historical mobility patterns and are constrained so that $\alpha + \beta + \gamma = 1$. $\mathit{SP}_{ij}$, $\mathit{RC}_{ij}$, and $\mathit{TS}_{ij}$ are the quantitative measures of inter-attraction relationships introduced in Section 3.2.
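A single random walk step under Equation (15) can be sketched as follows. Turning the weighted scores into a sampling distribution by normalizing them over the candidate set is an assumption of this sketch, as are the function names and the toy SP, RC, and TS values.

```python
import random


def transition_scores(i, candidates, sp, rc, ts, alpha, beta, gamma):
    """Eq. (15): weighted sum of SP, RC, and TS for each candidate j."""
    return {
        j: alpha * sp.get((i, j), 0.0)
        + beta * rc.get((i, j), 0.0)
        + gamma * ts.get((i, j), 0.0)
        for j in candidates
    }


def sample_next(scores, rng):
    """Draw the next attraction in proportion to its score.

    Normalizing scores into a distribution is an assumption of this sketch.
    """
    positive = [(j, s) for j, s in scores.items() if s > 0]
    if not positive:
        return None  # dead end: no reachable candidate
    nodes, weights = zip(*positive)
    return rng.choices(nodes, weights=weights, k=1)[0]


# Toy data with hypothetical attractions "A", "B", "C".
sp = {("A", "B"): 0.8}
rc = {("A", "B"): 0.1, ("A", "C"): 0.6}
ts = {("A", "B"): 0.5, ("A", "C"): 0.2}
scores = transition_scores("A", ["B", "C"], sp, rc, ts, 0.5, 0.3, 0.2)
next_stop = sample_next(scores, random.Random(0))
```

With these toy values, candidate "B" scores 0.53 and "C" scores 0.22, so "B" is drawn more often but "C" remains reachable.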

The model dynamically adjusts these weights based on the dominant influence at each transition. The prevailing factor is determined by the following expression:

$$
\mathit{force}_{ij}^{(t)} = \arg\max \{\alpha^{(t)} \cdot \mathit{SP}_{ij},\ \beta^{(t)} \cdot \mathit{RC}_{ij},\ \gamma^{(t)} \cdot \mathit{TS}_{ij}\}.
\tag{16}
$$

For instance, if $\mathit{SP}$ emerges as the leading factor, the weights are recalibrated as follows:

$$
\alpha' = \alpha \times \delta, \quad \beta' = \beta + \varepsilon, \quad \gamma' = \gamma + \varepsilon,
\tag{17}
$$

where $\delta$ acts as a decay parameter and $\varepsilon$ serves as a redistribution coefficient. The updated weights are subsequently normalized to maintain unit sum. Similar update mechanisms apply when RC or TS prevail, thereby enabling real-time adaptation to shifting visitor preferences.
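The update in Equations (16) and (17) can be sketched as below. The values of `delta` and `eps` are hypothetical placeholders, since no specific settings are given here, and the dictionary keys are illustrative.

```python
def dominant_force(sp_ij, rc_ij, ts_ij, weights):
    """Eq. (16): the correlation type with the largest weighted contribution."""
    contrib = {
        "SP": weights["SP"] * sp_ij,
        "RC": weights["RC"] * rc_ij,
        "TS": weights["TS"] * ts_ij,
    }
    return max(contrib, key=contrib.get)


def update_weights(weights, force, delta=0.9, eps=0.05):
    """Eq. (17): decay the dominant weight, boost the others, renormalize.

    delta and eps are assumed values, not taken from the paper.
    """
    raw = {k: (w * delta if k == force else w + eps) for k, w in weights.items()}
    total = sum(raw.values())
    return {k: w / total for k, w in raw.items()}


weights = {"SP": 1 / 3, "RC": 1 / 3, "TS": 1 / 3}
force = dominant_force(0.8, 0.1, 0.5, weights)
new_weights = update_weights(weights, force)
```

Decaying the dominant weight keeps a single dimension from driving every subsequent transition, which is consistent with the stated goal of adapting to shifting preferences.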

Through multiple RW iterations, a diversified set of candidate itineraries is constructed into a path repository. A Comprehensive Attractiveness Index (CAI) is then derived by computing the relative frequency of transitions between attraction pairs:

$$
\mathit{CAI}_{ij} = \frac{\mathit{Count}(i \to j)}{\sum_{k \in N(i)} \mathit{Count}(i \to k)},
\tag{18}
$$

where $\mathit{Count}(i \to j)$ denotes the number of times $j$ follows $i$ in the path repository, and $N(i)$ represents the set of attractions that follow $i$ across all paths involving $i$.
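Equation (18) amounts to normalizing transition counts by the outgoing total of each attraction. A minimal sketch, with a hypothetical three-path repository:

```python
from collections import Counter, defaultdict


def compute_cai(paths):
    """Eq. (18): relative frequency of each transition i -> j in the repository."""
    counts = Counter()
    for path in paths:
        for a, b in zip(path, path[1:]):
            counts[(a, b)] += 1
    # Denominator: all transitions leaving attraction i.
    out_totals = defaultdict(int)
    for (a, _), c in counts.items():
        out_totals[a] += c
    return {(a, b): c / out_totals[a] for (a, b), c in counts.items()}


# Hypothetical path repository.
repository = [["A", "B", "C"], ["A", "C", "B"], ["A", "B", "D"]]
cai = compute_cai(repository)
```

In this toy repository "A" is followed by "B" twice and by "C" once, so `cai[("A", "B")]` is 2/3 and `cai[("A", "C")]` is 1/3.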

(2): Path Optimization via Optimized Simulated Annealing

To mitigate premature convergence in conventional route planning and elevate the global quality of Tourism Sentiment Chains, the Optimized Simulated Annealing (OSA) is proposed, which combines a dynamic temperature control scheme with a multi-neighborhood perturbation strategy to iteratively refine candidate itineraries derived from random walks.

Within the OSA framework, path quality is quantified using an objective function built upon the Comprehensive Attractiveness Index (CAI). A multi-mode neighborhood search is introduced to balance exploration and exploitation, thereby enhancing both solution diversity and local refinement. Structural integrity is preserved through constraint enforcement aligned with the MAN-SRT graph model. Temperature reduction follows an exponential decay trajectory. Mathematically, let the current path be represented as follows:

$$
P^{(k-1)} = [\mathit{Attr}_{0}, \mathit{Attr}_{1}, \dots, \mathit{Attr}_{i}, \dots, \mathit{Attr}_{n_{c}}],
\tag{19}
$$

where $P^{(k-1)}$ is the path before the $k$-th iteration and $\mathit{Attr}_{i}$ denotes the $i$-th visited attraction apart from the starting one. The path score $\mathit{Score}(P)$ is defined as the sum of the CAI values between all adjacent attractions along the path:

$$
\mathit{Score}(P) = \sum_{i=1}^{n_{c}} \mathit{CAI}_{i-1,i}.
\tag{20}
$$

To mitigate the risk of converging to local optima, we implement a multi-mode neighborhood search strategy that generates diverse solution variants at each iteration—such as Swap, Reverse, and Replace mutations—each embodying a distinct path-adjustment mechanism. This approach enhances both the diversity and adaptability of the evolutionary process. Additionally, path validity constraints are enforced to guarantee that all derived routes conform to the connectivity requirements of the MAN-SRT graph structure.

$$
P' = \mathit{Strategy}(P^{(k-1)}, i, j), \quad \mathit{Strategy} \in \{\mathit{Swap}, \mathit{Reverse}, \mathit{Replace}\},
\tag{21}
$$

where $\mathit{Strategy}$ represents the mutation method at the $k$-th iteration, and $i$ and $j$ are the mutation positions. In terms of temperature control, OSA adopts exponential decay to reduce the temperature:

$$
T^{(k)} = T^{(k-1)} \cdot \eta,
\tag{22}
$$

where $T^{(k)}$ represents the temperature after the $k$-th iteration and $\eta \in (0, 1)$ denotes the cooling rate. The acceptance probability of the new path is calculated based on the Metropolis criterion:

$$
\mathit{acce}_{pro} =
\begin{cases}
1, & \text{if } \mathit{Score}(P') > \mathit{Score}(P^{(k-1)}), \\
\exp\left(\dfrac{\Delta \mathit{Score}}{T}\right), & \text{otherwise},
\end{cases}
\tag{23}
$$

where $\Delta \mathit{Score} = \mathit{Score}(P') - \mathit{Score}(P^{(k-1)})$. When the new score is higher, the new path is accepted unconditionally, that is, $P^{(k)} = P'$. Otherwise, the poorer solution is accepted with probability $\exp(\Delta \mathit{Score} / T)$, which allows the search to escape local optima.

The algorithm terminates upon reaching either a predefined minimum temperature $\theta_{T}$ or the iteration limit. The highest-scoring itinerary is designated as the TSC, representing the final route recommendation. The TSC formulation simultaneously accounts for geographical accessibility, thematic relevance, and affective alignment with traveler sentiment. Leveraging the adaptive weight mechanism, the system supports real-time responsiveness to preference shifts, thereby enabling highly personalized and emotionally resonant travel experiences.
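Equations (20) to (23) and the termination rule can be combined into one loop, sketched below under stated assumptions. The defaults `t0=5.0`, `eta=0.95`, and `max_iter=1000` follow the settings reported in Section 4.4.2; the minimum temperature `t_min`, the validity rule (no repeated attraction, every hop present in the CAI dictionary), the uniform choice among the three moves, and all function names are assumptions of this sketch rather than details confirmed by the text.

```python
import math
import random


def score(path, cai):
    """Eq. (20): sum of CAI over adjacent pairs; missing edges count as 0."""
    return sum(cai.get((a, b), 0.0) for a, b in zip(path, path[1:]))


def is_valid(path, cai):
    """Assumed validity rule: no repeats and every hop is an edge of the graph."""
    return len(set(path)) == len(path) and all(
        (a, b) in cai for a, b in zip(path, path[1:])
    )


def neighbor(path, nodes, rng):
    """Eq. (21): apply Swap, Reverse, or Replace; position 0 (the start) is kept."""
    new = list(path)
    i, j = sorted(rng.sample(range(1, len(path)), 2))
    move = rng.choice(["swap", "reverse", "replace"])
    if move == "swap":
        new[i], new[j] = new[j], new[i]
    elif move == "reverse":
        new[i:j + 1] = reversed(new[i:j + 1])
    else:
        unused = [n for n in nodes if n not in new]
        if unused:
            new[i] = rng.choice(unused)
    return new


def osa(initial, cai, nodes, t0=5.0, eta=0.95, t_min=1e-3, max_iter=1000, seed=0):
    """Simulated annealing over paths; returns the best path seen."""
    rng = random.Random(seed)
    current, best = list(initial), list(initial)
    temp = t0
    for _ in range(max_iter):
        if temp < t_min:  # minimum-temperature stop (theta_T in the text)
            break
        cand = neighbor(current, nodes, rng)
        if is_valid(cand, cai):
            delta = score(cand, cai) - score(current, cai)
            # Eq. (23): always accept improvements, otherwise with exp(delta / T).
            if delta > 0 or rng.random() < math.exp(delta / temp):
                current = cand
                if score(current, cai) > score(best, cai):
                    best = list(current)
        temp *= eta  # Eq. (22): exponential cooling
    return best
```

Because the best path seen so far is tracked separately from the current path, accepting a poorer solution never lowers the score of the returned chain. The path must contain at least three attractions for two mutation positions to exist.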

4. Results

4.1. Data Collection

This study takes the 178 top-ranked tourist attractions in Wuhan, China, on the Ctrip travel platform (https://www.ctrip.com/) as research subjects, as shown in Figure 4. As a major city in central China, Wuhan boasts abundant natural and cultural tourism resources, including historical landmarks, natural scenery, and modern urban landscapes. Additionally, residents’ high living standards and strong tourism consumption capacity contribute to a vibrant tourism market. Consequently, this dynamic market enables researchers to collect extensive tourist experience data and diverse visitor feedback through various tourism platforms.

(1): Geospatial Data

Geographic coordinates and detailed addresses for tourist attractions are obtained using the Amap keyword query API. Furthermore, data on road distances, travel times, and costs between attractions are collected through the path planning API.

(2): Tourism Review Text Data

Introductory descriptions, review texts, review dates, and ratings for attractions are gathered from the Ctrip travel platform. Reviews posted between January 2020 and October 2024 underwent preprocessing steps, including removal of HTML tags and emojis, conversion from traditional to simplified Chinese, and elimination of irrelevant content. This resulted in a final dataset of 58,051 cleaned reviews. The number of reviews per attraction varied significantly based on popularity, ranging from 15 to 5000, with an average of 357 reviews per attraction (see Figure 5a). Review lengths ranged from 4 to 300 Chinese characters, averaging 29 characters (see Figure 5b).
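Two of the preprocessing steps above, HTML tag removal and emoji removal, can be sketched with regular expressions. The emoji character ranges below are an assumed, non-exhaustive selection, the function name is hypothetical, and the traditional-to-simplified conversion is omitted because it requires a dedicated conversion library.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")
# Assumed emoji ranges: common pictograph blocks only, not an exhaustive list.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F]")


def clean_review(text):
    """Strip HTML tags and emojis, then collapse runs of whitespace."""
    text = TAG_RE.sub("", text)
    text = EMOJI_RE.sub("", text)
    return re.sub(r"\s+", " ", text).strip()


cleaned = clean_review("<p>Great view\U0001F600</p>  worth a visit")
```

For the sample string, `cleaned` is `"Great view worth a visit"`.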

4.2. Attribute Extraction and Analysis of Tourist Attractions

This section derives semantic features of tourist attractions by analyzing tourists’ opinions and affective orientations embedded in online travel reviews. Leveraging the GSPM-TRTE framework developed in Section 3.1.2, viewpoint phrases are systematically extracted and analyzed from the textual corpus. Utilizing their frequency distributions, both the TDV and SWV for each attraction are quantitatively characterized.

4.2.1. Extraction of Textual Entity Information

The analysis of tourism review texts is conducted in three sequential steps: (i) identification and extraction of characteristic words; (ii) composition and extraction of viewpoint phrases; and (iii) thematic classification of viewpoint phrases.

(1): Identification and Extraction of Characteristic Words

Drawing from typical sentiment expression patterns observed in tourist reviews, this study categorizes characteristic words into five distinct classes: Object, Property, Degree, Action, and Other. Utilizing the BIO labeling scheme, a corpus of 3000 non-redundant reviews was compiled through stratified sampling and divided into training, validation, and test sets at a 6:2:2 ratio. Drawing on the GSPM-TRTE methodology proposed in Section 3.1.2, a BERT-BiLSTM-CRF model architecture is implemented. Empirical analysis reveals model convergence within 12–15 training epochs; hence, the training process is extended to 20 epochs to ensure full optimization. Performance evaluation is conducted on the test set, with results summarized in Table 1.
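To make the BIO labeling scheme concrete, the sketch below converts annotated spans into token-level tags. The tokenization, the English example sentence, and the exact tag strings are illustrative assumptions; the annotated corpus itself is in Chinese.

```python
def bio_tags(n_tokens, spans):
    """Convert (start, end, label) spans, with end exclusive, into BIO tags."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        tags[start] = f"B-{label}"  # first token of the span
        for k in range(start + 1, end):
            tags[k] = f"I-{label}"  # continuation tokens
    return tags


# Hypothetical example using three of the five characteristic-word classes.
tokens = ["East", "Lake", "is", "very", "beautiful"]
spans = [(0, 2, "Object"), (3, 4, "Degree"), (4, 5, "Property")]
tags = bio_tags(len(tokens), spans)
```

The resulting tags are `["B-Object", "I-Object", "O", "B-Degree", "B-Property"]`, which is the kind of sequence a BERT-BiLSTM-CRF tagger is trained to predict.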

The evaluation results show that the model performs strongly in extracting all characteristic word categories, with F1-scores ranging from 0.85 to 0.96. In particular, it achieves high precision (0.94–0.98) for the Property, Degree, and Action types, indicating reliable identification. In contrast, the Object and Other categories exhibit relatively lower F1-scores due to the presence of specific product names related to attraction features. The balanced micro-average F1-score of 0.89 suggests robust overall performance. Precise Object and Property identification (F1 ≥ 0.87) enables accurate subject–attribute pairing in viewpoint mining.

(2): Composition and Extraction of Viewpoint Phrases

Based on the syntactic and semantic structures of opinion phrases in travel review texts, we designed mapping rules between syntactic structures and viewpoint phrase combinations according to the GSPM-TRTE method described in Section 3.1.2. The extracted characteristic words serve as an external dictionary, and the HanLP toolkit performs syntactic analysis on the review texts to extract viewpoint phrases. The extracted results were compared with manually annotated benchmarks, and the performance on selected attraction datasets is summarized in Table 2.

Evaluation results indicate that the model performs well on the overall dataset, effectively covering most real-world instances of opinion phrases. The F1-scores across different datasets are relatively close and all exceed 0.8, suggesting that the rule-based design is robust to imbalanced sample distributions. The weighted recall (0.9455) is slightly higher than the weighted precision (0.8949), indicating that the system prioritizes high coverage to capture as many potential opinion phrases as possible, at the cost of a small number of false positives. A weighted accuracy of 0.9190 demonstrates that the proposed HanLP-based syntactic rule approach effectively extracts tourists’ opinions from travel reviews, laying a solid foundation for subsequent semantic feature analysis of scenic attractions.

(3): Thematic Classification of Viewpoint Phrases

Based on the “4E” model of tourism experience [45], the thematic types of viewpoints are categorized into ten categories: “Transportation and Travel”, “Environment and Landscape”, “Services and Facilities”, “Food and Shopping”, “Price and Consumption”, “Crowd Flow and Density”, “History and Culture”, “Family and Education”, “Animals and Plants”, and “Leisure and Entertainment”. Given that the core terms within viewpoint phrases strongly indicate their thematic affiliations, we annotated topic labels for high-frequency core words (appearing over 10 times across all attractions), resulting in a labeled dataset of 88,145 viewpoint phrases. Based on the GSPM-TRTE methodology outlined in Section 3.1.2, a PyTorch-based text classification model was developed to enable automatic topic identification of viewpoint phrases extracted from tourism reviews. The model is evaluated on the test set, with the accuracy rates detailed in Table 3.

The results show that the model performs excellently in most categories. Specifically, the categories “Transportation and Travel”, “Environment and Landscape”, and “Price and Consumption” exhibit high F1 scores of 0.9089, 0.9388, and 0.9651, respectively. These categories have lower lexical diversity and focus on core vocabulary, allowing the model to more accurately capture their characteristics. In contrast, the categories “History and Culture”, “Services and Facilities”, and “Animals and Plants” have slightly lower classification accuracy, with F1 scores of 0.8118, 0.8894, and 0.8908, respectively. These categories present greater challenges for the model due to higher lexical diversity and significant differences in characteristic words across attractions. Overall, the model achieves a weighted average F1 score of 0.9181, indicating its effectiveness in identifying thematic categories of viewpoints and providing a solid foundation for subsequent semantic association tasks.

4.2.2. Attraction Entity Attribute Analysis

(1): Thematic Features

Following the extraction of textual entity features, an analysis was conducted on the distribution ratios of various types of opinion phrases present in the review texts for all tourist attractions under investigation, as illustrated in Figure 6. “Environment and Landscape”, “Services and Facilities”, and “Leisure and Entertainment” are the primary themes of interest for tourists. In contrast, “Transportation and Travel”, “Food and Shopping”, and “Price and Consumption” receive less attention.

After extracting thematic features of tourist attractions using Equation (4) from Section 3.1.2, the thematic distribution vectors of the top nine attractions in Wuhan’s popular attractions list were analyzed to check that the indicators align with cognitive expectations. In each subplot of Figure 7, the sector azimuth encodes the ten thematic dimensions of the attraction, representing the multidimensional experiential quality perceived by visitors, and the radial length quantifies the proportion of visitor attention (%) allocated to each theme. Both the Yellow Crane Tower and Qingchuan Pavilion show over 25% attention toward the “History and Culture” theme, with attention levels hovering around 10% across other dimensions, which highlights the “History and Culture” experience as their dominant appeal. Likewise, Happy Valley and Maya Beach Water Park emphasize “Leisure and Entertainment”. Wuhan University, Haichang Polar Ocean World, and East Lake Ocean Paradise focus on “Family and Education” and “Animals and Plants”. Reviews of Wuhan University discuss “History and Culture” more than those of the other two, while East Lake Ocean Paradise is more noted for “Price and Consumption”. Wuhan Garden Expo Park and Mulan Prairie show balanced performances across themes without any dominant one. The construction and analysis of thematic distribution vectors yield results that are highly consistent with the actual functional positioning and tourist perceptions of attractions. Significant differences among attractions indicate that thematic distribution vectors effectively capture core thematic features and subtle variations.

(2): Sentimental Features

Upon extraction of the textual entity feature from tourism review texts, an analysis was subsequently performed on the proportion of different sentiment polarity types of opinion phrases present in the review texts for all tourist attractions under investigation, as depicted in Figure 8. Globally, positive sentiments account for over 70% of the “Animals and Plants”, “Family and Education”, and “Price and Consumption” categories, indicating high satisfaction with natural resources, good family experiences, and reasonable pricing in Wuhan. However, positive sentiments account for only 23% of the “Crowd Flow and Density” category, suggesting frequent dissatisfaction due to overcrowding or long waiting times. Positive sentiments in the “Services and Facilities” category are also low at 50%, indicating inconsistent service quality requiring improvement.

To assess variations in tourists’ emotional inclinations across different thematic dimensions of attractions, we employ the Emotional Superiority Index (ESI), which quantifies the deviation between an individual attraction’s sentiment weight vector (SWV) and the average SWV of all attractions. Following the extraction of thematic characteristics for scenic areas using the methodology detailed in Section 3.1.2, we analyzed the ESI for the top nine attractions listed on the popular Wuhan scenic spots ranking. The analysis results are presented in Figure 9. For each subplot, the sector azimuth encodes the ten thematic dimensions of the attraction, representing the multidimensional experiential qualities perceived by visitors. The radial length signifies the intensity of emotional superiority, where greater absolute values denote more pronounced emotional leanings—positive values reflect favorable sentiments, whereas negative values indicate unfavorable biases. Notably, all examined popular attractions demonstrate positive thematic emotional tendencies, with Wuhan University exhibiting the highest ESI, particularly in “Transportation and Travel” and “History and Culture”, aligning with its convenient metro access and rich historical heritage. Other attractions show varying sentiment advantages in different themes. For instance, Maya Beach Water Park and Qingchuan Pavilion excel in “Transportation and Travel” and “Services and Facilities”. East Lake Ocean Paradise stands out in “Price and Consumption” and “Leisure and Entertainment”. Mulan Prairie only exceeds the average in “Environment and Landscape”, indicating a need for improvements in other aspects. Tourists interested in “History and Culture” or “Services and Facilities” may prefer excluding Mulan Prairie from their travel plans to avoid suboptimal experiences.
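The exact ESI formula belongs to Section 3.1.2 and is not restated here. The sketch below therefore assumes the simplest reading of the description above: the ESI of an attraction is its SWV minus the mean SWV over all attractions, dimension by dimension. The function name and toy vectors are hypothetical.

```python
def emotional_superiority(swv_by_attraction):
    """Assumed ESI: per-theme deviation of each SWV from the mean SWV."""
    names = list(swv_by_attraction)
    dims = len(swv_by_attraction[names[0]])
    mean = [
        sum(swv_by_attraction[n][d] for n in names) / len(names)
        for d in range(dims)
    ]
    return {
        n: [swv_by_attraction[n][d] - mean[d] for d in range(dims)]
        for n in names
    }


# Two hypothetical attractions with two thematic dimensions each.
esi = emotional_superiority({"A": [0.8, 0.2], "B": [0.4, 0.6]})
```

Under this reading, a positive entry marks a theme on which the attraction is perceived more favorably than average, and a negative entry marks the opposite.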

4.3. Construction and Analysis of Multi-Dimensional Correlation Network of Tourist Attractions

Upon acquiring the spatial and semantic features of attractions, the strength of inter-attraction associations is quantified using the formulas detailed in Section 3.2. This step supports the construction of the Tourist Attraction Multidimensional Association Network (MAN-SRT). Subsequent analysis of the network’s structural characteristics aims to uncover underlying connections and patterns among tourist attractions, thereby providing robust data support for personalized recommendation systems.

4.3.1. Quantification of Multi-Dimensional Correlation Strength

Referring to the “2024 Annual Commuting Monitoring Report of China’s Major Cities” [46], 5 km and 20 min are set as the thresholds for commuting distance and time. Under these thresholds, 2110 proximity relationships are obtained, which account for about 7% of the fully connected network.
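The reported share can be checked with a short calculation. The sketch assumes the fully connected network is counted over ordered pairs of the 178 attractions, which is the reading consistent with the stated percentage; the variable and function names are illustrative.

```python
N_ATTRACTIONS = 178
# Assumption: the fully connected network counts ordered pairs (i, j), i != j.
ORDERED_PAIRS = N_ATTRACTIONS * (N_ATTRACTIONS - 1)  # 31,506


def share_of_full_network(n_relations):
    """Fraction of all ordered attraction pairs covered by n_relations."""
    return n_relations / ORDERED_PAIRS


sp_share = share_of_full_network(2110)
```

This gives a share of roughly 6.7%, in line with the "about 7%" reported for the spatial proximity relationships. The same denominator also reproduces the shares reported for the resonance and thematic-sentiment relationships in this subsection.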

Based on the review texts and their entity recognition results, a dictionary of attraction names and keywords is constructed. Association and co-occurrence frequencies are then extracted using regular expression matching. Finally, the association resonance degree between each pair of attractions is calculated. To mine fine-grained association relations, low-frequency associations are also taken into consideration, and the minimum support (Support) is set to 0.0001. To ensure practical value, the minimum lift (Lift) is set to 2.0, and the confidence (Confidence) of the association rules is taken as the resonance degree of the associations. This yields 836 association relations, accounting for about 2.6% of the fully connected network.
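The filtering described above uses the standard association rule measures. The sketch below applies their textbook definitions with the stated thresholds; treating each review as one transaction, along with the function names and toy counts, is an assumption.

```python
def rule_metrics(n_transactions, count_a, count_b, count_ab):
    """Support, confidence, and lift of the rule A -> B (standard definitions)."""
    support = count_ab / n_transactions
    confidence = count_ab / count_a
    lift = confidence / (count_b / n_transactions)
    return support, confidence, lift


def keep_rule(support, lift, min_support=0.0001, min_lift=2.0):
    """Thresholds from the text: Support >= 0.0001 and Lift >= 2.0."""
    return support >= min_support and lift >= min_lift


# Hypothetical counts: A mentioned 200 times, B 100 times, both together 20 times.
support, confidence, lift = rule_metrics(10_000, 200, 100, 20)
resonance = confidence if keep_rule(support, lift) else 0.0
```

With these toy counts the support is 0.002, the confidence is 0.1, and the lift is about 10, so the rule is kept and its confidence becomes the resonance degree.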

The Jensen–Shannon (JS) divergence quantifies the dissimilarity between two sentiment-weighted thematic feature vectors, followed by computing thematic-sentiment similarity via Equation (13) in Section 3.2. Given that POI types inherently reflect the functional essence of attractions and tourists’ primary visitation intentions, and noting the long-tailed distribution of global JS divergence values, the similarity threshold is derived based on intra-POI homogeneity. This strategy ensures semantic consistency while preserving inter-category heterogeneity [47]. The average Jensen–Shannon divergence among attractions with the same scenic POI type serves as the thematic-sentiment distance threshold to calculate the thematic-sentiment similarity between attractions. Consequently, 10,684 thematic-sentiment correlations are identified, accounting for about 34% of the fully connected network.
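The Jensen–Shannon divergence itself has a standard definition, sketched below in pure Python with base-2 logarithms so that values fall in [0, 1]. The sketch assumes non-negative input vectors and normalizes them to sum to one; the mapping from divergence to similarity in Equation (13) is not reproduced here.

```python
import math


def _kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two non-negative vectors."""
    sum_p, sum_q = sum(p), sum(q)
    p = [x / sum_p for x in p]
    q = [x / sum_q for x in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]  # mixture distribution
    return 0.5 * _kl(p, m) + 0.5 * _kl(q, m)


identical = js_divergence([0.2, 0.3, 0.5], [0.2, 0.3, 0.5])
disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])
```

Identical vectors give a divergence of 0 and vectors with disjoint support give 1, so a threshold taken from the average intra-type divergence sits between these two extremes.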

4.3.2. Analysis of the Structural Characteristics of MAN-SRT

Within the Tourist Attraction Multidimensional Association Network (MAN-SRT), the spatial proximity network, resonance correlation network, and thematic sentiment network interlace via their distinct structural features and functional properties, producing a notable synergy. Figure 10 delineates the network topology across these dimensions. The spatial proximity network, characterized by SP edge weights, encapsulates spatial clustering tendencies constrained by geographic distances. The resonance correlation network, defined by RC edge weights, reflects cross-regional semantic linkages derived from tourists’ cognitive associations. The thematic sentiment network, quantified through TS edge weights, elucidates thematic similarities driven by affective dynamics.

To demonstrate the morphological heterogeneity and functional complementarity of the multidimensional network, we performed a structural analysis on the three thematic layers of attraction associations (Table 4). Fundamental metrics (such as edge density and average path length) quantify network connectivity efficiency, thereby indicating the level of regional tourism integration. Centrality indices identify key hubs and core attractions, while modularity scores assess the statistical significance of functional communities, uncovering emergent spatial or semantic groupings within the tourism network. For each dimensional layer, we derived a composite influence score for each attraction through a weighted aggregation of multiple centrality measures. Subsequent structural analysis was conducted to further interpret the topological characteristics and functional roles of each network dimension.
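A few of the fundamental metrics named above can be computed without any graph library. The sketch below treats a layer as an undirected, unweighted graph with unique edges, which is an assumption, and covers only edge density, degree centrality, and the number of connected subgraphs; the composite influence score is not reproduced because its weights are not given here.

```python
from collections import defaultdict


def basic_metrics(nodes, edges):
    """Edge density, degree centrality, and connected-component count."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    n = len(nodes)
    density = 2 * len(edges) / (n * (n - 1))
    degree_centrality = {v: len(adj[v]) / (n - 1) for v in nodes}
    # Count connected components with an iterative depth-first search.
    seen, components = set(), 0
    for v in nodes:
        if v in seen:
            continue
        components += 1
        stack = [v]
        while stack:
            u = stack.pop()
            if u in seen:
                continue
            seen.add(u)
            stack.extend(adj[u] - seen)
    return density, degree_centrality, components


# Hypothetical four-node layer in which "D" is isolated.
density, centrality, components = basic_metrics(
    ["A", "B", "C", "D"], [("A", "B"), ("B", "C")]
)
```

In the toy layer the density is 1/3, node "B" has the highest degree centrality, and the isolated node yields a second connected subgraph.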

(1): Spatial Proximity Network

In Figure 10a, the spatial proximity network primarily captures the geographical accessibility among attractions, reflecting tourists’ propensity to select travel routes based on transportation convenience and road distances. Table 5 identifies the top 10 attractions based on composite influence scores within the spatial proximity network. Hongshan Square leads with a score of 0.7257, underscoring its role as a pivotal hub node. It also functions as an interchange station for Metro Lines 2 and 4, positioning it as a critical transportation nexus in urban infrastructure. The majority of these top 10 attractions are concentrated in the central urban district and distributed along the banks of the Yangtze River, corroborating the influence of the “confluence of two rivers” geographic configuration on the spatial organization of tourism activities [48].

The spatial proximity network exhibits high modularity and global clustering coefficients, indicating multiple distinct communities within which attractions are strongly connected, that is, localized clustering. However, with a low average degree centrality, a high network diameter, and 12 connected subgraphs, this network shows considerable dispersion, as most attractions are linked to only a few neighboring locations. The extensive geographical spread of Wuhan limits the accuracy of tourism behavior modeling when route planning relies on road network distances alone.

(2): Resonance Correlation Network

Figure 10b depicts the resonance correlation network, illustrating the semantic co-occurrence relations and tourists’ subjective cognitive associations among attractions, thereby revealing tourists’ mental perceptions of these sites. Table 6 highlights the top 10 attractions by composite influence within this network. Hankou Riverside tops the list with a score of 0.6993, establishing itself as a pivotal node in the resonance correlation network. As a critical component of Wuhan’s riverside scenic belt, it serves both as a favored recreational area for residents and the primary choice for tourists seeking to experience the city’s urban charm. East Lake, scoring 0.5984, ranks second, renowned for its expansive lake and picturesque landscapes, while also being steeped in cultural heritage, including landmarks like the Qu Yuan Shrine and Chu Tian Terrace, underscoring its significance in tourist experiences. Jianghan Road Pedestrian Street, positioned third, emphasizes its role as the heart of Wuhan’s commercial center.

The resonance correlation network exhibits high modularity but low global clustering coefficients, suggesting pronounced community structures with weak internal connectivity and minimal inter-community links, manifesting a notable “semantic isolation”. This indicates that attractions linked by resonance correlation form distinct semantic clusters. Extremely low average degree centrality and long average shortest path lengths signify a loosely connected network with inefficient information dissemination, severely limiting its utility for broad recommendation systems or path planning tasks.

(3): Thematic Sentiment Network

In Figure 10c, the thematic sentiment network underscores thematic coherence and emotional resonance among attractions, emphasizing the continuity and coherence of tourists’ emotional experiences. Table 7 identifies the top 10 attractions based on composite influence within this network. Wuhan Garden Expo Park, Hankou Riverside, and Zhongshan Park rank prominently due to their multidimensional thematic and emotional representations. High rankings of sites like Jiuzhen Mountain and Happy Jungle Cherry Blossom Theme Park indicate a prevalence of natural landscapes and leisure activities as dominant themes across Wuhan. Nevertheless, attractions such as Jiuzhen Mountain, Jinligou, and Yaojia Mountain, owing to their remote geographical locations, may not be ideal candidates for unconditional inclusion in tourism route recommendations. Thus, while thematic sentiment connections effectively capture experiential consistency among attractions, they should not serve as the sole criterion for tourism route planning.

High average degree centrality and global clustering coefficients, combined with exceptionally short average shortest path lengths, demonstrate a densely interconnected network structure. Low modularity and degree distribution power-law exponents indicate ambiguous community boundaries, characterized by substantial overlap and thematic–emotional similarities across attractions, with indistinct divisions between thematic categories.

A comparative structural analysis of the three networks reveals several key insights. The spatial proximity network demonstrates limited connectivity, constraining its applicability in modeling cross-regional tourist behavior. While the resonance correlation network captures subjective cognitive patterns among tourists, its sparse connectivity hampers performance in large-scale recommendation tasks. Conversely, the thematic sentiment network excels in generating immersive and emotionally coherent travel routes, offering distinct advantages in affective sequence modeling and personalized recommendation systems.

Notably, the geographic constraints embedded in the spatial proximity network provide a foundational framework for the resonance correlation network, ensuring that cognitive associations remain grounded in real-world transportation feasibility. Furthermore, the high connectivity of the thematic sentiment network effectively bridges the 12 disconnected subgraphs identified in the spatial proximity network through thematic–emotional similarity, thereby introducing affective drivers into cross-regional route formation. Crucially, the interplay between the resonance correlation network’s individual-level cognitive preferences and the thematic sentiment network’s population-level emotional consensus establishes a complementary micro–macro analytical framework. This dual-perspective integration facilitates group-aware path planning that harmonizes individual behavioral tendencies with collective experiential trends. Accordingly, the development of a multi-dimensionally driven tourism emotion system offers significant potential for enhancing tourist experiences while optimizing resource allocation across tourism destinations.

4.4. Tourism Sentiment Chain Generation and Scenario Validation

Building upon the TSC generation methodology outlined in Section 3.3.2, this section first employs an RW-based approach to route generation, computing the integrated attractiveness between attractions to construct the TSC through OSA. The adjacency relationships between attractions in the resulting TSC are then benchmarked against attraction recommendations from two major online platforms, “Qunar” and “Mafengwo”. This comparative analysis highlights the superiority of TSC in two key aspects: geographical accessibility and emotional experience coherence.

4.4.1. Path Generation and Comprehensive Attractiveness Calculation Based on RW

Adhering to the methodology detailed in Section 3.3.2 (1), each attraction is sequentially designated as the starting point with uniform initialization parameters ($\alpha = \beta = \gamma = \frac{1}{3}$). Utilizing the mean diameter of the spatial proximity and resonance correlation networks (set at 10) as the path length, 1000 random walk simulations are conducted to mitigate stochastic variability and ensure statistical significance. This process yielded 2577 Comprehensive Attractiveness Index (CAI) relationships across 178 attractions, as illustrated in Figure 11. Constructing a CAI association network with attractions as nodes and CAI as edge weights, we performed an exhaustive network characteristic analysis, summarized in Table 8.

The CAI association network exhibits markedly higher connection density compared to the spatial proximity and resonance correlation networks, underscoring a richer and more diversified interconnection landscape among scenic spots. With a modularity value of 0.5540, the network maintains discernible community structures. The average betweenness centrality of 0.0106 indicates the absence of dominant hub nodes, suggesting a balanced nodal hierarchy and enhanced network resilience. The degree distribution power-law exponent of 0.4143 reveals the existence of a limited number of highly connected core nodes, confirming the network’s scale-free nature without forming a fully connected graph.

By synthesizing spatial proximity, resonance correlation, and thematic sentiment similarity, the CAI network effectively addresses the limitations inherent in single-dimensional frameworks. It preserves local clustering tendencies while ensuring path connectivity and mitigating excessive segregation. Consequently, this integrative approach enhances the comprehensiveness and coherence of TSC recommendations, providing a robust foundation for optimizing tourist experiences and resource allocation.

4.4.2. Tourism Sentiment Chain Realization and Comparative Analysis

Following the procedure outlined in Section 3.3.2 (2), we configured the simulated annealing parameters with an initial temperature of 5, a cooling rate of 0.95, and 1000 iterations. This setup favors broad global exploration in the early stages of optimization while allowing gradual convergence as the temperature falls. Given that short-distance itineraries typically encompass 4–8 attractions and that the CAI network has a diameter of 8, we employed the optimized simulated annealing (OSA) method to generate TSCs of fixed length 8. Figure 12 shows the TSC starting from the Yellow Crane Tower as an example; the chain is spatially continuous, which supports its accessibility.
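To make the annealing configuration concrete, the following is a minimal sketch of plain simulated annealing over fixed-length chains, using the stated parameters (initial temperature 5, cooling rate 0.95, 1000 iterations, length 8). It is not the OSA procedure of Section 3.3.2 (2): the objective (sum of CAI between consecutive attractions), the replace and swap moves, and all names and toy data are assumptions for illustration.

```python
import math
import random

def chain_score(chain, cai):
    """Sum of CAI weights between consecutive attractions (missing edges score 0)."""
    return sum(cai.get(frozenset(pair), 0.0) for pair in zip(chain, chain[1:]))

def anneal_chain(start, candidates, cai, length=8, t0=5.0, cooling=0.95,
                 iters=1000, seed=0):
    rng = random.Random(seed)
    pool = [c for c in candidates if c != start]
    current = [start] + rng.sample(pool, length - 1)
    cur_score = chain_score(current, cai)
    best, best_score = current[:], cur_score
    temp = t0
    for _ in range(iters):
        proposal = current[:]
        i = rng.randrange(1, length)              # position 0 (the start) is fixed
        unused = [c for c in pool if c not in proposal]
        if unused and rng.random() < 0.5:
            proposal[i] = rng.choice(unused)      # move 1: replace one stop
        else:
            j = rng.randrange(1, length)          # move 2: swap two stops
            proposal[i], proposal[j] = proposal[j], proposal[i]
        new_score = chain_score(proposal, cai)
        delta = new_score - cur_score
        # Always accept improvements; accept worse chains with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, cur_score = proposal, new_score
            if cur_score > best_score:
                best, best_score = current[:], cur_score
        temp *= cooling                           # geometric cooling schedule
    return best, best_score

# Hypothetical toy network: 12 attractions with random pairwise CAI values.
names = [f"P{i}" for i in range(12)]
toy_rng = random.Random(42)
toy_cai = {frozenset((a, b)): toy_rng.random()
           for i, a in enumerate(names) for b in names[i + 1:]}

chain, score = anneal_chain("P0", names, toy_cai)
```

With a cooling rate of 0.95 the temperature becomes negligible after a few hundred iterations, so the later iterations behave like greedy local search around the best chain found during the exploratory phase.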

Table 9 lists the CAI between adjacent scenic spots within this TSC, together with its ranking among all potential options, and shows how the chain integrates the multi-dimensional associations. Centered on the Yellow Crane Tower, the route exhibits pronounced geographic proximity, exemplified by connections such as “Yellow Crane Tower → Shouyi Park” and “Turtle Mountain → Tiejiamen Pass” (TieMenGuan in Table 9), which demonstrate excellent geographical accessibility. Approximately half of the adjacent attraction pairs exhibit notable resonance correlation, while several segments show high thematic sentiment similarity, such as “Shouyi Park → Ziyang Park” and “Jianghan Road Pedestrian Street → Hankou Main Street” (Hanzheng Street in Table 9). This coherence is expected to foster psychological engagement and identification among tourists.

Unlike routes based solely on one type of association, this itinerary integrates multi-dimensional CAI metrics, offering tourists a rich tapestry of interconnected experiences at various stages. Starting from the Yellow Crane Tower, the TSC progressively transitions from historical gravitas to nostalgic commercial reminiscence. The route strikes a balance between well-known and niche attractions, encompassing iconic landmarks like the Yellow Crane Tower and Jianghan Road alongside lesser-known but culturally significant sites such as Tiejiamen Pass and Turtle Mountain. The itinerary is structured logically, flowing through historical landmarks → park green spaces → mountain vistas → ancient city gates → commercial streets, creating a coherent and engaging tourist pathway with distinct thematic transitions.

To assess the efficacy of the proposed TSC in accounting for route distances between attractions, the intensity of tourist associative connections, and the smooth transition of thematic elements, a comparative analysis was conducted. The adjacent attractions recommended by the TSC were juxtaposed against those suggested by leading tourism platforms, specifically Qunar (https://flight.qunar.com/) and Mafengwo (https://www.mafengwo.cn/). This comparison aimed to validate the TSC’s capability to optimize itinerary planning by balancing geographical contiguity, cognitive resonance among tourists, and thematic coherence across different attractions.

To evaluate geographical accessibility, we compared the distances between adjacent attractions recommended by the TSC with those recommended by Qunar and Mafengwo (Figure 13); shorter distances indicate higher accessibility. The TSC-recommended scenic spots have an average distance of 2435 m, compared with 4472 m for Qunar and 2814 m for Mafengwo. Furthermore, 69.38% of the TSC-recommended scenic spots fall within 3 km, against 24.68% for Qunar and 59.23% for Mafengwo, and nearly 86% fall within 5 km. These findings indicate that the TSC offers better geographical accessibility than both platforms, most clearly relative to Qunar, which improves spatial efficiency and makes the recommended itineraries more feasible for tourists.

To assess thematic coherence, we compared the differences between the theme sentiment feature vectors of adjacent scenic spots recommended by the TSC and by the Qunar and Mafengwo platforms, measured as Jensen–Shannon (JS) divergence (Figure 14); smaller values reflect greater thematic coherence. The TSC achieves an average divergence of 0.1235, lower than Qunar’s 0.2047 and Mafengwo’s 0.1983. Furthermore, 73.71% of TSC-recommended spot pairs exhibit a JS divergence below 0.15, compared with 36.96% for Qunar and 39.80% for Mafengwo. These results indicate that the TSC outperforms both platforms in maintaining thematic coherence across itineraries. The gradual and stable evolution of theme sentiment features along the TSC supports a cohesive and immersive tourist experience, enhancing experiential coherence and emotional engagement throughout the journey.
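The JS divergence used in this comparison can be computed as sketched below. The logarithm base (2, which bounds the divergence to [0, 1]), the normalization of each vector to sum to 1, and the example vectors are assumptions; the paper's theme sentiment vectors may be constructed and scaled differently.

```python
import math

def js_divergence(p, q, base=2.0):
    """Jensen-Shannon divergence between two non-negative vectors of equal length."""
    sp, sq = sum(p), sum(q)
    p = [x / sp for x in p]                      # normalize to distributions
    q = [x / sq for x in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]      # mixture distribution

    def kl(a, b):
        # Terms with a_i = 0 contribute nothing; b_i > 0 whenever a_i > 0.
        return sum(x * math.log(x / y, base) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def mean_adjacent_divergence(vectors):
    """Average JS divergence over consecutive attractions in a chain."""
    pairs = list(zip(vectors, vectors[1:]))
    return sum(js_divergence(a, b) for a, b in pairs) / len(pairs)

# Hypothetical theme sentiment vectors for a three-stop chain.
chain_vectors = [
    [0.50, 0.30, 0.20],
    [0.45, 0.35, 0.20],
    [0.20, 0.30, 0.50],
]
average = mean_adjacent_divergence(chain_vectors)
share_below = sum(
    js_divergence(a, b) < 0.15 for a, b in zip(chain_vectors, chain_vectors[1:])
) / (len(chain_vectors) - 1)
```

The two summary quantities correspond to the statistics reported for each platform: the mean divergence over adjacent pairs and the share of pairs below the 0.15 threshold.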

In conclusion, the TSC proposed in this study exhibits substantial improvements in geographical accessibility and thematic coherence of recommended attractions by integrating multiple association dimensions. In contrast to the Qunar platform, which tends to generate homogeneous recommendations dominated by popular attractions, and Mafengwo, whose overreliance on spatial adjacency weakens thematic relevance, the TSC effectively balances spatial proximity and thematic–emotional similarity. This dual optimization ensures both logistical efficiency and experiential coherence in travel itineraries. By enabling smooth transitions across both geographic and affective dimensions, the TSC provides a novel and effective solution to overcome the limitations of conventional tourism recommendation systems, paving the way for more intelligent, user-centered travel planning frameworks.

5. Conclusions

The TSC model proposed in this paper offers a new approach to tourism route planning by integrating spatial and semantic multi-dimensional associations. Traditional recommender systems typically rely on a single dimension, such as spatial proximity or popularity. In contrast, the TSC model ensures the accessibility of the itinerary through the spatial proximity network and uses text mining to capture tourists’ deeper needs regarding the themes, sentiment, and experience of attractions, so that the resulting routes are consistent with both geographic logic and tourists’ psychological expectations. This multi-constraint mechanism not only makes the recommendation results better grounded but also provides a basis for cooperation among attractions: neighboring attractions can design linked products around complementary themes, creating synergy within the regional tourism ecosystem.

Despite its contributions, the current study has several limitations that warrant further investigation: the characterization of tourist attraction entity features is incomplete, the construction of associations is only partly automated, and the TSC does not yet model temporal dynamics. To address these limitations, future research should prioritize expanding data sources through multi-modal integration of scenic spot entities with rich human mobility and behavioral datasets. Deep learning architectures and large language models could further improve feature extraction and semantic understanding. Real-time data streams could enable dynamic weight adjustment of sentiment chains, in line with the growing demand for personalized and fine-grained tourism management. Particular emphasis should be placed on temporal modeling of scenic spot semantics to strengthen both the cultural sensitivity and the time relevance of sentiment chain representations.

Author Contributions

Conceptualization, R.L. and J.W.; methodology, B.L.; validation, B.L. and J.W.; formal analysis, B.L.; resources, R.L.; data curation, A.S.; writing—original draft preparation, B.L.; writing—review and editing, R.L. and J.W.; visualization, B.L.; supervision, R.L. and A.S.; project administration, R.L. and A.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China under Grant No. U20A2091.

Data Availability Statement

The datasets used in this study are publicly available and have been cited appropriately in the References section. No new data were generated or collected during this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Qu, F.; Ji, K.M.; Morgan, N. The iterative innovation of tourism digital platforms: A new framework. Curr. Issues Tour. 2024, 1–8.
Liu, Y.; Hsiao, A.; Ma, E. Segmenting Tourism Markets Based on Demand Growth Patterns: A Longitudinal Profile Analysis Approach. J. Hosp. Tour. Res. 2021, 45, 967–997.
Hua, Z. A Study on the Management Model of Smart Tourism Industry under the Era of Big Data. In Proceedings of the 2018 International Conference on Information Science and System (ICISS 2018), Jeju, Republic of Korea, 27–29 April 2018; pp. 102–106.
Qi, S.; Chen, N. Understanding Macao’s Destination Image through User-generated Content. J. China Tour. Res. 2019, 15, 503–519.
Minhui, C.; Kim, N.J. An Analysis on the Relationship between the Image of Tourism Destinations, and the Perceived Usefulness of Online Tourism Information Using Data Mining Techniques. J. Tour. Leis. Res. 2022, 34, 5–28.
Xu, H.; Lv, Y. Mining and Application of Tourism Online Review Text Based on Natural Language Processing and Text Classification Technology. Wirel. Commun. Mob. Comput. 2022, 2022, 9905114.
Li, Q.; Li, S.; Zhang, S.; Hu, J.; Hu, J. A Review of Text Corpus-Based Tourism Big Data Mining. Appl. Sci. 2019, 9, 3300.
Yu, Y. Analysis and Study on Intelligent Tourism Route Planning Scheme Based on Weighted Mining Algorithm. Sci. Program. 2022, 2022, 5495822.
Chang, L.; Sun, W.; Zhang, W.; Bin, C.; Gu, T. Review of tourism route planning. CAAI Trans. Intell. Syst. 2019, 14, 82–92.
Xu, D.; Deng, L.; Fan, X.; Ye, Q. Influence of travel distance and travel experience on travelers’ online reviews: Price as a moderator. Ind. Manag. Data Syst. 2022, 122, 942–962.
Xue, F.; Dong, L.; Gao, B.; Yu, Z.; Taras, V. Understanding the relationships between distances and herd behavior in online reviews: The moderating effects of hospitality experience. Int. J. Contemp. Hosp. Manag. 2020, 32, 3295–3314.
Chen, F.-W.; Guevara Plaza, A.; Alarcon Urbistondo, P. Automatically extracting tourism-related opinion from Chinese social media. Curr. Issues Tour. 2017, 20, 1070–1087.
Laily, I.L.; Budi, I.; Santoso, A.B.; Putra, P.K. Mining Indonesia Tourism’s Reviews to Evaluate the Services Through Multilabel Classification and LDA. In Proceedings of the 2020 International Conference on Electrical Engineering and Informatics (ICELTICS 2020), Aceh, Indonesia, 27–28 October 2020; IEEE: New York, NY, USA, 2020; pp. 70–76.
Lee, H.; Kang, Y. Mining tourists’ destinations and preferences through LSTM-based text classification and spatial clustering using Flickr data. Spat. Inf. Res. 2021, 29, 825–839.
Wang, Z.; Udomwong, P.; Fu, J.; Onpium, P. Destination image analysis and marketing strategies in emerging panda tourism: A cross-cultural perspective. Cogent Bus. Manag. 2024, 11, 2364837.
Gour, A.; Aggarwal, S.; Erdem, M. Reading between the lines: Analyzing online reviews by using a multi-method Web-analytics approach. Int. J. Contemp. Hosp. Manag. 2021, 33, 490–512.
Mishra, R.K.; Urolagin, S.; Jothi, J.A.A.; Neogi, A.S.; Nawaz, N. Deep Learning-based Sentiment Analysis and Topic Modeling on Tourism During COVID-19 Pandemic. Front. Comput. Sci. 2021, 3, 775386.
Ren, G.; Hong, T. Investigating Online Destination Images Using a Topic-Based Sentiment Analysis Approach. Sustainability 2017, 9, 1765.
Li, Q.; Li, S.; Hu, J.; Zhang, S.; Hu, J. Tourism Review Sentiment Classification Using a Bidirectional Recurrent Neural Network with an Attention Mechanism and Topic-Enriched Word Vectors. Sustainability 2018, 10, 3313.
Fu, M.; Pan, L. Sentiment Analysis of Tourist Scenic Spots Internet Comments Based on LSTM. Math. Probl. Eng. 2022, 2022, 5944954.
Mou, T.; Wang, H. Online comments of tourist attractions combining artificial intelligence text mining model and attention mechanism. Sci. Rep. 2025, 15, 1121.
Lin, P.; Chen, L.; Luo, Z. Analysis of Tourism Experience in Haizhu National Wetland Park Based on Web Text. Sustainability 2022, 14, 3011.
Wang, Y.; Luo, P.; Liu, Y.; Zheng, W. The Study on Tourist Satisfaction of Meizhou Island Based on Network Comments. J. Fujian Norm. Univ. Nat. Sci. Ed. 2018, 34, 83–92.
Tu, S.-F.; Hsu, C.-S.; Lu, Y.-T. Improving RE-SWOT Analysis with Sentiment Classification: A Case Study of Travel Agencies. Future Internet 2021, 13, 226.
Alaei, A.; Wang, Y.; Bui, V.; Stantic, B. Target-Oriented Data Annotation for Emotion and Sentiment Analysis in Tourism Related Social Media Data. Future Internet 2023, 15, 150.
Huang, F.; Xu, J.; Weng, J. Multi-Task Travel Route Planning with a Flexible Deep Learning Framework. IEEE Trans. Intell. Transp. Syst. 2021, 22, 3907–3918.
Tlili, T.; Krichen, S. A simulated annealing-based recommender system for solving the tourist trip design problem. Expert Syst. Appl. 2021, 186, 115723.
Aoyagi, S.; Le, Y.; Shimizu, T.; Takahashi, K. Mobile Application to Provide Traffic Congestion Estimates and Tourism Spots to Promote Additional Stopovers. Future Internet 2020, 12, 83.
Zhou, X.; Peng, J.; Wen, B.; Su, M. Navigation Route Planning for Tourism Intelligent Connected Vehicle Based on the Symmetrical Spatial Clustering and Improved Fruit Fly Optimization Algorithm. Symmetry 2024, 16, 159.
Liang, K.; Liu, H.; Shan, M.; Zhao, J.; Li, X.; Zhou, L. Enhancing scenic recommendation and tour route personalization in tourism using UGC text mining. Appl. Intell. 2024, 54, 1063–1098.
Hu, G.; Qin, Y.; Shao, J. Personalized travel route recommendation from multi-source social media data. Multimed. Tools Appl. 2020, 79, 33365–33380.
Ravish, R.; Rangaswamy, S.; Arpitha, V.; Vasuprada, U. User preference-based intelligent road route recommendation using SARSA and dynamic programming. J. Control Decis. 2023, 10, 443–453.
Potsiou, C.; Ioannidis, C.; Soile, S.; Boutsi, A.-M.; Chliverou, R.; Apostolopoulos, K.; Gkeli, M.; Bourexis, F. Geospatial Tool Development for the Management of Historical Hiking Trails-The Case of the Holy Site of Meteora. Land 2023, 12, 1530.
Daniel, C.B.; Manju, V.S. WebGIS enabled route planning system for tourists-a case study. In Proceedings of the Emerging Trends in Engineering, Science and Technology for Society, Energy and Environment, Thrissur, India, 18–20 January 2018; pp. 157–163.
Santamaria-Granados, L.; Mendoza-Moreno, J.F.; Ramirez-Gonzalez, G. Tourist Recommender Systems Based on Emotion Recognition—A Scientometric Review. Future Internet 2021, 13, 2.
Chen, J.; Huang, L.; Wang, C.; Zheng, N. Discovering Travel Spatiotemporal Pattern Based on Sequential Events Similarity. Complexity 2020, 2020, 6632956.
Hua, Z. An RFID-Enabled IoT-Based Smart Tourist Route Recommendation Algorithm. Mob. Inf. Syst. 2022, 2022, 9866086.
Fan, Q.H. Study on travel route recommendation method based on improved ant colony optimisation algorithms. Int. J. Comput. Appl. Technol. 2024, 74, 107–114.
Jewpanya, P.; Nuangpirom, P.; Pitjamit, S.; Nakkiew, W. Optimized Travel Itineraries: Combining Mandatory Visits and Personalized Activities. Algorithms 2025, 18, 110.
Roy, S.; Map, A. High-Speed Rail Station Location Optimization Using Customized Utility Functions. IEEE Intell. Transp. Syst. Mag. 2023, 15, 26–35.
Ren, X. Application of Apriori Association Rules Algorithm to Data Mining Technology to Mining E-commerce Potential Customers. In Proceedings of the IWCMC 2021: 2021 17th International Wireless Communications & Mobile Computing Conference (IWCMC), Harbin, China, 28 June–2 July 2021; IEEE: New York, NY, USA, 2021; pp. 1193–1196.
Masa, P.; Rauch, J. Enhanced Association Rules and Python. In Proceedings of the Machine Learning, Optimization, and Data Science, LOD 2022, PT II, Certosa di Pontignano, Italy, 18–22 September 2022; pp. 123–138.
Chen, P. Effects of the entropy weight on TOPSIS. Expert Syst. Appl. 2021, 168, 114186.
Wu, J.; Wang, X.; Feng, F.; He, X.; Chen, L.; Lian, J.; Xie, X. Self-supervised Graph Learning for Recommendation. In Proceedings of the SIGIR '21: 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual, 11–15 July 2021; pp. 726–735.
Güzel, F.Ö. The Dimensions of Tour Experience, Emotional Arousal, and Post-experience Behaviors: A Research on Pamukkale in Turkey. Procedia Soc. Behav. Sci. 2014, 150, 521–530.
China Academy of Urban Planning & Design and Baidu Maps, 2024 Annual Report on Commuting Monitoring of Major Cities in China. 2024. Available online: https://huiyan.baidu.com/boswebsite/cms/report/2024tongqin/ (accessed on 17 June 2025).
Long, Y.; Wang, X.; Zhou, Z.; Wang, R.; Yi, H. Research on Text Time Window Partition Based on LDA Model. Sci. Focus 2024, 19, 34–45.
Wan, X.; Liu, S.; Wu, Q. Spatial-temporal Differences of Scenic Spot Development in Wuhan Metropolitan Area. Resour. Environ. Yangtze Basin 2013, 22, 1426–1432.

Figure 1. Flowchart of the research methodology.

Figure 2. Flowchart of fine-grained semantic parsing method.

Figure 3. Tourism Sentiment Chain generation method.

Figure 4. Study region and objects.

Figure 5. Distribution of travel review text data. (a) Distribution of the number of reviews for tourist attractions. (b) Distribution of review text length.

Figure 6. The proportion of theme categories of all attractions.

Figure 7. Thematic distribution vector of the top nine attractions.

Figure 8. Sentiment distribution across theme categories for all attractions.

Figure 9. ESIs of the top nine attractions.

Figure 10. Diagrams of Single-Dimension Association Network. (a) Spatial Proximity Network. (b) Resonance Correlation Network. (c) Thematic-Sentiment Network.

Figure 11. Example of CAI association.

Figure 12. Tourism Sentiment Chain of Yellow Crane Tower.

Figure 13. Road distance comparison of recommended attractions.

Figure 14. JS divergence comparison of recommended attractions.

Table 1. Evaluation of characteristic word entity extraction accuracy.

Characteristic Word Type	Precision	Recall	F1-Score	Support
Object	0.88	0.86	0.87	799
Property	0.94	0.91	0.92	573
Degree	0.98	0.94	0.96	68
Action	0.94	0.84	0.89	295
Other	0.89	0.81	0.85	21
micro avg	0.91	0.88	0.89	1756
macro avg	0.93	0.87	0.90	1756
weighted avg	0.91	0.88	0.89	1756

Table 2. Accuracy assessment of viewpoint phrase entity extraction.

Attraction Dataset	Number of Reviews	Precision	Recall	F1-Score	Support
WS Hot Snow Miracle	149	0.9116	0.9538	0.9322	227
East Lake Eye	85	0.9024	0.9652	0.9327	111
Dulu Dulu Cute Pet Science Theme Park	121	0.9119	0.9565	0.9337	176
Guiyuan Temple	116	0.8442	0.9489	0.8935	130
Hubei Art Museum	39	0.8698	0.7843	0.8248	40
micro avg		0.8941	0.9434	0.9181	684
macro avg		0.8880	0.9217	0.9034	684
weighted avg		0.8949	0.9455	0.9190	684

Table 3. Evaluation of viewpoint phrase thematic classification accuracy.

Thematic Classification	Precision	Recall	F1-Score	Support
Transportation and Travel	0.9134	0.9045	0.9089	618
Environment and Landscape	0.9250	0.9530	0.9388	5223
Services and Facilities	0.9044	0.8749	0.8894	2390
Food and Shopping	0.8971	0.8824	0.8897	425
Price and Consumption	0.9651	0.9651	0.9651	1779
Crowd Flow and Density	0.9355	0.9348	0.9352	1211
History and Culture	0.8346	0.7902	0.8118	696
Family and Education	0.9031	0.9417	0.9220	1029
Animals and Plants	0.8994	0.8824	0.8908	1216
Leisure and Entertainment	0.9183	0.9032	0.9107	3036
micro avg			0.9184	17,629
macro avg	0.9096	0.9032	0.9062	17,629
weighted avg	0.9181	0.9184	0.9181	17,629

Table 4. Comparison of network structural features under different edge weights.

Evaluation Metrics	Spatial Proximity	Resonance Correlation	Thematic Similarity
Node count	178	178	178
Edge count	2110	836	10,684
Modularity	0.6093	0.6449	0.2471
Connectivity	No/12	No/23	Yes
Average degree centrality	0.1339	0.0531	0.6782
Average closeness centrality	0.1765	0.2101	0.5808
Average betweenness centrality	0.0086	0.0119	0.0043
Global clustering coefficient	0.6042	0.3430	0.6642
Power-law exponent of degree distribution	0.4618	−0.0517	0.2186
Average shortest path length	3.3723	3.7341	1.7573
Network diameter	11	8	4

Table 5. The top 10 attractions in terms of composite influence score in the spatial proximity network.

Attraction	Composite Influence Score
HongShan Square	0.7257
Chu River and Han Street	0.6765
Hubei Provincial Library	0.6713
Han Show Theater	0.6484
DaYu Mythical Garden	0.6172
HuaLou Street	0.6067
ZhongShan Avenue	0.6047
TieMenGuan	0.5964
WS Snow Wonder	0.5929
GuQin Terrace	0.5851

Table 6. The top 10 attractions in terms of composite influence score in the resonance correlation network.

Attraction	Composite Influence Score
Hankou Riverside	0.6993
East Lake	0.5984
Jianghan Road Pedestrian Street	0.5903
GuiYuan Temple	0.5568
East Lake Beach Scenic Area	0.5509
Wuhan Zoo	0.5340
GuDe Temple	0.5233
LiHuangPi Road	0.4983
Turtle Mountain	0.4880
East Lake Greenway	0.4738

Table 7. The top 10 attractions in terms of composite influence score in the thematic sentiment network.

Attraction	Composite Influence Score
Wuhan Garden Expo Park	2.1145
JiuZhen Mountain	2.0671
Hankou Riverside	2.0536
Happy Jungle Cherry Blossom Theme Park	1.9830
ZhongShan Park	1.9769
Hanyang Riverside	1.9606
JinLiGou	1.9485
YaoJia Mountain	1.9355
ZiYang Park	1.9264
Yellow Crane Tower	1.9260

Table 8. Structural characteristics of the CAI association network.

Evaluation Metric	Value	Evaluation Metric	Value
Node count	178	Edge count	2577
Modularity	0.5540	Connectivity	Yes
Average degree centrality	0.1636	Average closeness centrality	0.3476
Average betweenness centrality	0.0106	Global clustering coefficient	0.5232
Power-law exponent of degree distribution	0.4143	Average shortest path length	2.8344
Network diameter	8

Table 9. Ranking of correlation strength of the TSC starting from the Yellow Crane Tower.

Preceding Attraction	Next Attraction	CAI, Percentile Rank	Dominant Force	Weight [SP, RC, TS]	Percentile Rank [SP, RC, TS]
Yellow Crane Tower	ShouYi Park	0.0412, 44%	TS	[0.2409, 0.0064, 0.2980]	[16%, 75%, 62%]
ShouYi Park	ZiYang Park	0.1027, 10%	TS	[0.1296, 0, 0.5840]	[34%, -, 15%]
ZiYang Park	She Mountain	0.1838, 0%	RC	[0.1353, 0.5000, 0]	[26%, 7%, -]
She Mountain	Turtle Mountain	0.1439, 9%	TS	[0, 0.3333, 0.4939]	[-, 24%, 27%]
Turtle Mountain	TieMenGuan	0.0986, 18%	SP	[1.0, 0.0667, 0.1464]	[0%, 57%, 75%]
TieMenGuan	Jianghan Road Pedestrian Street	0.0070, 91%	SP	[0.0958, 0, 0]	[50%, -, -]
Jianghan Road Pedestrian Street	Hanzheng Street	0.0385, 47%	TS	[0, 0.0198, 0.9122]	[-, 76%, 1%]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

