Article

Integrating Image Recognition, Sentiment Analysis, and UWB Tracking for Urban Heritage Tourism: A Multimodal Case Study in Macau

School of Architecture and Urban Planning, Shenzhen University, Shenzhen 518060, China
*
Author to whom correspondence should be addressed.
Sustainability 2025, 17(17), 7573; https://doi.org/10.3390/su17177573
Submission received: 20 July 2025 / Revised: 13 August 2025 / Accepted: 19 August 2025 / Published: 22 August 2025

Abstract

Amid growing demands for heritage conservation and precision urban governance, this study proposes a multimodal framework to analyze tourist perception and behavior in Macau’s Historic Centre. We integrate geotagged social media images and text, ultra-wideband (UWB) pedestrian trajectories, and a LiDAR-derived 3D digital twin to examine the interplay among spatial configuration, movement, and affect. Visual content in tourist photos is classified with You Only Look Once (YOLOv8), and sentiment polarity in Weibo posts is estimated with a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) model. UWB data provide fine-grained trajectories, and all modalities are georeferenced within the digital twin. Results indicate that iconic landmarks concentrate visual attention, pedestrian density, and positive sentiment, whereas peripheral sites show lower footfall yet strong emotional resonance. We further identify three coupling typologies that differentiate tourist experiences across spatial contexts. The study advances multimodal research on historic urban centers by delivering a reproducible framework that aligns image, text, and trajectory data to extract microscale patterns. Theoretically, it elucidates how spatial configuration, movement intensity, and affective expression co-produce experiential quality. Using Macau’s Historic Centre as an empirical testbed, the findings inform heritage revitalization, wayfinding, and crowd-management strategies.

1. Introduction

Historic districts—vital repositories of urban cultural heritage—embody layered historical memory and symbolic meaning while functioning as arenas for public governance and spatial regeneration [1]. Under rapid urbanization and rising tourist inflows, these areas face a dual challenge: preserving heritage authenticity while accommodating evolving functional demands [2]. Balancing integrity with everyday usability has therefore become a central concern in contemporary urban governance.
The United Nations’ 2030 Agenda for Sustainable Development calls for strengthened heritage protection and enhanced urban resilience and inclusiveness, and it highlights the critical role of data-driven innovation in enabling precision urban governance [3]. Global initiatives, such as WalkNYC in New York and the pedestrian network in Seoul, demonstrate that optimizing tourist spatial experience is increasingly central to both heritage conservation and city management [4]. These efforts underscore that enhancing tourist experiences is not merely a technical concern but also a strategic intervention in public governance and identity-making.
From the perspectives of tourism behavior and environmental psychology, tourist experience is multidimensional, intertwining cognition and emotion. Traditional surveys suffer from sampling bias, limited representativeness, and weak temporal sensitivity, making them inadequate for real-time perceptions and fine-scale mobility [5]. With social media proliferation and AI advances, user-generated content (UGC)—including images, text, and geotagged metadata—enables analysis of sentiments, visual attention, and cultural identification through computer vision and natural-language processing [6,7].
The Historic Centre of Macau, inscribed as a UNESCO World Heritage Site in 2005, is distinguished by its dense urban morphology and rich intercultural architectural heritage, reflecting a unique East–West synthesis. Landmark sites such as the Company of Jesus Square, A-Ma Temple, and Senado Square function not only as core tourist destinations but also as hybridized cultural, religious, and everyday spaces. However, the district’s average daily tourist volume far exceeds its resident population, resulting in both urban vitality and spatial saturation [8]. While existing studies have focused on macro-level tourism patterns or specific data types, fine-grained micro-spatial research that integrates diverse data sources is still lacking [9].
Previous research has addressed tourism experiences, movement trajectories, and cultural perception within historic districts. However, these studies have largely operated at the city or site level and have predominantly relied on single-source data, such as text or trajectory datasets. As a result, they fall short of uncovering the complex interdependencies between multimodal data, affective responses, and spatial structure [10,11,12].
To fill this gap, the present study proposes an integrated analytical framework combining image recognition, sentiment analysis of textual content, UWB-based movement tracking, and digital twin modeling. This research addresses the following key questions:
  • How can multimodal social media data be leveraged to accurately characterize tourists’ spatial behaviors and emotional responses in historic districts?
  • Are there identifiable patterns of coupling among spatial structure, movement behavior, and emotional expression?
  • How can such empirical insights be translated into actionable strategies for heritage protection and urban governance?
The primary aim of this study is to contribute to the understanding of tourist behavior and perception in historic urban centers through a multimodal, microscale framework. The methodological aim is to operationalize cross-modal alignment (vision–text–trajectories) in a geospatially registered digital twin with transparent thresholds for pattern extraction. The practical aim is to translate the empirical insights into actionable guidance for heritage conservation and precision urban governance, with Macau serving as the empirical testbed for assessing transferability.

2. Literature Review

Since its introduction by Lynch, spatial image theory has provided a foundation for urban spatial cognition. Individuals construct cognitive maps by perceiving landmarks, paths, edges, nodes, and districts. In tourism contexts, these elements shape spatial experience, influence movement trajectories, and affect emotional responses [13]. Over time, integration with space syntax and behavioral geography has supplied robust tools for understanding tourist flows, spatial preferences, and perceptual mapping [14].
Traditional surveys capture declared preferences but struggle with temporal sensitivity and micro-spatial granularity. In contrast, social media footprints and fine-scale tracking provide continuous, in-situ traces of how tourists see, feel, and move, especially in congested heritage settings where small-footprint design affects comfort and wayfinding. This motivates a multimodal approach that triangulates visual semantics, sentiment, and trajectories to uncover the mechanisms behind perceived quality and crowding. Recent studies have moved from single-modality mining (texts or images) to multimodal fusion that aligns visual semantics with linguistic sentiment and spatial footprints. This shift enables fine-grained characterization of the seeing–feeling–moving process and partly mitigates sampling biases inherent in conventional surveys. Nevertheless, gaps remain in standardized cross-modal alignment, microscale georegistration to the streetscape, and robust validation against in situ behavioral measurements.
The proliferation of social media platforms—including Weibo, Instagram, and TripAdvisor—has dramatically transformed the way tourists document and share their experiences. These platforms generate large volumes of geo-tagged UGC, providing rich multimodal data for investigating fine-grained behavioral and perceptual patterns [15]. Earlier studies often focused on single-modality analyses, such as text-based sentiment mining or image recognition, to identify spatial hotspots and tourist preferences. However, recent advances in deep learning have fostered a shift toward multimodal data fusion [16]. For example, the YOLO framework enables efficient extraction of visual elements such as landscapes, crowds, and urban features from images [17]. Meanwhile, sentiment classification models based on BERT (Bidirectional Encoder Representations from Transformers) offer high accuracy in detecting affective tendencies and value judgments in textual content [18,19]. The integration of these models makes it possible to interpret tourist behaviors and emotional states within a shared semantic framework, thereby revealing the complex interplay between visual perception, linguistic expression, and spatial behavior [20].
Substantial progress spans urban tourism experience, emotional geography, and cultural-landmark preferences [21]. For example, Hu mapped tourist mobility networks using graph-based models [22]; Lin examined temporal patterns via spatial clustering and Markov chains [23]; Van der Zee applied social-network analysis to interest-based connections [24]; Miah developed a multi-source tourism decision-support system [25]. These studies show that social media data capture spatial behavioral patterns and help construct emotional linkages between tourists and urban environments [26].
In parallel, digital twin technologies expand urban simulation and governance capacities [27]. By integrating LiDAR point clouds, real-time trajectory sensing, and environmental monitoring, such platforms build high-fidelity 3D models to simulate dynamic scenarios driven by behavioral inputs [28]. In heritage conservation and tourism management, they support virtual reconstruction, scenario testing, and emergency simulation, thereby enhancing responsiveness and decision-making. Linking physical and digital spaces via social media data further deepens understanding of tourist interactions across hybrid environments [29].
Building on spatial cognition and space syntax, researchers increasingly analyze how visibility, network accessibility, and landmark salience shape pedestrian densities and affective appraisals. Yet few frameworks provide quantitative, triadic coupling among visual semantics (what is seen), stay behavior (where and how people linger), and sentiment expression (how it feels) at site or corridor scales. Addressing this gap requires jointly modeled indicators and consistent spatial units for comparison.
Beyond descriptive hotspot mapping, multimodal social sensing—integrating computer vision from user-generated images, sentiment signals from short texts, and fine-grained pedestrian trajectories—can operationalize carrying-capacity management, promote equitable redistribution of visitor flows, and sustain cultural vitality at the streetscape scale [30,31,32]. Nevertheless, methodological and theoretical limitations persist, and the integration of social sensing—UGC-derived perception and affect—remains underdeveloped, limiting the actionability of digital twin interventions at the microscale. First, multimodal integration often remains parallel rather than semantically aligned and geospatially registered [33,34]. Second, few studies quantitatively link subjective experiences in emotional geography with urban form and behavioral trajectories [35,36]. Third, most work stays at macro-scales, neglecting micro-scale dynamics of behavior, spatial imagery, and cultural perception at block or streetscape levels [37,38]. Addressing these gaps requires a unified framework that integrates image recognition, emotional semantics, behavioral tracking, and 3D spatial modeling to advance dialogue among tourism geography, perceptual urbanism, and spatial cognition [39].

3. Materials and Methods

3.1. Study Area

This study focuses on eight urban squares in Macau—Barra Square, St. Dominic’s Square, Lilau Square, Cathedral Square, St. Augustine’s Square, Company of Jesus Square, Senado Square, and Camoes Square—that are inscribed as part of the UNESCO World Heritage Site (Figure 1). Each square accommodates a mix of urban functions—religious, commercial, residential, and public—and is characterized by dense spatial configurations and complex morphology. Their cultural hybridity and spatial diversity make them exemplary settings for examining tourist perceptions and behavioral patterns through multimodal data. The selected squares share three notable attributes: (1) distinct variations in tourist visitation density, forming zones of high, medium, and low vitality; (2) diverse spatial typologies, including religious sites, commercial districts, green areas, and key urban nodes; and (3) substantial representation in social media activity, enabling robust digital data collection.

3.2. Data Sources

We integrate three primary data types: geotagged social media content (text and images), UWB pedestrian trajectories, and LiDAR-based 3D spatial models (Figure 2).
  • Social Media Data: Using the Weibo Open Platform API, we collected geotagged posts within the Historic Centre of Macau from March 2023 to March 2024. After manual de-duplication (hash + near-duplicate screening), semantic relevance filtering, and location validity checks within the study polygon, 7932 text posts and 1910 images were retained for analysis.
  • Tourist Trajectory Data: UWB anchors were strategically deployed at eight key sites. A total of 150 volunteers with diverse demographic backgrounds participated in trajectory tracking experiments conducted on both public holidays and regular weekdays. Each session lasted 10–30 min with a 10 Hz sampling rate, resulting in high-resolution, high-frequency movement traces capturing a wide variety of behavioral paths.
  • 3D Spatial Modeling Data: High-precision point cloud data of streets and open spaces within the historic district were captured using the Leica ScanStation P20 terrestrial laser scanner (accuracy: 3 mm at 10 m). The collected data underwent stitching, noise filtering, and mesh generation in modeling software, producing a comprehensive 3D digital twin base map that supports the alignment and visualization of multimodal data sources.
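The post-filtering steps described for the social media data can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the dictionary keys (`text`, `lon`, `lat`), and the whitespace and case normalization are assumptions made for illustration. It covers only exact-duplicate hashing and the location check against a study polygon; near-duplicate screening and semantic relevance filtering are not shown.

```python
import hashlib


def text_fingerprint(text):
    """Hash of lowercased, whitespace-normalized text (assumed normalization)."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Check whether a horizontal ray from the point crosses this edge.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def filter_posts(posts, polygon):
    """Keep the first copy of each post whose geotag lies inside the polygon."""
    seen = set()
    kept = []
    for post in posts:
        if not point_in_polygon(post["lon"], post["lat"], polygon):
            continue
        fingerprint = text_fingerprint(post["text"])
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        kept.append(post)
    return kept
```

For planar coordinates over an area as small as a historic district, a ray-casting test is usually adequate; a production workflow would more likely rely on a GIS library or a spatial database for the polygon check.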

3.3. Method

3.3.1. Social Media Data Processing

Social media data were processed in two parts: image recognition and sentiment analysis.
For image recognition, YOLOv8 was used for object detection. We drafted a candidate ontology informed by prior studies and a locally curated inventory of salient elements in Macau’s Historic Centre. The initial set comprised nine collectively exhaustive classes: Historical Architecture, Modern Architecture, Public Facilities, Environmental Landscape, Transportation, Commercial Facilities, Food and Beverage, Cultural Facilities, and Urban Ecology (Figure 3 and Figure 4). The classification framework draws on Lynch’s seminal insights into the recognition of urban elements, Conzen’s town-plan analysis regarding the functional division of buildings and land use, and Batty’s perspectives on urban digitalization and multi-component spatial integration. By integrating morphological characteristics with functional attributes, the scheme captures historical and cultural significance as well as modern service and ecological values, providing a multidimensional structure for analyzing urban landscapes in tourist perception. This ontology was implemented in a pre-trained YOLO-based image recognition model, and manual inspection was conducted to verify the consistency and exclusivity of category assignments across the dataset.
For sentiment analysis, a BERT model was fine-tuned on tourism-related Weibo texts. Posts were labeled into three sentiment categories: positive, neutral, and negative. Topics included food, service quality, landmarks, and urban atmosphere. Sentiment results were integrated with the image classification outputs and stored in a PostgreSQL database for subsequent spatial analysis and cross-modal coupling [40].

3.3.2. UWB Trajectory Experiment Design

The UWB trajectory experiment consisted of four stages: anchor deployment, device testing, sampling execution, and data preprocessing. (1) Anchor Deployment: Five UWB anchors were placed at each of the eight squares based on field visibility and occlusion analysis, ensuring that every point was covered by at least three anchors. (2) Device Testing: Static signal tests and error calibration were conducted prior to sampling to filter out occlusion-induced noise. (3) Sampling Execution: All participants provided informed consent, and the trajectory data were anonymized. (4) Data Preprocessing: Noise points were removed, and trajectory smoothing was applied. The cleaned data were converted into standard GIS formats for spatial analysis.
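The text does not specify which noise-removal or smoothing method was applied. As one plausible illustration, the sketch below removes fixes that imply an implausible walking speed and then applies a centered moving average. The speed gate of 3.0 m/s, the window of 5 samples, and the `(t, x, y)` tuple layout are assumptions, not values reported in the study.

```python
import math


def remove_speed_outliers(samples, max_speed=3.0):
    """Drop fixes that imply an implausible speed from the last kept fix.

    samples: list of (t_seconds, x_m, y_m) tuples sorted by time.
    max_speed: assumed upper bound on pedestrian speed in m/s.
    """
    if not samples:
        return []
    kept = [samples[0]]
    for t, x, y in samples[1:]:
        t0, x0, y0 = kept[-1]
        dt = t - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        if math.hypot(x - x0, y - y0) / dt <= max_speed:
            kept.append((t, x, y))
    return kept


def moving_average(samples, window=5):
    """Centered moving average on x and y; timestamps are left unchanged."""
    half = window // 2
    smoothed = []
    for i, (t, _, _) in enumerate(samples):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        xs = [s[1] for s in samples[lo:hi]]
        ys = [s[2] for s in samples[lo:hi]]
        smoothed.append((t, sum(xs) / len(xs), sum(ys) / len(ys)))
    return smoothed
```

At a 10 Hz sampling rate, a 5-sample window averages over roughly half a second, which damps jitter without erasing short pauses; a Kalman filter would be a common alternative where anchor geometry varies.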

3.3.3. Multimodal Data Integration

Multimodal datasets were integrated within a 3D digital twin environment developed in Blender. The integration workflow comprised three sequential stages:
(1) Spatial registration. Outputs from YOLO-based image classification and ultra-wideband (UWB) pedestrian trajectories were georeferenced and aligned to the LiDAR-derived coordinate system to ensure spatial consistency.
(2) Data integration framework. A spatial–emotional–behavioral triadic matrix was constructed to couple location, affective expression, and movement dynamics.
  • Stay-point detection. A trajectory sample was classified as a stay when the instantaneous speed was below 0.3 m/s for a continuous duration of ≥10 s.
  • Behavioral intensity. Dwell-time–weighted kernel density estimates were calculated on a uniform analysis grid, and the resulting surfaces were z-standardized.
  • Sentiment intensity. Probabilities from the BERT sentiment classifier were mapped to +1 (positive), 0 (neutral), −1 (negative), spatially smoothed, and z-standardized.
  • Coupling rules. High and low categories were defined according to the upper and lower terciles of the respective z-scores. A cell was categorized as Low behavior–High emotion (LB–HE) if its behavior z-score fell in the lower tercile while its sentiment z-score fell in the upper tercile; the inverse combination was classified as High behavior–Low emotion (HB–LE).
  • Comparability and visualization. Since data for individual squares were collected on different dates and within varying time windows, raw intensity values were not directly comparable across sites. To ensure consistency, all thematic surfaces were displayed using a common standardized scale with tercile cutoffs explicitly reported in the legends. Comparisons should therefore be interpreted as within-site standardized contrasts rather than absolute cross-time differences.
(3) Coupling typologies. Based on the co-occurrence patterns of behavioral and sentiment intensities, spatial typologies—High behavior–High emotion and Low behavior–High emotion—were identified. Cross-intensity rules yield a typology reported in Section 4.3.
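The stay-point rule and the tercile-based coupling rules in stage (2) can be sketched as follows. This is a simplified illustration under stated assumptions rather than the authors' implementation: it takes per-cell behavior and sentiment intensities as already computed (the kernel density estimation and spatial smoothing are not shown), it assumes strictly increasing timestamps, and it uses `statistics.quantiles` with its default interpolation as the tercile definition. The function names and the ASCII labels such as `LB-HE` are hypothetical.

```python
import math
from statistics import mean, pstdev, quantiles


def detect_stays(samples, speed_max=0.3, min_duration=10.0):
    """Return (t_start, t_end) intervals with speed below speed_max (m/s)
    sustained for at least min_duration seconds.

    samples: (t, x, y) tuples with strictly increasing timestamps.
    """
    stays = []
    start = None
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        speed = math.hypot(x1 - x0, y1 - y0) / (t1 - t0)
        if speed < speed_max:
            if start is None:
                start = t0  # a slow run begins at this sample
        else:
            if start is not None and t0 - start >= min_duration:
                stays.append((start, t0))
            start = None
    if start is not None and samples[-1][0] - start >= min_duration:
        stays.append((start, samples[-1][0]))
    return stays


def zscores(values):
    """Standardize values to zero mean and unit (population) variance."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def tercile_level(z, low_cut, high_cut):
    """Label a z-score as Low, Medium, or High relative to tercile cutoffs."""
    if z <= low_cut:
        return "L"
    if z >= high_cut:
        return "H"
    return "M"


def couple(behavior, sentiment):
    """Label each grid cell, e.g. 'LB-HE' for Low behavior, High emotion."""
    zb, zs = zscores(behavior), zscores(sentiment)
    b_low, b_high = quantiles(zb, n=3)
    s_low, s_high = quantiles(zs, n=3)
    return [
        f"{tercile_level(b, b_low, b_high)}B-{tercile_level(s, s_low, s_high)}E"
        for b, s in zip(zb, zs)
    ]
```

Because the z-scores and cutoffs are computed from the values passed in, calling `couple` once per square yields the within-site standardized contrasts described above, whereas pooling all squares into one call would not.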

4. Results

4.1. Spatial Perception Patterns Based on Social Media

4.1.1. Image Recognition and Distribution of Spatial Imagery

We summarize image content into nine predefined classes (see Table 1, Figure 5). Analyses use the cleaned image set defined in Section 3.2. Key findings are as follows.
(1) Historical Architecture:
Historical buildings accounted for 14.75% of all recognized elements, making them one of the most prominent visual categories in the historic center. Notably, Company of Jesus Square (29.60%), Barra Square (25.22%), and St. Dominic’s Square (24.93%) showed markedly higher proportions than the average, indicating their strong historical recognizability and visual prominence for tourists. These sites serve as symbolic anchors within the urban spatial structure. In contrast, Senado Square (10.74%) and St. Augustine’s Square (14.86%) had considerably lower shares than these landmark sites, suggesting a need to strengthen their historical identity through spatial storytelling and enhanced interpretive systems.
(2) Modern Architecture:
This category had low visual representation at just 7.26%, with no site exceeding 10%. The consistently low percentages reflect the historic district’s high degree of architectural continuity and traditional urban fabric. While this reinforces cultural identity, it also highlights the importance of integrating modern elements cautiously in future functional upgrades, balancing visual coherence with practical needs within a historical context.
(3) Public Facilities:
Public facilities accounted for an average of 8.34% across all locations. St. Augustine’s Square (13.37%) and Cathedral Square (11.83%) were notably prominent, likely due to the density and frequent use of features such as signage, seating, and lighting. In contrast, Company of Jesus Square registered only 7.87%, suggesting a disconnect between its cultural significance and service infrastructure—indicating an opportunity to improve foundational visitor-support systems.
(4) Environmental Landscapes:
Representing only 1.82% of image content, environmental features were underrepresented. Exceptions included Cathedral Square (8.94%) and Barra Square (7.85%), which likely benefited from attractive elements such as greenery, fountains, or historical wall decor. The overall low proportion reveals systemic inadequacies in green infrastructure and ecological design within the historic district. Enhancements through pocket parks, vertical planting, or micro-landscape strategies are recommended for future improvements.
(5) Transportation:
Transport-related imagery accounted for 9.29% of total content. Camoes Square (20.53%) and Barra Square (17.22%) exhibited the highest representation, affirming their function as critical gateways and transit nodes. Conversely, Company of Jesus Square (8.45%)—despite being a key visual landmark—had relatively low transport visibility, suggesting a disconnect between spatial perception and access clarity. This highlights the need to improve the integration of wayfinding systems to align visual landmarks with mobility functions.
(6) Commercial Facilities:
Commercial spaces were the most frequently identified visual element, comprising 27.00% of all images. Concentrated particularly in Senado Square (36.11%), Company of Jesus Square (32.90%), and St. Dominic’s Square (27.38%), these zones represent the district’s core consumer landscape. While this affirms their function as commercial hubs, the intensity of commercial imagery also raises concerns about potential dilution of cultural symbolism due to over-commercialization.
(7) Food and Beverage:
Dining-related elements were the second most represented category at 18.26%. St. Augustine’s Square (23.35%) and Senado Square (22.61%) stood out for their strong association with culinary experiences. In contrast, Company of Jesus Square, despite its cultural prominence, showed only 6.63%, suggesting insufficient integration of food services within this historically significant area—highlighting opportunities for enhancing localized gastronomic representation.
(8) Cultural Facilities:
Cultural facilities comprised 8.26% of the imagery, with Cathedral Square (9.47%) and St. Dominic’s Square (8.93%) leading, likely due to proximity to churches, museums, and heritage sites. Senado Square had the lowest percentage at 3.19%, despite its heavy foot traffic. This disparity underscores the need to reinforce cultural visibility through signage, exhibitions, or interactive installations.
(9) Urban Ecology:
Urban ecological elements were the least frequently identified, comprising just 1.62% of total image content. However, Camoes Square (12.07%), Lilau Square (11.97%), and Cathedral Square (12.33%) demonstrated relatively high ecological visibility—possibly reflecting localized green interventions, ventilation corridors, or open spatial configurations. Overall, the scarcity of ecological imagery suggests the need for integrated strategies such as vertical greening or boundary landscaping to enhance ecological legibility within the historic core.

4.1.2. Sentiment Polarity Analysis

Sentiment polarity analysis based on the BERT model indicates that overall emotions expressed in Macau’s historic center are predominantly positive (58%), followed by neutral (35%) and negative (7%) sentiments. Notably, emotional tendencies vary significantly across different squares:
Senado Square exhibits intense and polarized emotional expression, manifesting a classic ‘dual-peak’ pattern. Negative sentiment keywords include “crowded,” “noisy,” and “queuing,” while positive expressions are dominated by “romantic,” “lighting,” and “aesthetic.”
St. Augustine’s Square and Camoes Square are characterized by low-density but high-quality emotional environments, often described with terms such as “tranquil,” “photo spot,” and “surprising,” indicating their potential as alternative experiential spaces.
Barra Square is frequently associated with “solemn,” “peaceful,” and “historical,” reflecting a blend of cultural and religious identity.
Company of Jesus Square, while visually prominent, exhibited a relatively neutral emotional profile, suggesting opportunities to enhance its affective resonance.
Additionally, the analysis of sentiment-related term frequency reveals a strong correlation between emotional expression and spatial experience. Common verb–emotion pairs include “queuing–annoying,” “check-in–worth it,” “photo–beautiful,” and “walking–tiring.” Temporal patterns also emerged: negative sentiment increases significantly during holidays, whereas positive sentiment becomes more prevalent during off-peak periods. This demonstrates the close relationship between spatial capacity and visitors’ emotional responses.

4.1.3. High-Frequency Keywords Analysis

Based on word frequency statistics and contextual analysis of the Weibo corpus, twenty high-frequency keywords were identified, spanning historical landmarks, local cuisine, tourism activities, and spatial imagery (Table 2). The co-occurrence relationships and spatial semantics of these terms reveal four primary dimensions of tourist perception in Macau’s historic center:
(1) Heritage Cognition Dimension:
Terms such as “Company of Jesus Square” (16.77%), “A-Ma Temple” (15.61%), “church,” and “cultural heritage” function as key cultural anchors. These words are often accompanied by evaluative terms like “solemn,” “peaceful,” and “worth seeing,” reflecting a deep emotional connection to religious structures, collective memory, and localized traditions. These sites not only serve as visual focal points but also act as symbolic pillars in the cognitive mapping of the historic area—constituting what may be referred to as Macau’s “cultural archetypes.”
(2) Sensory Experience Dimension:
Keywords such as “local cuisine” (14.03%), “Portuguese egg tart” (11.37%), “beverage,” and “buffet” construct a vivid gastronomic image of Macau. These terms frequently appear alongside expressions like “delicious,” “must-try,” and “every visit,” indicating that food serves as both a sensory pleasure and a cultural medium. Dining venues often co-occur with “streets” and “tourist photo,” underscoring their role as hybrid spaces for both social interaction and spatial experience—reinforcing Macau’s identity as a “City of Gastronomy.”
(3) Scenario Interaction Dimension:
Terms such as “Macau Tower,” “fireworks,” “tourist photo,” “scenic spot,” and “travel” highlight an experience-oriented structure rooted in eventful, immersive settings. Seasonal imagery, such as “fireworks” and “night view,” serves as a spatiotemporal memory anchor, while attractions like the Macau Tower reflect a strong preference for viewing, photographing, and interaction. These patterns suggest a shift from passive sightseeing to active participation, illustrating the rise of immersive and personalized tourism.
(4) Urban Atmosphere Dimension:
Words like “night view” (9.84%), “street” (8.23%), “shopping” (8.50%), and “crowded” (9.65%) portray the overall mood and sensory tone of the urban landscape. Although “crowded” carries a negative emotional connotation, it also symbolizes energy, vibrancy, and the celebratory spirit of place. These keywords reflect tourists’ collective impression of Macau as a dynamic and emotionally resonant cultural space marked by urban intensity and spatial rhythm.

4.1.4. Co-Occurrence Analysis of Perceptual Elements

To further elucidate tourist perception patterns of Macau’s key attractions, culinary offerings, and recreational experiences, this study conducted a co-occurrence frequency and semantic relevance analysis of Weibo text. The results reveal three major cognitive models of tourist experience structured around space, emotion, and behavior (Figure 6):
(1) Cultural Landmarks and Emotional Projection:
Iconic heritage sites such as the Company of Jesus Square and A-Ma Temple frequently co-occurred with emotionally charged terms like “photo,” “solemn,” “historic,” “prayer,” and “incense.” These spaces function as symbolic anchors for both spiritual engagement and social media “check-in” behavior, reinforcing identity and collective memory. In contrast, modern landmarks like Macau Tower were associated with terms such as “bungee jumping,” “thrill,” and “panoramic view,” reflecting a desire for adventure and visual conquest. Together, these sites represent emotional environments where tourists project their personal expectations onto symbolic spaces.
(2) Gastronomic Experience and Place Identity:
Local specialties—such as Portuguese egg tarts, pork chop buns, and double-skin milk—were closely linked with sensorial and emotional descriptors like “delicious,” “recommended,” “classic,” “refreshing,” and “milky aroma.” This underscores the role of food as a cultural interface: not only fulfilling physiological needs but also facilitating emotional comfort, memory formation, and local identity. Through gustatory engagement, tourists construct a pathway from sensory perception to cultural belonging.
(3) Activity Spaces and Consumer Behavior:
Functional spaces—such as casinos, shopping areas, and selfie hotspots—exhibited high semantic association with terms like “luxury,” “discount,” “duty-free,” “excitement,” and “tourist photo.” These keywords reflect behavior patterns centered on consumption and self-expression. Gambling is linked with sensory indulgence, shopping with material identity, and photography with social presentation. Together, they form a behavioral trajectory of “material consumption–emotional expression–social display.”
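The co-occurrence frequencies underlying these three models can be computed in several ways, and the exact procedure is not detailed here. A minimal sketch, assuming posts have already been tokenized (Chinese Weibo text would first need a word segmenter, which is not shown) and that a keyword vocabulary is given, counts the posts in which each pair of keywords appears together. The function name and data layout are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(tokenized_posts, vocabulary):
    """Count term and pair frequencies at the post level.

    tokenized_posts: iterable of token lists, one list per post.
    vocabulary: set of keywords to track.
    Returns (term_counts, pair_counts); pair keys are alphabetically
    ordered tuples so that each unordered pair is counted once.
    """
    term_counts = Counter()
    pair_counts = Counter()
    for tokens in tokenized_posts:
        present = sorted(set(tokens) & vocabulary)
        term_counts.update(present)
        pair_counts.update(combinations(present, 2))
    return term_counts, pair_counts
```

Raw pair counts favor frequent terms, so a normalized association measure such as pointwise mutual information is often computed from `term_counts` and `pair_counts` before interpreting semantic relevance.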

4.2. Tourist Trajectory Patterns and Environmental Emotion

4.2.1. High-Precision Trajectory Data and Spatial Stay Distribution

High-precision UWB trajectories reveal distinct vitality patterns across the historic center: yellow indicates high-vitality cores, red indicates medium-vitality transition zones, and blue indicates low-vitality peripheries (Figure 7 and Figure 8).
The distribution of high-frequency UWB trajectory data confirms this vitality segmentation. The historic center exhibits pronounced spatial heterogeneity, with core zones receiving concentrated foot traffic and extended stays, particularly near major landmarks and commercial corridors. Transitional areas serve as movement passages and short-term resting points, while peripheral regions show minimal tourist engagement, indicating spatial marginalization. The 3D scanned models support the following observations:
(1) High-Vitality Core Areas.
The Company of Jesus Square and its adjacent square together form one of the most iconic landmarks in Macau. Tourist vitality is concentrated around the staircase and the open space at the base of the façade. Large numbers of visitors gather here to take photographs and engage in social media “check-in” behaviors, producing the highest density of pedestrian traffic directly in front of the ruins. This pattern is driven primarily by the site’s distinctive Portuguese Baroque architecture and its historical significance. The open square design provides ideal viewing and photography angles, further reinforcing spatial congregation.
Senado Square, serving as Macau’s political and cultural hub, is surrounded by key buildings such as the Leal Senado Building (former City Hall), the Holy House of Mercy, and the Cathedral. Pedestrian flows are relatively evenly distributed but tend to cluster around the central plaza in front of the Leal Senado. As a multifunctional space and a key transit hub, this area maintains consistently high foot traffic throughout the day. The convergence of diverse functions and convenient transportation attracts both local residents and tourists for shopping, transit, and leisure.
Company of Jesus Square, home to the ruins of St. Paul’s and the nearby Na Tcha Temple, exhibits heightened vitality during weekend cultural fairs. Pedestrian activity peaks around market stalls and the cathedral ruins, with the majority of movement concentrated near the St. Paul’s façade. The diversity and interactivity of the cultural fair significantly enhance the vitality of the square, particularly on the side closest to the landmark. These periodic events effectively blend historical heritage with contemporary life, transforming the space into a vibrant public arena.
(2) Medium-Vitality Transitional Zones.
Camoes Square is known for its well-maintained greenery and comfortable seating arrangements. The spatial distribution of visitors is relatively balanced and dispersed, reflecting successful public space design. Higher pedestrian density near the bus terminal highlights its role as a transportation hub, while the even distribution within the park area suggests an effective layout of leisure facilities, creating a conducive environment for longer stays.
St. Augustine’s Square, featuring the Dom Pedro V Theatre and the St. Joseph’s Seminary and Church, serves as a cultural node. During performance events and festivals, crowds tend to concentrate in front of the theatre and at the plaza’s central zone. This spatial clustering illustrates the square’s role as a cultural activity center and underscores the architectural appeal of its surrounding heritage buildings. Pedestrian vitality radiates outward from the center, forming a culturally anchored activity pattern.
Camoes Square, centered around St. Anthony’s Church (Igreja de Santo António) and the Holy House of Mercy Museum, exhibits a vitality structure shaped by both tourism and religious activity. Tourists primarily gather in front of the church to take photos, while devotees are distributed within the church interior. The highest foot traffic occurs at the small plaza in front of the church, followed by adjacent streets. This dual-use pattern reflects the integration of sacred and commercial functions, with the religious architecture serving as the spatial core and generating continuous public engagement through the synergy of worship and tourism.
(3) Low-Vitality Peripheral Zones.
Despite its historical significance, Barra Square (in front of A-Ma Temple) experiences relatively low foot traffic due to its peripheral location. Tourists mainly concentrate around the main temple hall and the entrance to the Maritime Museum. This spatial pattern highlights the attraction of religious and cultural facilities while also revealing accessibility limitations, which may be the primary factor contributing to its overall low vitality.
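As a rough illustration of how dwell-based vitality tiers of this kind could be derived from raw UWB fixes, the following Python sketch sums dwell time per grid cell and bins each cell into a high, medium, or low tier. The record layout `(tag_id, t, x, y)`, the 2 m cell size, the 5 s gap limit, and the tier thresholds are illustrative assumptions, not parameters reported in this study.

```python
from collections import defaultdict
from math import floor


def dwell_by_cell(fixes, cell_size=2.0, max_gap=5.0):
    """Sum dwell time (seconds) per grid cell from UWB position fixes.

    fixes: iterable of (tag_id, t_seconds, x_m, y_m) records (assumed layout).
    The interval between two consecutive fixes of the same tag is credited
    to the cell of the earlier fix. Intervals longer than max_gap are
    treated as signal loss and ignored.
    """
    tracks = defaultdict(list)
    for tag, t, x, y in fixes:
        tracks[tag].append((t, x, y))

    dwell = defaultdict(float)
    for track in tracks.values():
        track.sort()  # order each tag's fixes by time
        for (t0, x0, y0), (t1, _, _) in zip(track, track[1:]):
            dt = t1 - t0
            if 0 < dt <= max_gap:
                cell = (floor(x0 / cell_size), floor(y0 / cell_size))
                dwell[cell] += dt
    return dict(dwell)


def vitality_tier(seconds, high=60.0, medium=15.0):
    """Map accumulated dwell time to a tier (hypothetical thresholds)."""
    if seconds >= high:
        return "high"
    if seconds >= medium:
        return "medium"
    return "low"


# Hypothetical fixes from two tags, all falling in the same 2 m cell.
example_fixes = [
    ("a", 0, 0.5, 0.5), ("a", 1, 0.6, 0.5), ("a", 2, 1.0, 0.5),
    ("b", 0, 0.2, 0.3), ("b", 4, 0.4, 0.3),
]
example_dwell = dwell_by_cell(example_fixes)
example_tiers = {cell: vitality_tier(s) for cell, s in example_dwell.items()}
```

Crediting each interval to the earlier fix keeps the sketch simple; a production pipeline would more likely interpolate between fixes and calibrate the thresholds against observed distributions.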

4.2.2. Coupling Between Stay Behavior and Spatial Experience

Using the integrated dataset defined in Section 3.3.3, we spatially joined image classes with dwell-weighted stays to compare redistributions (Table 3) against the raw image shares (Table 1).
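The comparison between raw and dwell-weighted shares can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure of Section 3.3.3: each image is assumed to carry a class label and a spatial cell identifier, and each cell a total dwell time in seconds; the names `raw_shares` and `dwell_weighted_shares` are hypothetical.

```python
from collections import Counter


def raw_shares(images):
    """Share of each image class by simple count.

    images: list of (class_label, cell_id) pairs (assumed layout).
    """
    counts = Counter(label for label, _ in images)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def dwell_weighted_shares(images, dwell_seconds):
    """Share of each image class when every image is weighted by the
    dwell time recorded in the cell where it was taken."""
    weights = Counter()
    for label, cell in images:
        weights[label] += dwell_seconds.get(cell, 0.0)
    total = sum(weights.values())
    if total == 0:
        return {}
    return {label: w / total for label, w in weights.items()}


# Hypothetical data: two cells, four images.
images = [("historic", "A"), ("historic", "A"),
          ("landscape", "B"), ("facility", "B")]
dwell = {"A": 30.0, "B": 90.0}

raw = raw_shares(images)
weighted = dwell_weighted_shares(images, dwell)
# Historic architecture falls from 0.5 of the raw images to 0.25 once
# weighted, because cell B holds three times the dwell time of cell A.
```

The difference between the two dictionaries is the kind of redistribution that a dwell-weighted table would show relative to raw image shares.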
Trajectory data further confirm that spatial experiences in the historic center exhibit clear hierarchical and heterogeneous patterns. Core landmarks such as the Ruins of St. Paul’s and Senado Square show dense and complex pedestrian flows, forming a “dual-core center” of visual and activity perception. Other squares display varying degrees of marginalization or functional clustering. For instance, Cathedral Square shows a high proportion of historic building recognition (40.3%), signifying strong cultural identity. In contrast, Lilau Square excels in landscape (23.6%) and public facility recognition (20.9%), indicating its role as a visually dominant leisure space.
The linear aggregation of trajectory paths also reinforces spatial cognition networks. The cultural corridor linking Senado Square and the Company of Jesus Square forms a continuous perceptual sequence in tourist narratives, and its high path density deepens the experience of Macau’s historical texture. In contrast, spaces like Barra Square, which are rich in transport infrastructure but weak in cultural expression, act as “transitional passages”: accessible but emotionally detached.
Finally, the distribution of movement patterns reveals an evolving rhythm of spatial and psychological transition. Tourists typically move from bustling main axes to quieter cultural and ecological nodes such as Lilau Square and St. Lazarus’ Theatre, forming experiential rhythms like “dense–sparse–dense” or “dynamic–static–dynamic.” In particular, green spaces such as Lilau Square show low-density but evenly distributed flows and high ecological recognition, creating a pleasant, accessible, and calming spatial image. These areas complement commercial zones by offering psychological respite and enhancing the overall diversity and rhythm of tourist experiences.

4.3. Trinary Coupling of Space, Emotion, and Behavior Under Multimodal Integration

(1) Spatial Overlap of Visual Attention and Behavioral Density.
At key nodes—such as the Company of Jesus Square and Senado Square—image tags (e.g., “historical architecture,” “commercial facility”) align with UWB convergence, forming a high-attention/high-density pattern. These nodes frequently co-occur with positive sentiments (e.g., “spectacular,” “worth visiting”), creating a perception–behavior–emotion feedback loop. This coupling is consistent with space-syntax claims that visual accessibility shapes aggregated behavior and, in turn, emotional activation.
(2) Synergistic Enhancement Between Behavioral Intensity and Emotional Expression.
Trajectory analysis reveals that areas with high behavioral intensity are generally accompanied by a greater prevalence of positive emotional expression, while transitional or low-retention spaces yield relatively neutral sentiment. For example, in Senado Square, where foot traffic is dense, positive expressions such as “vibrant” and “good atmosphere” dominate, reflecting the social aggregation effect that enhances emotional experiences in high-traffic areas. However, these zones also frequently feature negative terms like “long queue” and “crowded,” indicating that the relationship between behavior and emotion is not linear but rather exhibits an “optimal experience threshold.” When this threshold is exceeded, over-stimulation may lead to frustration and fatigue.
(3) Asymmetric Relationship Between Emotional Hotspots and Behaviorally Sparse Zones.
Certain locations, such as Barra Square, exhibit only moderate performance in image recognition and trajectory density but score relatively high in sentiment polarity analysis. Common descriptors include “tranquil,” “strong cultural atmosphere,” and “pleasant surprise.” These “low behavior–high emotion” sites reveal tourists’ heightened sensitivity to immersive and personalized spatial experiences. In particular, religious heritage sites often stimulate emotional resonance through symbolic elements, narrative settings, and contextual immersion, functioning as emotional anchors in the spatial memory landscape.
(4) Typology of Coupling Patterns.
Based on the cross-intensity of the three data layers (visual semantics, behavioral trajectories, and emotional expression), this study identifies three typical coupling models of space–emotion–behavior (Table 4).
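To make the idea of cross-intensity concrete, the Python sketch below assigns a site to one of the three coupling types from three normalized scores. The 0 to 1 score scale, the cut-offs of 0.33 and 0.66, the decision rules, and the fallback label `"unclassified"` are all assumptions made for illustration; they are not the criteria of Table 4.

```python
def coupling_type(visual, behavior, emotion, low=0.33, high=0.66):
    """Assign a coupling type from three scores normalized to [0, 1].

    visual:   image-recognition salience of the site
    behavior: trajectory density or dwell intensity
    emotion:  sentiment polarity score
    The thresholds and rules are hypothetical, chosen only to show how
    three layers could be crossed into a small typology.
    """
    if behavior >= high and emotion >= high:
        return "high behavior-high emotion"
    if behavior <= low and emotion >= high:
        return "low behavior-high emotion"
    if all(low < score < high for score in (visual, behavior, emotion)):
        return "balanced immersion"
    return "unclassified"


# Hypothetical sites with made-up scores (visual, behavior, emotion).
sites = {
    "site A": (0.9, 0.8, 0.7),
    "site B": (0.4, 0.2, 0.8),
    "site C": (0.5, 0.5, 0.5),
}
labels = {name: coupling_type(*scores) for name, scores in sites.items()}
```

A fixed-threshold rule like this is easy to audit, but the boundaries are arbitrary; a data-driven alternative such as clustering would avoid hand-picked cut-offs at the cost of interpretability.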

5. Discussion

(1) Theoretical and Methodological Innovation.
Conventional studies of tourist behavior—often based on surveys, interviews, or single-source sensing—struggle to capture real-time responses and dynamic trajectories [41]. Inferring hotspots from photo density or check-ins may conflate visual salience with experiential quality, obscuring where visibility-driven appeal crosses comfort thresholds. Our triadic coupling framework—linking visibility, movement, and affect—clarifies this mismatch by separating vibrant-but-ambivalent corridors from low-traffic yet emotionally resonant nodes, reconciling discrepancies between UGC-derived maps and satisfaction evidence. This interpretation aligns with configurational accounts in space syntax in which visibility, integration, and landmark salience shape pedestrian flows beyond iconicity alone.
Methodologically, combining UWB trajectories with a digital-twin environment overcomes temporal granularity limits of social sensing and enables scenario testing for crowding mitigation and narrative enhancement. The identification of ambivalent hotspots—sites that are photographically salient but not uniformly satisfying—supports the proposition that visibility-driven attractions may saturate comfort thresholds, decoupling where people look from where they prefer to dwell [42,43,44]. Moderately improving accessibility (e.g., network continuity, micro-wayfinding) can raise perceived coherence when coupled with interpretive services and amenities, without necessarily amplifying negative crowding effects.
(2) Heritage Conservation and Cultural Spatial Revitalization.
Tourists show strong spatial attention and emotional resonance at iconic heritage landmarks such as Company of Jesus Square and A-Ma Temple, which function as symbolic anchors in collective experience [45]. Configurational structure helps explain why landmark corridors remain congested even when alternative scenic detours exist: path hierarchy and node integration steer flows beyond landmark iconicity [46]. Yet this focused attention generates congestion, service overload, and mixed emotions, exposing the vulnerability of a point-based tourism model [47].
We therefore advocate a point–line–surface approach. Instead of isolated site visitation, narrative itineraries should be developed and supported by pedestrian networks, visual wayfinding, and point-of-interest triggers to deepen cultural immersion [48]. Peripheral nodes (e.g., Camoes Square) can be revitalized via intangible cultural activities—local cuisine, handicrafts, and curated exhibitions—to redistribute flows and reduce pressure on core attractions [49].
(3) Urban Governance Strategies.
Joint analysis of behavior and affect highlights high-ambivalence areas (e.g., egg-tart streets, casino entrances) where sentiments such as “crowded but enjoyable” co-occur, indicating that tourist memories are shaped by intertwined positive and negative appraisals [50]. Governance should thus go beyond discomfort reduction to designing the perceptual journey [51]. Four priorities follow:
  • Flow redistribution along the main axis and into peripheries. The corridor from Company of Jesus Square to Senado Square dominates activity, while A-Ma Temple and Camoes Square—though less trafficked—offer rich cultural and ecological experiences. Strengthening accessibility and continuity, removing barriers, and enhancing wayfinding and amenities can integrate routes, nodes, and edges into a cohesive structure. In fragmented spaces, pedestrian–vehicle separation and threshold design can improve walkability and place identity [52].
  • Activation of “low-traffic/high-perception” areas. Alleys and peripheral zones—side streets around St. Dominic’s, green spaces near Camoes Square, the religious settings of A-Ma Temple—are well suited for placemaking (street art, night markets, ecological interpretation, cultural showcases) that enrich experience off the main circuit. Integrating culinary culture (egg tarts, pork-chop buns) with spatial storytelling strengthens cultural attachment and extends emotional memory [53,54].
  • Time- and cohort-sensitive crowd management. Peak-holiday loads strain spatial capacity at Company of Jesus Square and Senado Square. Soft guidance, real-time information, route modulation, and personalized itineraries by resident/tourist cohort can enhance differentiation and dispersion [55].
  • Cultural governance and participation. As a multicultural hub, Macau should balance heritage preservation with cultural expression, avoiding commodification and dilution. Policy incentives for local artisans, creative industries, and community-based food services can thicken the spatial economy and cultural atmosphere. Co-governance with residents, religious communities, and tourism stakeholders—learning from global historic cities—supports a model that preserves authenticity while embracing innovation [56,57].
(4) Digital-Twin-Enabled Decision Support.
Embedding these insights in a city-scale digital twin operationalizes triadic coupling for governance [58]. Managers can simulate alternative wayfinding networks to connect green spaces and cultural nodes; redistribute flows away from congested landmark corridors; and test microclimatic interventions (shading, ventilation, misting) to enhance comfort [59]. The model also evaluates pause nodes (seating, water points, viewpoints) and facility reconfiguration to reduce queuing bottlenecks. Enhancing cultural legibility—via narrative cues, discrete signage, AR prompts, and view-framing—can convert photographic salience into spaces where visitors choose to dwell [60]. A governance dashboard can monitor dwell-time-weighted vitality, sentiment deviations from baseline, and salience–experience mismatches, alerting managers when crowding or thermal thresholds are approached. Iterative scenario testing across times of day, routing options, and distributions of food, beverage, and amenities supports low-impact, reversible adjustments that respect authenticity while improving experience. Robustness requires multilingual, multi-platform monitoring, privacy-by-design, and coordination with heritage authorities [61].
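The alerting logic of such a dashboard could look roughly like the Python sketch below, which flags a site when occupancy approaches capacity or when current sentiment falls well below its historical baseline. The function names, the 80% occupancy ratio, and the z-score limit of -2 are hypothetical choices for illustration, not values derived from this study.

```python
from statistics import mean, pstdev


def sentiment_deviation(current, baseline):
    """Z-score of the current mean sentiment against a baseline window.

    Returns 0.0 when the baseline has no variance, so that a flat
    baseline never triggers an alert on its own.
    """
    spread = pstdev(baseline)
    if spread == 0:
        return 0.0
    return (current - mean(baseline)) / spread


def site_alerts(occupancy, capacity, current_sentiment, baseline,
                crowd_ratio=0.8, z_limit=-2.0):
    """Return the list of alerts raised for one site (assumed thresholds)."""
    raised = []
    if occupancy >= crowd_ratio * capacity:
        raised.append("crowding")
    if sentiment_deviation(current_sentiment, baseline) <= z_limit:
        raised.append("sentiment drop")
    return raised


# Hypothetical readings for one site.
baseline_scores = [0.6, 0.7, 0.8]
busy = site_alerts(85, 100, 0.5, baseline_scores)
calm = site_alerts(50, 100, 0.7, baseline_scores)
```

Thermal thresholds or salience–experience mismatches could be added as further checks in the same pattern, each with its own calibrated limit.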

6. Conclusions

Core findings. Tourist experience in the Historic Centre of Macau emerges from the intertwined dynamics of spatial visibility, movement intensity, and affective expression. The proposed coupling typology differentiates high behavior–high emotion, low behavior–high emotion, and balanced immersion models, explaining why landmark corridors may be vibrant yet ambivalent, whereas peripheral nodes can sustain high emotional resonance despite modest footfall.
Theoretical implications. By jointly modeling seeing–feeling–moving at the streetscape scale, this study advances a micro-level account of how configurational properties and cultural symbolism co-produce experiential quality. It extends emotional geography with configurational metrics and clarifies the conditions under which visual salience amplifies—or undermines—willingness to dwell.
Methodological contributions. The integration of image recognition, BERT-based sentiment analysis, and UWB trajectories within a georegistered digital twin yields reproducible, fine-grained insights that single-source approaches cannot provide, enabling threshold detection and scenario testing for design and management.
This study has several limitations. Sentiment analysis was constrained by the capacity of the BERT model, suggesting that multidimensional and multilingual schemes could enhance classification accuracy. The trajectory dataset was limited by the composition of volunteers and the observation duration; integrating mobile application logs or Bluetooth-based positioning could expand spatial and temporal coverage. The one-year observation window did not capture responses to seasonal festivals or unexpected events. Reliance on Chinese social media facilitated data access but introduced language–platform bias, potentially underrepresenting non-Chinese visitor segments; integrating multilingual, multi-platform sources could improve representativeness and cross-cultural generalizability.
Practical guidance and future directions. For landmark corridors, pair visibility management with comfort safeguards (microclimate, queuing, resting niches) to prevent ambivalence; for peripheral nodes, deploy interpretive services and wayfinding to convert symbolic resonance into dwell time. Future work will compare cities and cultural contexts, integrate interviews and psychological measures to probe mechanisms, and leverage large language models and multimodal generators for real-time path prediction, emotion-aware services, and adaptive crowd management.

Author Contributions

Conceptualization, D.A.; methodology, D.A.; software, D.A.; validation, D.K. and Y.T.; formal analysis, D.K.; investigation, D.K.; resources, Y.T.; data curation, Y.T.; writing—original draft preparation, D.A.; writing—review and editing, F.Z.; visualization, D.A.; supervision, F.Z.; project administration, F.Z.; funding acquisition, D.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Shenzhen University (803-0000340939, 701-000381), NSFC (5210080700), and the Shenzhen Science and Technology Program (ZDSYS20210623101534001).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Lynch, K. The Image of the City; MIT Press: Cambridge, MA, USA, 1960.
  2. Batty, M. Digital twins. Environ. Plan. B Urban Anal. City Sci. 2018, 45, 817–820.
  3. United Nations. Transforming Our World: The 2030 Agenda for Sustainable Development (A/RES/70/1); United Nations General Assembly: New York, NY, USA, 2015.
  4. Forsyth, A. What is a walkable place? The walkability debate in urban design. Urban Des. Int. 2015, 20, 274–292.
  5. Blumer, H. Symbolic Interactionism: Contemporary Sociological Theory; Wiley-Blackwell: Oxford, UK, 2012; p. 62.
  6. Crawford, K. The Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence; Yale University Press: New Haven, CT, USA, 2021.
  7. Lin, M.S.; Liang, Y.; Xue, J.X.; Pan, B.; Schroeder, A. Destination image through social media analytics and survey method. Int. J. Contemp. Hosp. Manag. 2021, 33, 2219–2238.
  8. Jiang, K. Meet West in East: New Routing of Old Footprints in the Heritage Tourism in Macao. Springer Proc. Bus. Econ. 2023, 2, 349–363.
  9. Teng, Y.; Huang, Y.; Xie, Z.; Hu, Y. Research on the Largo and architectural landscape of Macau from the perspective of historical layering. Appl. Math. Nonlinear Sci. 2022, 7, 675–684.
  10. Chang, D.; Penn, A. Integrated multilevel circulation in dense urban areas: The effect of multiple interacting constraints on the use of complex urban areas. Environ. Plan. B Plan. Des. 1998, 25, 507–538.
  11. Hillier, B. Space Is the Machine: A Configurational Theory of Architecture; Space Syntax: London, UK, 1996.
  12. Turner, A.; Doxa, M.; O’Sullivan, D.; Penn, A. From isovists to visibility graphs: A methodology for the analysis of architectural space. Environ. Plan. B Plan. Des. 2001, 28, 103–121.
  13. Hillier, B.; Hanson, J. The Social Logic of Space; Cambridge University Press: Cambridge, UK, 1984.
  14. Li, Y.; Xiao, L.; Ye, Y.; Xu, W.; Law, A. Understanding tourist space at a historic site through space syntax analysis: The case of Gulangyu, China. Tour. Manag. 2016, 52, 30–43.
  15. García-Palomares, J.C.; Gutiérrez, J.; Mínguez, C. Identification of tourist hot spots based on social networks: A comparative analysis of European metropolises using photo-sharing services and GIS. Appl. Geogr. 2015, 63, 408–417.
  16. Xiao, X.; Fang, C.; Lin, H.; Chen, J. A framework for quantitative analysis and differentiated marketing of tourism destination image based on visual content of photos. Tour. Manag. 2022, 93, 104585.
  17. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, real-time object detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
  18. Li, H.; Ma, Y.; Ma, Z.; Zhu, H. Weibo Text Sentiment Analysis Based on BERT and Deep Learning. Appl. Sci. 2021, 11, 10774.
  19. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the NAACL-HLT 2019, Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186.
  20. Heikinheimo, V.; Tenkanen, H.; Bergroth, C.; Järv, O.; Hiippala, T.; Toivonen, T. Understanding the use of urban green spaces from user-generated geographic information. Landsc. Urban Plan. 2020, 201, 103845.
  21. Qanazi, S.; Leclerc, E.; Bosredon, P. Integrating Social Dimensions into Urban Digital Twins: A Review and Proposed Framework for Social Digital Twins. Smart Cities 2025, 8, 23.
  22. Hu, F.; Li, Z.; Yang, C.; Jiang, Y. A graph-based approach to detecting tourist movement patterns using social media data. Cartogr. Geogr. Inf. Sci. 2018, 46, 368–382.
  23. Lin, H.; Wen, H.; Zhang, D.; Yang, L.; Hong, X.; Wen, C. How Social Media Data Mirror Spatio-Temporal Behavioral Patterns of Tourists in Urban Forests: A Case Study of Kushan Scenic Area in Fuzhou, China. Forests 2024, 15, 1016.
  24. Van Der Zee, E.; Bertocchi, D. Finding patterns in urban tourist behaviour: A social network analysis approach based on TripAdvisor reviews. Inf. Technol. Tour. 2018, 20, 153–180.
  25. Jiang, W.; Xiong, Z.; Su, Q.; Long, Y.; Song, X.; Sun, P. Using Geotagged Social Media Data to Explore Sentiment Changes in Tourist Flow: A Spatiotemporal Analytical Framework. ISPRS Int. J. Geo-Inf. 2021, 10, 135.
  26. Gao, Z.; Zeng, H.; Zhang, X.; Wu, H.; Zhang, R.; Sun, Y.; Du, Q.; Zhao, Z.; Li, Z.; Zhao, F.; et al. Exploring tourist spatiotemporal behavior differences and tourism infrastructure supply–demand pattern fusing social media and nighttime light remote sensing data. Int. J. Digit. Earth 2024, 17, 2310723.
  27. Nica, E.; Popescu, G.H.; Poliak, M.; Kliestik, T.; Sabie, O.-M. Digital Twin Simulation Tools, Spatial Cognition Algorithms, and Multi-Sensor Fusion Technology in Sustainable Urban Governance Networks. Mathematics 2023, 11, 1981.
  28. Ai, D.; Wang, H.; Kuang, D.; Zhang, X.; Rao, X. Measuring pedestrians’ movement and building a visual-based attractiveness map of public spaces using smartphones. Comput. Environ. Urban Syst. 2024, 108, 102070.
  29. Tai, K.-W.; Liao, M.; Liu, X. Exploring the Convergence of Cyber–Physical Space: Multidimensional Modeling of Overtourism Interactions. Trans. GIS 2024, 28, 2425–2444.
  30. Valls, F.; Roca, J. Visualizing Digital Traces for Sustainable Urban Management: Mapping Tourism Activity on the Virtual Public Space. Sustainability 2021, 13, 3159.
  31. Huang, X.; White, M.; Langenheim, N. Towards an Inclusive Walking Community—A Multi-Criteria Digital Evaluation Approach to Facilitate Accessible Journeys. Buildings 2022, 12, 1191.
  32. Hu, X.; Shen, X.; Shi, Y.; Li, C.; Zhu, W. Multidimensional Spatial Vitality Automated Monitoring Method for Public Open Spaces Based on Computer Vision Technology: Case Study of Nanjing’s Daxing Palace Square. ISPRS Int. J. Geo-Inf. 2024, 13, 48.
  33. Burgos-Thorsen, S.; Munk, A.K. Opening alternative data imaginaries in urban studies: Unfolding COVID place attachments through Instagram photos and computational visual methods. Cities 2023, 141, 104470.
  34. Liu, Y.; Gao, S.; Yuan, Y.; Zhang, F.; Kang, C.; Kang, Y.; Wang, K. Methods of Social Sensing for Urban Studies. In Urban Remote Sensing: Monitoring, Synthesis, and Modeling in the Urban Environment, 2nd ed.; Yang, X., Ed.; John Wiley & Sons Ltd.: London, UK, 2021; pp. 71–89.
  35. Kovács, Z.; Vida, G.; Elekes, Á.; Kovalcsik, T. Combining Social Media and Mobile Positioning Data in the Analysis of Tourist Flows: A Case Study from Szeged, Hungary. Sustainability 2021, 13, 2926.
  36. Xing, X.; Yu, B.; Kang, C.; Huang, B.; Gong, J.; Liu, Y. The synergy between remote sensing and social sensing in urban studies: Review and perspectives. IEEE Geosci. Remote Sens. Mag. 2024, 12, 108–137.
  37. Hu, Y.; Zhan, X.; Ye, X. Mining tourist mobility patterns from geo-tagged social media data. Comput. Environ. Urban Syst. 2023, 98, 101950.
  38. Pei, F.; Jiang, R.; Zhuang, C.; Liu, J.; Yuan, M. Knowledge-graph-enhanced disturbance control in manufacturing systems: A state-of-the-art review. Int. J. Comput. Integr. Manuf. 2025, 1–27.
  39. Dang, N.H.; Maurer, O. Place-Related Concepts and Pro-Environmental Behavior in Tourism Research: A Conceptual Framework. Sustainability 2021, 13, 11861.
  40. Hu, Y.; Ding, J.; Dou, Z.; Chang, H. Short-Text Classification Detector: A Bert-Based Mental Approach. Comput. Intell. Neurosci. 2022, 2022, 8660828.
  41. Shoval, N.; Ahas, R. The use of tracking technologies in tourism research: The first decade. Tour. Geogr. 2016, 18, 587–606.
  42. Natapov, A.S.; Cohen, A.; Dalyot, S. Urban planning and design with points of interest and visual perception. Environ. Plan. B Urban Anal. City Sci. 2023, 51, 641–655.
  43. Chen, J.; Shoval, N.; Stantic, B. Tracking tourist mobility in the big data era: Insights from data, theory, and future directions. Tour. Geogr. 2024, 26, 1381–1411.
  44. Deng, N.; Quan, Y.; Cheng, X.; Qin, J. Seeing is visiting: Discerning tourists’ behavior from landmarks in ordinary photos. Curr. Issues Tour. 2022, 26, 2494–2512.
  45. Camagni, R.; Capello, R.; Cerisola, S.; Panzera, E. The Cultural Heritage–Territorial Capital nexus: Theory and empirics/Il nesso tra Patrimonio Culturale e Capitale Territoriale: Teoria ed evidenza empirica. Il Capitale Cult. Stud. Value Cult. Herit. 2020, 11, 33–59.
  46. Rasoolimanesh, S.M.; Lu, S.C. Enhancing emotional responses of tourists in cultural heritage tourism: The case of Pingyao, China. J. Herit. Tour. 2023, 19, 91–110.
  47. Wan, Y.K.P. Tourist accessibility of heritage spaces through the lens of spatial justice. Curr. Issues Tour. 2023, 27, 636–652.
  48. Szubert, M.; Warcholik, W.; Żemła, M. The Influence of Elements of Cultural Heritage on the Image of Destinations, Using Four Polish Cities as an Example. Land 2021, 10, 671.
  49. Cao, S.; FA, A.S.; Xu, Y. Enhancing Cultural Heritage Tourism through Market Innovation and Technology Integration. Evol. Stud. Imaginative Cult. 2024, 8, 122–131.
  50. Kim, J.-H.; Guo, J.; Wang, Y. Tourists’ negative emotions: Antecedents and consequences. Curr. Issues Tour. 2021, 25, 1987–2005.
  51. Wang, D.; Wang, Y.; Fu, X.; Dou, M.; Dong, S.; Zhang, D. Revealing the spatial co-occurrence patterns of multi-emotions from social media data. Telemat. Inform. 2023, 84, 102025.
  52. Wood, E.H. I Remember How We All Felt: Perceived Emotional Synchrony through Tourist Memory Sharing. J. Travel Res. 2020, 59, 1339–1352.
  53. Cheer, J.M.; Mary, M.; Lew, A.A. Cultural ecosystem services and placemaking in peripheral areas: A tourism geographies agenda. Tour. Geogr. 2022, 24, 495–500.
  54. Boffi, M.; Rainisio, N.; Inghilleri, P. Nurturing Cultural Heritages and Place Attachment through Street Art—A Longitudinal Psycho-Social Analysis of a Neighborhood Renewal Process. Sustainability 2023, 15, 10437.
  55. Pugalis, L. The culture and economics of urban public space design: Public and professional perceptions. Urban Des. Int. 2009, 14, 215–230.
  56. Fabbricatti, K.; Boissenin, L.; Citoni, M. Heritage Community Resilience: Towards new approaches for urban resilience and sustainability. City Territ. Archit. 2020, 7, 17.
  57. Rivero Moreno, L.D. Sustainable city storytelling: Cultural heritage as a resource for a greener and fairer urban development. J. Cult. Herit. Manag. Sustain. Dev. 2020, 10, 399–412.
  58. Shahat, E.; Hyun, C.T.; Yeom, C. City Digital Twin Potentials: A Review and Research Agenda. Sustainability 2021, 13, 3386.
  59. Peldon, D.; Banihashemi, S.; LeNguyen, K.; Derrible, S. Navigating urban complexity: The transformative role of digital twins in smart city development. Sustain. Cities Soc. 2024, 111, 105583.
  60. Bibri, S.E.; Huang, J.; Jagatheesaperumal, S.K.; Krogstie, J. The Synergistic Interplay of Artificial Intelligence and Digital Twin in Environmentally Planning Sustainable Smart Cities: A Comprehensive Systematic Review. Environ. Sci. Ecotechnol. 2024, 20, 100433.
  61. Shehadeh, A.; Alshboul, O.; Arar, M. Enhancing Urban Sustainability and Resilience: Employing Digital Twin Technologies for Integrated WEFE Nexus Management to Achieve SDGs. Sustainability 2024, 16, 7398.
Figure 1. Location of the study area.
Figure 2. Flow diagram of the research design.
Figure 3. Scene classification of social media images using the YOLO algorithm.
Figure 4. Object detection of primary subjects in Weibo-sourced images using YOLO, which reveals that the functional attribute of this street is mainly associated with transportation facilities.
Figure 5. Proportions of spatial elements derived from image recognition in the historic center.
Figure 6. Co-occurrence analysis of Weibo tourism-related terms.
Figure 7. 3D-scanned heat maps of tourist trajectories at Barra Square, St. Dominic’s Square, Lilau Square, and Cathedral Square. For each site, the panels show (from left to right) a top-view satellite image, an isometric 3D digital model, and a top-view pedestrian trajectory map.
Figure 8. 3D-scanned heat maps of tourist trajectories at St. Augustine’s Square, Company of Jesus Square, Senado Square, and Camoes Square. For each site, the panels show (from left to right) a top-view satellite image, an isometric 3D digital model, and a top-view pedestrian trajectory map.
Table 1. Shares of detected spatial elements in user-posted social media photos of Macau’s historic center.
| Category | Barra Square | St. Dominic’s Square | Lilau Square | Cathedral Square | St. Augustine’s Square | Company of Jesus Square | Senado Square | Camoes Square | Overall Share |
|---|---|---|---|---|---|---|---|---|---|
| Historical Architecture (%) | 25.216 | 24.925 | 19.583 | 19.968 | 14.858 | 29.604 | 10.738 | 16.426 | 14.746 |
| Modern Architecture (%) | 3.107 | 2.346 | 2.769 | 1.987 | 2.283 | 2.770 | 2.419 | 2.739 | 7.256 |
| Public Facilities (%) | 8.736 | 8.591 | 9.110 | 11.832 | 13.373 | 7.866 | 8.626 | 8.616 | 8.344 |
| Environmental Landscape (%) | 7.853 | 1.742 | 2.848 | 8.944 | 1.788 | 2.107 | 1.641 | 2.308 | 1.819 |
| Transportation (%) | 17.215 | 9.094 | 12.729 | 9.689 | 10.210 | 8.448 | 9.993 | 20.533 | 9.294 |
| Commercial Facilities (%) | 9.967 | 27.381 | 20.356 | 9.006 | 16.463 | 32.904 | 36.112 | 13.992 | 27.003 |
| Food and Beverage (%) | 11.367 | 12.142 | 11.308 | 12.546 | 23.345 | 6.633 | 22.614 | 13.876 | 18.263 |
| Cultural Facilities (%) | 6.880 | 8.932 | 6.167 | 9.472 | 7.930 | 5.648 | 3.190 | 6.372 | 8.260 |
| Urban Ecology (%) | 6.518 | 1.658 | 11.973 | 12.329 | 5.513 | 1.229 | 1.350 | 12.066 | 1.620 |
Note. Calculated from object-detection counts in geotagged photos for each square (numerator: valid detections of a given element; denominator: total valid detections of all elements in that square); a single photo may contain multiple elements, so detection counts can exceed photo counts; column totals may be <100% due to rounding and a small “other/unidentified” remainder.
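The share calculation described in the note to Table 1 can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `element_shares`, the input format (one label per valid detection), and the sample labels are assumptions made for illustration, and the sample values are not data from the study.

```python
from collections import Counter


def element_shares(detections):
    """Return each element's percentage share of all valid detections in one square.

    `detections` is a list of element labels, one per valid detection
    (hypothetical input format). A single photo may contribute several
    labels, so len(detections) can exceed the number of photos.
    """
    counts = Counter(detections)
    total = sum(counts.values())  # denominator: all valid detections in the square
    if total == 0:
        return {}
    return {label: round(100.0 * n / total, 3) for label, n in counts.items()}


# Made-up detections for one square (illustrative only).
sample = ["Historical Architecture"] * 5 + ["Transportation"] * 3 + ["Other"] * 2
shares = element_shares(sample)

# Reporting only the named categories leaves a remainder below 100%,
# mirroring the "other/unidentified" remainder mentioned in the note.
reported = {k: v for k, v in shares.items() if k != "Other"}
print(reported, sum(reported.values()))
```

Because the "Other" detections stay in the denominator but are not reported, the reported column sums to less than 100%, which is the behavior the note describes.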
Table 2. High-Frequency Keywords Related to Macau in Weibo Posts.
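| Rank | Keyword | Frequency |
|---|---|---|
| 1 | Company of Jesus Square | 16.77% |
| 2 | A-Ma Temple | 15.61% |
| 3 | Local Cuisine | 14.03% |
| 4 | Historic center of Macau | 13.42% |
| 5 | Travel | 12.58% |
| 6 | Macau Tower | 12.00% |
| 7 | Portuguese Egg Tart | 11.37% |
| 8 | Fireworks | 11.00% |
| 9 | Church | 10.56% |
| 10 | Tourist Attraction | 10.19% |
| 11 | Night View | 9.84% |
| 12 | Crowded | 9.65% |
| 13 | Culture | 9.35% |
| 14 | Beverage | 9.26% |
| 15 | Buffet | 9.05% |
| 16 | Tourist Photo | 8.94% |
| 17 | Museum | 8.71% |
| 18 | Shopping | 8.50% |
| 19 | Street | 8.23% |
| 20 | Cultural Heritage | 7.90% |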
Table 3. Shares of trajectory stay-samples associated with mapped spatial elements in Macau’s historic center.
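| Category | Barra Square | St. Dominic’s Square | Lilau Square | Cathedral Square | St. Augustine’s Square | Company of Jesus Square | Senado Square | Camoes Square |
|---|---|---|---|---|---|---|---|---|
| Historical Architecture (%) | 36.2 | 31.5 | 23.8 | 40.3 | 34.7 | 38.9 | 14.2 | 6.8 |
| Modern Architecture (%) | 2.8 | 2.3 | 5.7 | 3.5 | 11.2 | 5.8 | 11.3 | 3.7 |
| Public Facilities (%) | 8.5 | 3.7 | 20.9 | 10.8 | 8.6 | 8.4 | 12.8 | 11.2 |
| Environmental Landscape (%) | 12.7 | 10.8 | 23.6 | 12.5 | 14.3 | 19.2 | 10.5 | 14.6 |
| Transportation (%) | 10.3 | 2.5 | 9.4 | 2.3 | 2.4 | 2.6 | 14.7 | 22.1 |
| Commercial Facilities (%) | 2.5 | 24.2 | 3.8 | 0.7 | 10.5 | 14.3 | 28.6 | 3.5 |
| Food & Beverage (%) | 2.7 | 0.8 | 7.5 | 0.5 | 9.8 | 14.5 | 5.8 | 4.6 |
| Cultural Facilities (%) | 0.9 | 2.6 | 5.3 | 3.8 | 5.4 | 3.7 | 11.5 | 4.9 |
| Urban Ecology (%) | 7.4 | 5.6 | 10.2 | 5.6 | 5.1 | 5.6 | 5.6 | 14.6 |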
Note. Computed after a spatial join linking stay-samples to mapped elements within each square (numerator: dwell-time–weighted count of stay-samples associated with a given element; denominator: dwell-time–weighted total stay-samples in that square). Some stay-samples may not meet the join criteria and are recorded as unassigned, so column totals may be <100% due to rounding and unassigned samples.
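The dwell-time weighting described in the note to Table 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `dwell_weighted_shares`, the `(element, dwell_seconds)` tuple format, and the use of `None` for samples that failed the spatial join are all hypothetical, and the sample values are made up.

```python
from collections import defaultdict


def dwell_weighted_shares(stay_samples):
    """Return dwell-time-weighted percentage shares per mapped element.

    `stay_samples` is an iterable of (element, dwell_seconds) pairs for one
    square, assumed to be the output of a prior spatial join. `element` is
    None when a stay-sample could not be linked to any mapped element.
    """
    weights = defaultdict(float)
    total = 0.0
    for element, dwell in stay_samples:
        total += dwell  # denominator includes unassigned samples (assumption)
        if element is not None:
            weights[element] += dwell
    if total == 0:
        return {}
    return {e: round(100.0 * w / total, 1) for e, w in weights.items()}


# Made-up stay-samples for one square (illustrative only).
samples = [
    ("Historical Architecture", 120.0),
    ("Historical Architecture", 60.0),
    ("Transportation", 30.0),
    (None, 90.0),  # did not meet the join criteria
]
result = dwell_weighted_shares(samples)
print(result)
```

Keeping unassigned samples in the denominator is one reading of the note; it is what makes the reported shares for a square sum to less than 100%.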
Table 4. Typology of Space–Emotion–Behavior Coupling Patterns.
| Coupling Model Type | Spatial Characteristics | Behavioral Profile | Dominant Sentiment Pattern | Representative Sites |
|---|---|---|---|---|
| High behavior–High emotion Model | Landmark-centered, high visibility | Dense pedestrian aggregation | Positive-Dominant (vibrant, iconic) | Company of Jesus Square, Senado Square |
| Low behavior–High emotion Model | Peripheral but symbolic | Sparse but intentional visitation | Symbolic-Reflective (solemn, tranquil) | A-Ma Temple, Tap Seac Square |
| Balanced Immersion Model | Well-designed, accessible | Moderate engagement | Neutral to Mildly Positive | Camoes Square, St. Augustine’s Square |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Ai, D.; Kuang, D.; Tao, Y.; Zeng, F. Integrating Image Recognition, Sentiment Analysis, and UWB Tracking for Urban Heritage Tourism: A Multimodal Case Study in Macau. Sustainability 2025, 17, 7573. https://doi.org/10.3390/su17177573
