Article

Visitor Satisfaction at the Macau Science Center and Its Influencing Factors Based on Multi-Source Social Media Data

by Jingwei Liang 1, Qingnian Deng 1, Yufei Zhu 1, Jiahai Liang 2, Chunhong Wu 3, Liang Zheng 1,* and Yile Chen 1,*
1 Faculty of Humanities and Arts, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau 999078, China
2 Eastern Michigan Joint College of Engineering, Beibu Gulf University, 12 Binhai Avenue, Binhai New City, Qinzhou 535011, China
3 Department of Design, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul 06974, Republic of Korea
* Authors to whom correspondence should be addressed.
Information 2026, 17(1), 57; https://doi.org/10.3390/info17010057
Submission received: 15 December 2025 / Revised: 5 January 2026 / Accepted: 5 January 2026 / Published: 8 January 2026
(This article belongs to the Special Issue Social Media Mining: Algorithms, Insights, and Applications)

Abstract

With the rise of the experience economy and the popularization of digital technology, user-generated content (UGC) has become a core data source for understanding tourist needs and evaluating the service quality of venues. The Macau Science Center is a landmark venue that combines science education, interactive experience, and landscape viewing, and its service quality directly affects tourists’ travel experience and word-of-mouth dissemination. However, existing studies mostly rely on traditional questionnaire surveys and lack multi-technology collaborative analysis. To accurately identify the factors affecting satisfaction, this study uses 788 valid UGC entries collected from five major platforms, namely Google Maps reviews, TripAdvisor, Sina Weibo, Xiaohongshu (Rednote), and Ctrip, covering January 2023 to November 2025. It integrates word frequency analysis, semantic network analysis, latent Dirichlet allocation (LDA) topic modeling, and Valence Aware Dictionary and sEntiment Reasoner (VADER) sentiment computing to construct a systematic research framework. The study found that (1) the core attention dimensions of users cover the needs of parent–child and family visits, exhibitions and interactive experiences, ticketing and consumption services, surrounding environment and landscape, emotional evaluation, and recommendation intention. (2) The keyword association network gradually developed from a loose network in the early stage into a dense network centered on the comprehensive experience. (3) LDA analysis identified five main latent demand themes: comprehensive visiting experience and scenario integration, parent–child interaction and characteristic scenario experience, core venue facilities and ticketing services, visiting value and emotional evaluation, and transportation and surrounding landscapes. (4) User emotions were predominantly positive, accounting for 82.7%, while negative emotions were concentrated in local service details, and the emotional scores showed a fluctuating upward trend. This study provides targeted suggestions for the service optimization of the Macau Science Center and also provides a methodological reference for UGC-driven research in similar cultural venues.

1. Introduction

1.1. Research Background

With the rise of the experience economy and the popularization of digital technology, user-generated content (UGC) has become a core data source for understanding tourist needs and evaluating the service quality of destinations or venues [1]. Macau is an international destination that integrates historical culture and modern tourism, and tourism-related research on Macau focuses on historical heritage, festival activities, and smart tourism technology applications [2,3,4]. However, research on the factors influencing visitor satisfaction at cultural venues that combines multiple methods, including word frequency analysis, semantic networks, LDA (latent Dirichlet allocation) topic modeling, and VADER (Valence Aware Dictionary and sEntiment Reasoner) sentiment computing, is still insufficient. The Macau Science Center (Centro de Ciência de Macau), designed by Chinese American architect Ieoh Ming Pei, is a unique and iconic building that integrates science education, interactive experience, and landscape viewing [5,6]. Its service quality directly affects the overall travel experience of tourists and the spread of the venue’s reputation. As the core carrier of science popularization tourism in Macau, the Macau Science Center’s operational quality is not only related to meeting the science popularization needs of residents but also affects international tourists’ comprehensive evaluations of Macau’s tourism. However, most existing studies on science popularization venues focus on traditional museums. They lack both in-depth analysis of the factors influencing satisfaction in composite venues that combine science popularization, interaction, and landscape, and systematic analysis based on multi-source UGC data. Therefore, conducting multi-dimensional analysis of the operational activities of the Macau Science Center can not only fill the research gap in this subfield but also provide data support for the venue’s service optimization, which gives the work both theoretical and practical value. Moreover, existing research relies heavily on traditional questionnaire surveys, which makes it difficult to fully capture the implicit needs and emotional tendencies contained in UGC [7]. It is still necessary to combine multi-dimensional text analysis technology to accurately identify the driving factors of satisfaction from massive user feedback.

1.2. Literature Review

In tourism satisfaction research, scholars have confirmed that factors such as interactive experience, service facilities, and environmental atmosphere have a significant impact on tourist satisfaction [8,9]. For example, Pai et al. [10] found through their research on smart tourism technology in Macau that accessibility, interactivity, and other experience dimensions are directly related to tourist satisfaction and willingness to revisit. Leong et al. [11] pointed out that historical narratives and tour guide interactions indirectly promote the co-creation of value and satisfaction of cultural heritage tourists by enhancing the perception of local authenticity. In studies on venue-based tourism, Fan et al. [12] emphasized that generative needs and care and guidance for the next generation have a positive impact on the participation, experience quality, and psychological well-being of museum visitors, while the completeness of facilities and the professionalism of services are key to improving venue satisfaction [13]. Xi et al. [14] successfully identified the core dimensions of tourist perception through web text analysis in studies on ethnic minority village tourist destinations and verified the effectiveness of user-generated content (UGC) data in satisfaction research. Wei et al. [15] further demonstrated in their study on Xingping Ancient Town that UGC data can accurately capture the sources of tourists’ positive and negative perceptions. In the field of science museums, Zhang et al. [16] constructed a satisfaction evaluation system based on large-scale questionnaire data yet failed to fully leverage the advantages of dynamic feedback from UGC data. In research of other fields, for example, Ding [17] conducted a comment analysis of movie theme parks based on the latent Dirichlet allocation (LDA) model, and Tian et al. [18] carried out a study on the tourism service quality of Qingyan Ancient Town. 
Both studies used UGC data to identify core influencing factors such as service quality and supporting facilities, thus providing methodological references for research in the science museum field. Notably, Saoualih et al. [19] took the Majorelle Garden as the research subject and integrated Valence Aware Dictionary and sEntiment Reasoner (VADER) and LDA technologies with UGC data. They not only identified the core influencing factors of tourist satisfaction (e.g., exhibit attractiveness, service quality, ticket price) but also revealed the correlation rules between each factor and emotional tendency. Their research findings indicated that UGC data combined with multi-technical analysis enables a more comprehensive and in-depth analysis of the satisfaction formation mechanism of cultural and tourism venues, which provides direct methodological reference and practical support for relevant research on the Macau Science Center. However, existing research is mostly focused on historical museums or comprehensive scenic areas. For venues such as science museums that combine popular science and entertainment attributes, the specific factors affecting satisfaction have not been fully explored.
In the processing and analysis of user-generated content (UGC) data, various text mining techniques have been widely applied, among which ROST CM6, semantic network analysis, the latent Dirichlet allocation (LDA) model, and Valence Aware Dictionary and sEntiment Reasoner (VADER) sentiment analysis are commonly used methods. For example, Jia et al. [20] analyzed the UGC of the Historic Center of Macau using ROST CM6 (Version 6.0) and identified four perceptual dimensions: culture, landscape, and emotion. Li et al. [21] combined LDA topic modeling and sentiment analysis to reveal that foreign tourists’ dissatisfaction with Chinese tourist attractions was mainly attributed to service management and infrastructure.
As a fundamental method of text mining, word frequency analysis is widely employed in UGC data research to identify users’ core areas of concern. Guo [22] identified core dimensions of concern, such as “natural landscapes” and “service experiences”, through word frequency analysis in his study on the image perception of mountain tourist destinations. Similarly, the research of Dai et al. [23] on the Xingwen Stone Forest Scenic Area demonstrated that word frequency analysis can quickly identify high-frequency keywords in user feedback, laying a foundation for subsequent analyses.
Semantic network analysis, as a vital tool for revealing the relational structure of keywords, can intuitively display the correlation intensity and network structure among keywords through elements such as nodes and links. In their study of Xingping Ancient Town, Wei et al. [15] adopted semantic network analysis and found that the keywords in tourist reviews presented a hierarchical circular structure, with the core nodes of natural landscapes and service quality showing the highest correlation intensity. In the analysis of customer suggestions, semantic network analysis revealed close correlations among keywords including service, staff, and training, providing clear directions for service optimization. This method enables researchers to understand the internal connections of users’ core concerns and holds significant application value in satisfaction research on tourist destinations and service venues.
The latent Dirichlet allocation (LDA) topic model, as an unsupervised probabilistic generative model, possesses unique advantages in latent topic mining. Based on the LDA model, Ding [17] analyzed tourist reviews of movie theme parks and successfully identified core topics such as exhibit experience, service quality, and supporting facilities. Similarly, the research of Dai et al. [23] on the Xingwen Stone Forest Scenic Area demonstrated that the LDA model can extract topic dimensions, including “visiting experience” and “service guarantee”, from fragmented reviews. In research on historical and cultural venues, Li et al. [24] categorized tourist reviews of the Palace Museum into four major themes (i.e., entrance service, historical culture, experience perception, and heritage relics) via the LDA model, providing a reference for topic mining in similar venues. However, the application of the LDA model in existing studies mostly focuses on topic identification and lacks in-depth integration with emotional tendencies, which makes it challenging to reveal users’ emotional differences across different topics.
VADER (Valence Aware Dictionary and sEntiment Reasoner), as a lexicon- and rule-based sentiment analysis tool, is specifically designed for short texts such as social media content and reviews, featuring high computational efficiency and strong adaptability. Through comparative studies, LILI et al. [25] confirmed that VADER outperforms traditional machine learning models in short-text sentiment recognition; Youvan [26] systematically elaborated on the application value of VADER in cross-domain sentiment analysis. Saoualih et al. [19] employed the VADER tool to conduct sentiment analysis on TripAdvisor reviews of Majorelle Garden, classifying the reviews into three categories (i.e., positive, neutral, and negative). They revealed the changing trends of tourists’ emotions across different years through compound score calculation, and their findings indicated that VADER can accurately capture emotional tendencies in short texts. Furthermore, when combined with LDA topic modeling, it can further clarify the emotional distribution characteristics under different topics. However, in existing studies, VADER sentiment analysis is mostly used independently, lacking synergy with topic modeling and word frequency analysis. This deficiency makes it difficult to comprehensively reveal the correlation patterns between emotional tendencies, specific influencing factors, and latent topics.
These studies have verified the effectiveness of UGC text analysis in tourism research, but research that combines word frequency analysis, semantic networks, LDA topic modeling, and VADER sentiment calculation to systematically explore the factors influencing satisfaction at a single venue remains relatively scarce. Specifically, the existing studies have three limitations. First, there are few cases of integrating multiple technologies into satisfaction research on science museums. Most existing studies adopt only one or two technologies, making it difficult to conduct multi-dimensional and in-depth analysis of UGC data. Second, there is a lack of systematic integrated analysis of the relational structure of keywords, latent topics, and emotional tendencies, resulting in research conclusions that lack precision in guiding practice. Third, specialized satisfaction research targeting the Macau Science Center is relatively scarce, failing to meet the personalized service optimization needs of the venue.

1.3. Problem Statement and Objectives

This study takes the Macau Science Center as its research subject. Based on user-generated content (UGC) data, it integrates multiple techniques such as word frequency analysis, semantic network analysis, LDA topic modeling, and VADER sentiment analysis to construct a systematic research framework. The aim is to accurately identify users’ core attention dimensions regarding the Macau Science Center, uncover the underlying needs hidden behind the text, and clarify the correlation patterns between various influencing factors and users’ emotional tendencies. Ultimately, it provides targeted practical suggestions for optimizing the venue’s services and improving visitor satisfaction, while also offering methodological references for UGC-driven research at similar cultural venues.
Furthermore, the research questions focused on four aspects: (1) What are the core dimensions of user attention to the Macau Science Center? (2) What is the correlation strength and network structure of the core attention keywords? (3) What are the underlying needs and themes hidden behind user feedback? (4) What are the correlation patterns between various influencing factors and user emotional tendencies?
This study not only provides data support for the service decision-making of the Macau Science Center but also constructs a closed-loop of “social data collection—knowledge mining—decision application”. It offers a practical framework for cultural venues to optimize operational decisions using social media data, highlighting the core value of information in management decision-making.
To make the research logic clearer, the subsequent structure of this paper is arranged as follows: Section 2 introduces the overview of the study area, data sources and processing procedures, and core analysis techniques; Section 3 presents the empirical results of word frequency analysis, semantic network analysis, LDA topic modeling, and VADER sentiment analysis; Section 4 discusses the core findings in combination with existing research, explains the influencing mechanisms, proposes optimization strategies, and analyzes the research limitations; Section 5 summarizes the key conclusions and looks forward to future research directions.

2. Study Area and Methodology

2.1. Study Area: Macau Science Center

The research area of this study is the Macau Science Center, which opened to the public on 25 January 2010. Dedicated to popularizing scientific knowledge, especially for children and teenagers, it has been listed as a “National Science Popularization Education Base” and a “National Science Popularization and Education Base” by the China Association for Science and Technology (CAST), making it a significant science popularization venue in Macau [27,28]. The Macau Science Center was selected as the research object based on the following multi-dimensional rationale: (1) Local Dimension: As the only iconic venue integrating science popularization education, interactive experience, and landscape appreciation in Macau, its positioning as a National Science Popularization Education Base makes it the core carrier of the integration of tourism and science popularization in Macau, and its service quality directly affects the overall reputation of Macau’s tourism. In addition, its UGC data covers both domestic and foreign platforms, which can reflect the differences in demand among different tourist groups. (2) International Dimension: Designed by renowned architect I.M. Pei, the venue features the integration of Chinese and Western cultures, and its operational model has reference value for international counterparts among science popularization venues. Meanwhile, as an international tourist city, Macau attracts tourists from a wide range of sources, and the diversity and richness of UGC data provide sufficient samples for multi-source data analysis. (3) Methodological Adaptability: The diverse functions of the venue, including science popularization, interactive experience, and landscape appreciation, make its user feedback cover multiple dimensions such as service, experience, and environment. 
This is highly consistent with the multi-dimensional analysis framework of word frequency–semantic network–LDA–VADER in this study, which can fully verify the effectiveness of this methodology in the satisfaction research of complex venues.
The center is located on Avenida Dr. Sun Yat Sen in the Zona Nova de Aterros do Porto Exterior (NAPE) of the Macau Special Administrative Region, adjacent to the Macau Cultural Center and Macau Fisherman’s Wharf (Doca dos Pescadores de Macau). The site is bordered by land to the north and the sea to the south, situated at the intersection of natural and urban landscapes (Figure 1). The building has a total floor area of approximately 20,000 square meters and consists of three parts: an exhibition center, a planetarium, and a conference center. The surrounding transportation network is well-developed and convenient, with a bus stop at the main entrance of the center. Parking is also available for visitors arriving by car.

2.2. Data Sources and Processing

The data sources and processing section, as shown in Figure 2, comprises three parts: data sources, data processing, and data analysis.

2.2.1. Data Sources

The original data came from five major platforms: Google Maps reviews (https://www.google.com/maps/place/%E6%BE%B3%E9%97%A8%E7%A7%91%E5%AD%A6%E9%A6%86/@22.1862371,113.5554998,17.55z/data=!4m8!3m7!1s0x34017ac426801ad9:0x8f4e0b04342949b9!8m2!3d22.1860646!4d113.5568421!9m1!1b1!16s%2Fm%2F0fqnhhl?entry=ttu&g_ep=EgoyMDI1MTIwOS4wIKXMDSoASAFQAw%3D%3D, accessed on 14 December 2025), TripAdvisor (https://www.tripadvisor.com.hk/Attraction_Review-g664891-d2005224-Reviews-Macau_Science_Center-Macau.html, accessed on 14 December 2025), Sina Weibo, Xiaohongshu (Rednote), and Ctrip. A web crawler script was written using Python (Version 3.14.2) to specifically crawl user-generated content related to the Macau Science Center, including text data such as visitor reviews, experience sharing, and check-in notes, to ensure comprehensiveness and representativeness of the data.
Google Maps reviews are user feedback on businesses and locations, including ratings (1–5 stars), written experiences, and uploaded photos/videos [29]. These reviews appear on maps and in search results, serving as a crucial basis for business reputation and ranking. Users can write reviews, share photos, like reviews, or respond with emojis (such as “thumbs up,” “great,” etc.) [29,30]. Businesses can also reply to reviews to demonstrate their appreciation for customer feedback. In addition, TripAdvisor is an international travel review website that provides travel-related information such as hotels, attractions, and restaurants around the world, as well as interactive travel forums [31]. Since Macau is an international tourist city, and leaving reviews on Google Maps is common among international tourists, it became a significant source of overseas data for this study.
Sina Weibo is a user-relationship-based platform for information sharing, dissemination, and access [32]. Users can post updates, upload photos and videos, or live stream videos via web pages, WAP pages, and mobile applications, enabling instant sharing and interactive communication. As of the second quarter of 2024, Sina Weibo had 583 million monthly active users (MAU) and 256 million daily active users (DAU), boasting a massive and steadily growing user base [33]. Its continued focus on vertical sectors and top-tier content has solidified its leading position in China’s social media landscape. Similarly, Xiaohongshu (Rednote) is a lifestyle sharing community and social e-commerce platform primarily based on UGC [34,35]. Users share their shopping experiences and lifestyle insights into fashion, beauty, travel, food, and other fields through images, text, and short videos. They can also directly discover and purchase products on the platform. It combines the interactivity of social media with the convenience of e-commerce, making it particularly popular among young female users and earning it the reputation of a “product recommendation” mecca. Xiaohongshu boasts a massive user base, with 320 million MAU and approximately 120 million DAU as of mid-2024 [36]. Young women make up most of its users, with over 70% born after 1995 and a primary concentration in first- and second-tier cities. The platform offers rich content and strong search demand, with nearly 600 million daily searches in 2024. This naturally makes it one of the sources of comment data. Ctrip is a major online travel platform in China, offering comprehensive services [37]. Its review section is an important reference for understanding real travel experiences. 
Ctrip travel reviews mainly focus on its platform convenience (user-friendly interface, powerful search), product diversity (one-stop service for flights, hotels, train tickets, etc.), and service experience (guide quality, tour member quality, and itinerary arrangement). In summary, these five platforms encompass genuine reviews from tourists both domestically and internationally, providing a source of evaluation data for the Macau Science Center.

2.2.2. Data Preprocessing

To ensure the accuracy and effectiveness of subsequent analysis, the raw data was cleaned and standardized in multiple dimensions. The specific steps are as follows:
(1) Field processing: Useless fields irrelevant to the research were deleted, including redundant identifiers and irrelevant system parameters. Core fields such as text content, publication time, and platform source were normalized to unify the data format.
(2) Date filtering: The data time range was limited to January 2023 to November 2025 to ensure data timeliness.
(3) Noise removal: Emojis, special symbols, topic tags, and web links were removed in batches from the text. Invalid content consisting of pure images without text was also filtered to reduce noise interference.
(4) Deduplication: Duplicate data within and across platforms was removed to avoid data redundancy.
(5) Data merging: Since only three valid data entries remained after filtering on the TripAdvisor platform, the sample size was too small for separate analysis. The main reason is that there are relatively few reviews on TripAdvisor from January 2023 to November 2025, which may be related to travel usage after the COVID-19 outbreak. Therefore, data on TripAdvisor reviews was merged with data on the Google Maps platform reviews to form a unified dataset.
(6) Translation: This study uses Google Translate (https://translate.google.com/?hl=zh-TW&sl=auto&tl=en&op=translate, accessed on 4 January 2026) for the English translation of Chinese texts, and language detection is manually verified on the Google Translate platform. Due to differences in the consistency of text languages across different platforms (e.g., Chinese platforms are predominantly in Chinese, while international platforms are mainly in English), the target language for translation is uniformly set to English.
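Steps (2) to (4) above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the paper does not publish its cleaning rules, so the regular expressions, the record fields (`platform`, `date`, `text`), the end date of 30 November 2025, and the case-insensitive duplicate check are all hypothetical choices, and the standard library is used here in place of whatever tooling the authors actually applied.

```python
import re
from datetime import date

# Hypothetical patterns; the paper does not specify its exact cleaning rules.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#[^#\s]+#?")  # "#tag" and Weibo-style "#tag#"
# Rough emoji/symbol ranges (an assumption, not an exhaustive list).
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F]")

# Study window from the paper; the exact end day is assumed.
START, END = date(2023, 1, 1), date(2025, 11, 30)


def clean_text(text: str) -> str:
    """Strip links, topic tags, and emojis, then collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    text = EMOJI_RE.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


def preprocess(records):
    """Date-filter, clean, and deduplicate a list of review dicts."""
    seen = set()
    kept = []
    for rec in records:
        if not (START <= rec["date"] <= END):
            continue  # outside the study window
        text = clean_text(rec["text"])
        if not text:
            continue  # nothing left, e.g. an image-only or emoji-only post
        key = text.lower()
        if key in seen:
            continue  # duplicate within or across platforms
        seen.add(key)
        kept.append({"platform": rec["platform"], "date": rec["date"], "text": text})
    return kept


sample = [
    {"platform": "ctrip", "date": date(2024, 5, 1),
     "text": "Great for kids! 🚀 https://example.com #MacauTrip"},
    {"platform": "weibo", "date": date(2024, 5, 2), "text": "great for kids!"},
    {"platform": "google", "date": date(2022, 12, 31), "text": "Too early"},
    {"platform": "xiaohongshu", "date": date(2025, 1, 1), "text": "📷📷"},
]
cleaned = preprocess(sample)
```

In this toy sample only the first record survives: the second is a duplicate after cleaning, the third falls outside the date range, and the fourth is empty once emojis are removed.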

2.2.3. Data Statistics

After the above preprocessing, the effective data volumes from each platform are as follows: 303 entries from the merged dataset of Google Maps and TripAdvisor, 180 entries from Sina Weibo, 86 entries from Xiaohongshu, and 409 entries from Ctrip. Subsequently, the effective data from all platforms were aggregated and integrated, and all Chinese text was uniformly translated into English using Google Translate, ultimately yielding 788 effective data entries that meet the requirements of subsequent analysis. This provides high-quality data support for subsequent word frequency analysis, semantic network analysis, LDA topic modeling, and VADER sentiment calculation. Table 1 shows an example of original UGC texts matched with their translations.

2.3. Analysis Techniques

2.3.1. ROST CM6.0 Word Frequency Analysis

The construction process of the custom stopword file (Stopwords.txt) in this study is as follows: First, refer to general English stopword lists, such as the NLTK default stopword list, to screen basic function words (e.g., “the”, “and”, “in”); second, in combination with the characteristics of tourism review texts, manually add meaningless redundant words, such as numbers and time fragments like “14” and “21:”; third, exclude words related to the research topic, such as core terms like “macau”, “science”, and “center”, and finally form a custom list. The complete list is provided in Appendix A.
Word frequency analysis is a fundamental method in text mining. By statistically analyzing the frequency of words in a text, the core issues and key information that users are interested in can be quickly identified [19]. This study uses the ROST CM6.0 software for word frequency analysis. This tool is commonly used in text analysis and has mature applications in UGC-related research, such as tourism destination perception and policy text interpretation. It can efficiently complete functions such as word segmentation, stop word filtering, and word frequency statistics [7,38].
Based on the research needs and data characteristics, and referring to text preprocessing specifications, the following parameters were configured for the ROST CM6.0 software. First, the “case sensitive” option was unchecked to avoid statistical bias caused by differences in letter case. Second, a custom stop word list (the authors’ self-made Stopwords.txt file) was loaded to filter out meaningless function words such as “the” and “and” as well as redundant words irrelevant to the research. Third, the “merge word forms” function was checked to merge different forms of a word, such as plural and gerund forms, into a unified base form. This function is implemented via the built-in lemmatization algorithm of ROST CM6.0. It standardizes plural forms of nouns (e.g., exhibits to exhibit) and different tenses of verbs (e.g., played to play) into their respective base forms, thereby ensuring statistical consistency. Fourth, synonyms were manually merged, such as merging “child” and “kid” into “child”, to eliminate interference from synonymous words with different forms. For specific details of synonym merging, see Table 2. Finally, a word frequency threshold was set, and the top 60 most frequently occurring words were extracted and exported as a word frequency table to focus on the content of greatest interest to users. Domain-specific terms related to science museums, such as planetarium and exhibition hall, are all retained without being merged or deleted. Only synonymous expressions have been standardized; for example, “Macau Science Center” and “MSC” are consolidated into the single term “museum” to facilitate statistical work.
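The counting logic configured above can be approximated outside ROST CM6.0 with a short Python sketch. It is not the software’s actual algorithm: the stop word set is a tiny illustrative subset (the full list is in Appendix A), the `SYNONYMS` mapping is hypothetical apart from the “kid” to “child” merge mentioned in the text, and a simple lookup table stands in for the built-in lemmatization.

```python
import re
from collections import Counter

# Illustrative subset only; the paper's full stop word list is in Appendix A.
STOPWORDS = {"the", "and", "in", "a", "is", "to", "for"}

# Hypothetical mapping: "kid" -> "child" follows the text; the plural forms
# are added here as a crude stand-in for lemmatization.
SYNONYMS = {"kid": "child", "kids": "child", "children": "child"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (drops '14', '21:' etc.)."""
    return re.findall(r"[a-z]+", text.lower())


def word_frequencies(texts, top_n=60):
    """Return the top_n (word, count) pairs after merging and stop word filtering."""
    counts = Counter()
    for text in texts:
        for token in tokenize(text):
            token = SYNONYMS.get(token, token)  # merge synonyms and word forms
            if token in STOPWORDS:
                continue
            counts[token] += 1
    return counts.most_common(top_n)


reviews = [
    "The kids loved the planetarium",
    "A child and the planetarium exhibit",
]
top_words = word_frequencies(reviews, top_n=60)
```

Lowercasing before counting corresponds to leaving “case sensitive” unchecked, and `top_n=60` mirrors the threshold used in the study.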
To more intuitively present the distribution characteristics of high-frequency words, word cloud diagrams were drawn using the Matplotlib (https://matplotlib.org/, accessed on 14 December 2025) and wordcloud libraries in Python based on the word frequency data generated by ROST CM6.0. The font size of words in the word cloud diagram was positively correlated with their frequency of occurrence, with high-frequency core words highlighted in larger fonts, intuitively reflecting the focus of user attention and providing a foundation for subsequent sentiment analysis and topic mining [39,40].

2.3.2. Semantic Network Analysis

Semantic network analysis is a visualization analysis method based on word co-occurrence relationships. By using keywords as nodes and word co-occurrence relationships as edges, it intuitively presents the association strength of core concepts, network structure, and core node characteristics, thereby revealing the hidden semantic association patterns in the text [7,19]. This study uses semantic network analysis to explore the inherent association patterns of core keywords in UGC related to the Macau Science Center, clarify the association strength of each attention dimension, and provide network structure support for subsequent topic modeling and sentiment analysis [38].
This study uses the Python programming language combined with its third-party libraries to construct and visualize semantic networks. The key tool libraries involved include NetworkX (https://networkx.org/en/, accessed on 14 December 2025), Matplotlib, Pandas (2.3.3 version, https://pandas.pydata.org/, accessed on 14 December 2025), NLTK—Natural Language Toolkit (https://www.nltk.org/, accessed on 14 December 2025), etc. At the same time, keyword preprocessing optimization is performed by combining the word frequency analysis results of the ROST CM 6.0 software [39]. First, for text preprocessing parameters, load the custom stop word list ( S t o p w o r d s . t x t ), filter function words and redundant words, and convert all text to lowercase. Second, for node filtering parameters, in the overall semantic network, the minimum frequency threshold of keywords is set to four. In the time segment semantic network, it is divided into four time periods, namely Early, Mid1, Mid2, and Late, and the minimum frequency threshold is adjusted to three. To highlight the core association, only the top 30 core nodes in the network sorted by degree centrality are retained to construct the visualization network. Third, for edge filtering parameters, in the overall semantic network, the minimum threshold of co-occurrence weight between keywords is set to three, and the minimum threshold of co-occurrence weight in the time segment semantic network is set to two. Isolated nodes and associated edges with weights below the threshold are automatically deleted from the network to optimize the network structure. Fourth, for network construction parameters, an undirected graph structure is used to construct the semantic network; node attributes include size and centrality. Edge attributes are the co-occurrence frequency between keywords, serving as a quantitative basis for association strength. 
Finally, for layout and visualization, the spring_layout algorithm is used; the layout compactness parameter k is calculated based on 10.0 core nodes and is limited to the range 1.2 to 3.0. The number of iterations is set to 400, and the random seed is fixed at 42 to ensure repeatable results. The visualization results are then output.
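The node and edge filtering steps described above can be sketched in plain Python. This is a minimal illustration and not the authors' program: the function name `build_cooccurrence_network` is hypothetical, and counting co-occurrence once per document is an assumption, since the co-occurrence window is not specified in the text. The default thresholds are those reported for the overall network.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_network(docs, min_freq=4, min_weight=3, top_n=30):
    """docs: list of token lists, already lowercased and stop-word filtered."""
    # Node filter: keep keywords that reach the minimum corpus frequency.
    freq = Counter(tok for doc in docs for tok in doc)
    keep = {w for w, c in freq.items() if c >= min_freq}

    # Undirected co-occurrence: each unordered pair is counted once per
    # document (assumed window; the paper does not state it).
    weights = Counter()
    for doc in docs:
        present = sorted(set(doc) & keep)
        weights.update(combinations(present, 2))

    # Edge filter: drop pairs below the co-occurrence weight threshold.
    edges = {pair: w for pair, w in weights.items() if w >= min_weight}

    # Nodes without any surviving edge are isolated and never enter `degree`.
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1

    # Degree centrality: degree divided by (number of nodes - 1).
    n = len(degree)
    centrality = {w: d / (n - 1) for w, d in degree.items()} if n > 1 else {}

    # Keep the top_n nodes by centrality and the edges among them.
    ranked = sorted(centrality, key=lambda w: (-centrality[w], w))
    core = set(ranked[:top_n])
    core_edges = {p: w for p, w in edges.items() if p[0] in core and p[1] in core}
    return freq, centrality, core_edges
```

For the time-segment networks, the same function would be called with `min_freq=3` and `min_weight=2`. In NetworkX, the resulting edges could then be loaded into an undirected `Graph` and positioned with `spring_layout` using `iterations=400` and `seed=42`.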
The core value of semantic network analysis lies in transforming scattered keywords into a structured network model through co-occurrence relationships. Core nodes reflect the core issues that users care about, and the weights of related edges reflect the degree of association between issues [41]. The result provides intuitive visualization support and a quantitative basis for the subsequent analysis of the internal association mechanism of factors affecting satisfaction in this study.

2.3.3. LDA Model Analysis

Latent Dirichlet allocation (LDA) is an unsupervised probabilistic topic modeling method that can automatically mine hidden latent topics from massive amounts of unstructured text, revealing the inherent semantic structure of text data [42,43]. In tourism satisfaction research, the LDA model has been widely used to extract core demand topics from user-generated content (UGC), providing support for accurately identifying factors influencing satisfaction [44,45]. This study uses the LDA model to mine latent demand topics in UGC related to the Macau Science Center, clarifying the core dimension classification of user attention and laying the foundation for subsequent sentiment association analysis.
This study uses the Python programming language and its third-party library Gensim (https://github.com/piskvorky/gensim, accessed on 14 December 2025) to build and train the LDA model. This library has mature application cases in the field of topic modeling and can efficiently complete corpus construction, model training, and result output [46]. The analysis process strictly follows the technical framework of text preprocessing, corpus construction, model training, topic evaluation, and result interpretation. It refers to the standardized process of UGC text topic mining in existing studies [47,48].
Perplexity and coherence are crucial indicators for evaluating the latent Dirichlet allocation (LDA) topic model: the lower the perplexity, the better the model performance, and the higher the coherence, the better the model performance. Prior to conducting LDA topic model analysis, it is necessary to clarify the perplexity and coherence performance under different numbers of topics to ensure the rationality of LDA model training results. In this study, the Gensim library in Python was used to train the LDA model on the preprocessed valid data, and the perplexity and coherence indicators were calculated when the number of topics ranged from two to eight. Perplexity showed a trend of a slight initial increase, followed by a continuous decline as the number of topics increased; coherence exhibited fluctuating characteristics: it reached the highest level when the number of topics was two, dropped significantly when the number of topics increased to three, and then reached a stage-level high at five topics (Figure 3). Based on the evaluation criteria, when the number of topics was set to five, the model achieved both relatively low perplexity and high coherence, making it the optimal number of topics after balancing the two indicators.
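For reference, the standard definition of perplexity for a set of $M$ documents is given below, where $\mathbf{w}_d$ denotes the words of document $d$ and $N_d$ its length. The paper does not state the formula it used, and Gensim estimates the quantity from a variational bound rather than the exact likelihood, so this is background and not a description of the authors' exact computation.

```latex
\[
\mathrm{perplexity}(D)
  = \exp\!\left( - \frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d} \right)
\]
```

A lower value means the model assigns a higher average per-word likelihood to the documents, which is why lower perplexity is read as better fit.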
Combining the characteristics of the research data with the LDA model application specifications [49,50], this study uses a self-written Python program to set the following core parameters to ensure effective model training and reliable results. First, the text is preprocessed with the following parameters. Text splitting size (chunk_size): 100 words per segment, splitting long texts into short segments suitable for model training. Vocabulary filtering: retain words with a length ≥ 3 and filter out meaningless short words. Document filtering threshold: retain only documents with a cleaned vocabulary count ≥ 5 and remove invalid short texts. High-frequency word selection: based on the word frequency table generated by ROST CM 6.0, only high-frequency words are retained for topic modeling. Stop word filtering: load the custom stop word list (Stopwords.txt) to filter out function words and redundant words.
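A minimal sketch of these preprocessing rules follows. It is an illustration under stated assumptions, not the authors' code: the function name `preprocess`, the regex tokenizer, the order in which the filters are applied, and the optional `high_freq_words` set (standing in for the ROST CM 6.0 word frequency table) are all hypothetical.

```python
import re


def preprocess(texts, stopwords, high_freq_words=None,
               chunk_size=100, min_word_len=3, min_doc_tokens=5):
    """Return a list of token lists ready for corpus construction."""
    documents = []
    for text in texts:
        # Assumed tokenizer: lowercase alphabetic runs only.
        words = re.findall(r"[a-z]+", text.lower())
        # Split long texts into segments of at most chunk_size words.
        for start in range(0, len(words), chunk_size):
            chunk = words[start:start + chunk_size]
            tokens = [
                w for w in chunk
                if len(w) >= min_word_len          # drop short words
                and w not in stopwords             # custom stop word list
                and (high_freq_words is None or w in high_freq_words)
            ]
            # Keep only segments with enough cleaned tokens.
            if len(tokens) >= min_doc_tokens:
                documents.append(tokens)
    return documents
```

The stop word set would typically be read from Stopwords.txt, one word per line, before calling the function.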
Second, the model is trained with the following parameters:
  • Number of topics (num_topics): 5; determined manually after a comprehensive analysis of coherence scores and perplexity scores for two to eight topics.
  • Random seed (random_state): 100; ensures repeatable training results.
  • Training passes (passes): 10; improves the model’s traversal learning of the corpus.
  • Number of iterations (iterations): 50 per document; enhances the model’s accuracy in topic assignment for individual documents.
  • The alpha parameter (alpha): auto; automatically optimizes the prior probability of the topic distribution.
  • Topic output setting (per_word_topics): True; outputs the probability distribution of each word in each topic.
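The training parameters listed above map directly onto keyword arguments of Gensim's `LdaModel`. The sketch below collects them in a dictionary and shows one plausible way to pass them on; the helper name `train_lda` is hypothetical, and the corpus construction shown is an assumption, since the authors' program is not reproduced in the text.

```python
# Keyword arguments mirroring the reported training parameters.
LDA_PARAMS = {
    "num_topics": 5,          # chosen from the 2-8 topic comparison
    "random_state": 100,      # fixed seed for repeatability
    "passes": 10,             # full passes over the corpus
    "iterations": 50,         # per-document inference iterations
    "alpha": "auto",          # learn the document-topic prior from data
    "per_word_topics": True,  # also track likely topics per word
}


def train_lda(tokenized_docs, params=LDA_PARAMS):
    """Train an LDA model on token lists (requires gensim to be installed)."""
    # Imported lazily so the parameter dictionary is usable without gensim.
    from gensim.corpora import Dictionary
    from gensim.models import LdaModel

    dictionary = Dictionary(tokenized_docs)
    corpus = [dictionary.doc2bow(doc) for doc in tokenized_docs]
    model = LdaModel(corpus=corpus, id2word=dictionary, **params)
    return model, corpus, dictionary
```

A c_v coherence score for the trained model could then be obtained with Gensim's `CoherenceModel`, passing the model, the tokenized texts, the dictionary, and `coherence="c_v"`.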
Third, the coherence index was adopted to evaluate the model, and the c_v index was used to quantify topic coherence: the closer the score is to one, the better the topic coherence and the higher the discriminability [51].
Through the above parameter settings and model training, two core objectives are achieved. First, potential demand topics are extracted from the 788 valid UGC texts, and the core keywords and semantic features of each topic are clarified [52,53]. Second, the probability distribution of each document over the topics is output to quantify the strength of association between user feedback and the different demand topics [54], providing data support for revealing the correlation between topics and sentiment in the subsequent VADER sentiment analysis.

2.3.4. VADER Sentiment Analysis

VADER (Valence Aware Dictionary and sEntiment Reasoner) is a hybrid sentiment analysis tool based on dictionaries and rules. It is designed for informal texts, such as short social media posts and user comments. It can accurately capture the sentiment polarity and intensity of text and is especially suitable for features such as subjective evaluation, sentiment expression, and interjections contained in tourism UGC [55]. This study uses the SentimentIntensityAnalyzer module in the NLTK library of Python to implement VADER sentiment analysis. This tool has been proven to have high adaptability and reliability in tourism satisfaction research [19] and can effectively correlate user sentiment tendencies with venue service influencing factors.
This study uses a Python program to perform sentiment analysis, with the core parameters set as follows. Sentiment classification thresholds: a compound sentiment score ≥ 0.05 is classified as “Positive”, a score ≤ −0.05 as “Negative”, and values in between as “Neutral”. Text preprocessing: nouns and meaningless short words ≤ 2 characters in length are filtered out. Stop word processing: the custom stop word list (Stopwords.txt) is loaded to filter function words and redundant vocabulary, and domain interference words, such as words that frequently occur in both positive and negative texts, are automatically identified at the 30th percentile so that they do not interfere with the analysis. Trend analysis: the average sentiment score is calculated at monthly granularity (resample(M)) to generate a time series trend. Keyword extraction: the top 15 high-frequency feature words are extracted from the positive and negative sentiment texts for semantic and sentiment association analysis. Finally, the visualization results are output. The classification rule is implemented as follows:
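# Requires the VADER lexicon: nltk.download("vader_lexicon")
from nltk.sentiment.vader import SentimentIntensityAnalyzer

def perform_semantic_analysis(text):
    # Score the text with VADER and classify it by the compound score
    sid = SentimentIntensityAnalyzer()
    sentiment_score = sid.polarity_scores(text)
    if sentiment_score["compound"] >= 0.05:
        return "Positive"
    elif sentiment_score["compound"] <= -0.05:
        return "Negative"
    else:
        return "Neutral"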
Subsequently, data preprocessing was performed, including cleaning, denoising, and part-of-speech tagging of the translated English UGC text, removing null values and invalid text. Sentiment scoring was conducted, using sid = SentimentIntensityAnalyzer() to calculate the composite sentiment score and positive/negative/neutral scores for each text. Sentiment classification was performed, categorizing sentiment types based on preset thresholds and calculating the proportion of each type. Automatic detection of domain interference words was performed, filtering out generic domain terms with no distinguishing significance through intersection analysis of the positive and negative sentiment texts. Feature association and visualization were then performed, extracting core feature words from the positive and negative sentiment texts and creating sentiment distribution pie charts, time trend charts, and semantic and sentiment keyword comparison charts to support subsequent correlation analysis between sentiment and influencing factors.
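The interference-word detection, keyword extraction, and monthly trend steps can be sketched with the standard library alone. The text does not define precisely how the 30th percentile is applied, so the rule below is only one possible reading: a word counts as an interference word if it occurs in both sentiment classes and its frequency in each class is at or above that class's 30th-percentile frequency. All function names are hypothetical, and `monthly_mean` is a plain-Python stand-in for the monthly resampling done with Pandas.

```python
import math
from collections import Counter, defaultdict


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def interference_words(pos_tokens, neg_tokens, pct=30):
    """Words shared by both classes and frequent in each (assumed rule)."""
    pos, neg = Counter(pos_tokens), Counter(neg_tokens)
    shared = set(pos) & set(neg)
    if not shared:
        return set()
    pos_cut = percentile(list(pos.values()), pct)
    neg_cut = percentile(list(neg.values()), pct)
    return {w for w in shared if pos[w] >= pos_cut and neg[w] >= neg_cut}


def top_unique_keywords(tokens, exclude, k=15):
    """Top-k most frequent tokens after removing excluded words."""
    counts = Counter(t for t in tokens if t not in exclude)
    return [w for w, _ in counts.most_common(k)]


def monthly_mean(scored):
    """scored: iterable of ("YYYY-MM-DD", compound score) pairs."""
    buckets = defaultdict(list)
    for day, score in scored:
        buckets[day[:7]].append(score)  # group by "YYYY-MM"
    return {month: sum(v) / len(v) for month, v in sorted(buckets.items())}
```

Applying `top_unique_keywords` separately to the positive and negative token lists, with the detected interference words excluded, yields the two keyword lists that feed the comparison charts.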

3. Results

3.1. Identification of Core Focus Dimensions

From the overall perspective of the word frequency table and word cloud, high-frequency words cover not only the core elements of the venue but also multiple aspects, such as the visitor experience, surrounding environment, consumer services, and emotional evaluation, forming a multi-dimensional and three-dimensional attention network. Among them, parent–child-related words, core facilities of the venue, landscape environment, and emotional feedback occupy the dominant position. This reflects that users’ attention to the Macau Science Center focuses not only on the services and content of the venue itself but also extends to the comprehensive experience and external environment during the visit (Table 3, Figure 4).
The core findings show that “child” ranks first with 754 occurrences, followed by “hall”, “museum”, “exhibition”, etc. Combined with the word cloud (Figure 4), it is clear that users’ core concerns are concentrated in five dimensions: parent–child and family visit needs, exhibition and interactive experience, ticketing and consumption services, surrounding environment and landscape, and emotional evaluation and recommendation intention. The distribution characteristics of high-frequency words confirm the Macau Science Center’s composite positioning of “science popularization + interaction + landscape”. The absolute advantage of parent–child-related words indicates that family groups are the core audience. The high frequency of words such as “family”, “adult”, “play”, and “fun” further illustrates that users attach great importance to children’s sense of participation and entertainment, regarding the venue as an important place for parent–child interaction and science enlightenment; words such as “ticket”, “free”, “McDonald”, and “hotel” reflect the importance of consumption service dimensions such as ticket cost-effectiveness and surrounding catering and accommodation; words related to geographical location such as “sea”, “view”, and “firework” highlight the value of the surrounding environment and landscape as a unique experience dimension; emotional and recommendation words such as “great”, “good”, and “recommend” directly reflect users’ positive satisfaction tendencies. The identification of these core dimensions lays the foundation for subsequent analysis of keyword association patterns, mining of potential demand themes, and exploration of emotional correlation mechanisms, and points out the core direction for the venue to accurately optimize services.

3.2. Keyword Association Network Features

The semantic network presents a “core-extension” hierarchical characteristic (Figure 5). The core nodes always revolve around “child”, “museum”, and “exhibition”, gradually expanding from early basic service associations (Figure 5b) to scenario-based experiences (Figure 5d), full tourism process integration (Figure 5e), and a comprehensive experience network (Figure 5c). Node size is positively correlated with word frequency, color depth reflects centrality, and line width and color correspond to keyword co-occurrence weight, intuitively presenting the association strength and network structure of core concepts.
The evolutionary trajectory of the network reflects the transformation of users’ perception of the venue from a “single service provider” to a “comprehensive travel scenario”, suggesting that the venue needs to strengthen the linkage between core services and extended experiences. In the early stage, the core nodes of the network were concentrated on “child”, “museum”, “exhibition”, and “hall” and were mainly associated with infrastructure words such as “entrance” and “floor”, with low co-occurrence frequency and loose node connections; users focused on basic venue space and initial parent–child needs. In the mid-term, the core nodes remained stable, and new nodes such as “time”, “activities”, and “fireworks” were added; the association between “child” and “time”, “activities”, and “exhibition” strengthened, the experience dimension expanded to scenario-based experiences, and the network density increased significantly. Subsequently, the core cluster of the network became denser, with new tourism-related nodes such as “trip” and “photos” and supporting nodes such as “hotel”, “restaurant”, and “bus”; the parent–child experience was deeply integrated with venue services, travel scenarios, and living facilities. In the recent stage, the core nodes still revolve around the core elements; new landscape nodes such as “statue” and “night” and transportation nodes such as “transportation” appeared, the association between “view” and “experience” was strengthened, and user needs shifted to an integrated scenario of “venue services + landscape viewing + travel experience”. Overall, the keyword association network has evolved from an early loose basic service network to a multi-dimensional, high-density comprehensive experience network. 
The core associations have always revolved around family needs and core venue facilities, the extended dimensions have continued to enrich, and the node association strength and co-occurrence frequency have been continuously improved, reflecting the continuous deepening of users’ perception of the venue.

3.3. LDA Topic Modeling Results: Potential Needs Topic Decomposition

Through latent Dirichlet allocation (LDA) topic modeling, five potential demand topics were successfully extracted from 788 valid user-generated content (UGC) texts. The topic coherence score of the model was 0.3729; the closer the score is to one, the better the topic coherence, indicating that each topic has clear boundaries and focused semantics, and can effectively reflect different dimensions of users’ core needs (Table 4).
The five potential topics cover different dimensions of user needs.
Topic 1 focuses on comprehensive tour experience and scene integration. It is centered on the keyword “museum” and associated with “sea”, “art”, “coffee”, “city”, and other words, reflecting users’ composite demand of “museum visit + urban leisure”, which is highly consistent with the Macau Science Center’s positioning of “popular science + landscape”. Words like “first” and “year” imply the appeal of the venue to users with different visit frequencies.
Topic 2 centers on parent–child interaction and characteristic scene experience, with “children” and “fireworks” as the core keywords (each with a weight of 0.013), combined with “exhibition”, “interactive”, “october”, and other words. It focuses on the core needs of parent–child groups and their impression of characteristic scenes such as the October fireworks, confirming that family parent–child groups are one of the core target audiences and that “interactivity” and “scene interestingness” are key factors affecting their satisfaction.
Topic 3 highlights core venue facilities and ticketing services, with “hall” having the highest weight (0.028), followed by “children”, “exhibition”, “tickets”, and other keywords, reflecting users’ attention to core information such as exhibition hall layout, exhibition content, and ticketing policies. Words like “planetarium” and “fun” indicate that the experience of characteristic functional areas and the interestingness of the venue are important indicators for measuring core service quality, while the high frequency of ticketing-related keywords reflects that ticket cost–performance is a key consideration in users’ decision-making.
Topic 4 is dominated by keywords such as “great”, “worth”, and “experience”, reflecting users’ direct emotional feedback and evaluation of tour value. Expressions like “great” and “worth visiting” convey positive recognition, and words like “free” and “fireworks” indicate that free resources and characteristic scenes are important bonus items for enhancing the sense of tour value, providing a clear thematic anchor for subsequent emotional correlation analysis.
Topic 5 focuses on transportation and surrounding landscapes, with “bus” as the core keyword associated with “zhuhai”, “sea”, “view”, “walk”, and other words, reflecting users’ attention to the convenience of bus travel, the geographical location adjacent to Zhuhai, and coastal scenery. The word “walk” implies the walking accessibility around the venue and the demand for casual wandering, showing the impact of the extended experience of “transportation + landscape” on overall satisfaction.
In summary, the five potential topics, covering core services to extended experiences, and from functional needs to emotional evaluations, construct a complete system of user needs. Among them, parent–child-related needs run through multiple topics, with core facilities and ticketing services being the basic guarantee, and the surrounding environment and emotional value being important bonus items. This lays a structured foundation for subsequently accurately correlating emotional tendencies with influencing factors and proposing targeted service optimization strategies.

3.4. VADER Sentiment Analysis Results: Correlation Patterns Between Sentiment and Influencing Factors

VADER sentiment analysis based on 788 valid UGC texts systematically revealed the correlation characteristics between user sentiment and various influencing factors of the Macau Science Center, which are mainly characterized by a dominant positive sentiment, steady upward sentiment trend, and strong correlation between positive and negative sentiments and specific service dimensions, providing quantitative support for accurately grasping the satisfaction-driven mechanism (Figure 6, Figure 7 and Figure 8).
Positive user sentiment accounts for 82.7% (Figure 8), and positive sentiment is strongly correlated with keywords such as “hall”, “exhibition”, and “sea” (Figure 6); the sentiment score shows a fluctuating upward trend from 2023 to 2025 (Figure 7). The sentiment trend echoes the service optimization process, and negative sentiment is concentrated on local details, indicating that core services have been recognized; detail refinement is the key breakthrough for improving satisfaction. Specifically, high-frequency unique words in positive reviews are mainly concentrated on “hall”, “great”, “free”, “fun”, “floor”, etc., corresponding to key influencing factors such as exhibition space, service quality, ticket discounts, interactive experience, venue facilities, and surrounding supporting facilities. The high-frequency co-occurrence of emotional words such as “great”, “good”, and “perfect” with service-related words such as “hall”, “exhibition”, and “free” indicates that users’ positive feedback directly comes from the recognition of the venue’s core functions and additional value; the frequency of high-frequency unique words in negative reviews is significantly lower than that in positive reviews, mainly including “attitude”, “driver”, “eating”, “refund”, etc., involving service attitude, transportation, catering experience, after-sales guarantee, and other dimensions; negative sentiment is concentrated on specific service details rather than core functions, reflecting that the venue still has room for optimization in supporting details and personalized services beyond basic services. In terms of time trend, the sentiment score was generally low in 2023 (0.42–0.55), fluctuated and rose in 2024 (from 0.58 in February to a peak of 0.80 in September), and remained stable at a high level in 2025 (0.43–0.75), reaching 0.74 in November. 
This trend echoes the evolution of the keyword association network, confirming the positive guiding effect of the venue’s measures in core service improvement, supporting facility improvement and environmental optimization on user emotional satisfaction. In the overall sentiment distribution, neutral sentiment accounts for 9.9%, mainly derived from users’ objective descriptions of the visit process without obvious emotional tendency, further indicating that there are no universal problems in the core service dimensions of the venue, and the overall service quality is at a high level.
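As a consistency check on the reported shares, and assuming the three classes are exhaustive, the negative share and the approximate number of texts per class follow from the figures above. The counts are rounded estimates derived from the percentages, not values reported in the study.

```latex
\[
p_{\text{neg}} = 100\% - 82.7\% - 9.9\% = 7.4\%
\]
\[
n_{\text{pos}} \approx 0.827 \times 788 \approx 652, \quad
n_{\text{neu}} \approx 0.099 \times 788 \approx 78, \quad
n_{\text{neg}} \approx 0.074 \times 788 \approx 58
\]
```

The three rounded counts sum to 788, which matches the number of valid UGC texts.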

4. Discussion

4.1. Comparison with Existing Research

Existing tourism satisfaction studies mostly focus on historical heritage, comprehensive scenic spots, or smart tourism technology applications. For venues like science museums that combine science popularization and entertainment, the specific factors influencing satisfaction have not been fully explored with a multi-method framework. This study takes the Macau Science Center as a specific research object and integrates a multi-technology analysis framework based on UGC data, effectively filling the gap in satisfaction research on science popularization venues. Compared with Pai et al.’s research on the experience of smart tourism technology in Macau, this study further refines the core experience dimensions of venue tourism, confirming that the synergistic effect of parent–child needs, interactive experiences, and environmental landscape has a more significant impact on the satisfaction of science popularization venues, and that user attention has extended from single technology accessibility to comprehensive scene experience [3]. Leong et al. emphasized the indirect driving effect of historical narrative and tour guide interaction on the satisfaction of cultural heritage visitors, while this study found that science museum visitors focus more on exhibition interactivity and parent–child suitability. This difference highlights the functional characteristics of science popularization venues with experiential learning and family participation as the core [2].
At the research methodology level, existing UGC-driven tourism research mostly uses one or two of the following techniques (word frequency analysis, LDA topic modeling, or sentiment analysis), lacking a systematic framework for multi-method collaboration. This study integrates four major techniques (word frequency analysis, semantic network analysis, LDA topic modeling, and VADER sentiment computing), and constructs a complete analysis chain of “surface attention identification, association structure analysis, potential theme mining, and sentiment association verification”. Compared with Jia et al.’s research using ROST CM6.0 to analyze UGC in the Historic Center of Macau, this study reveals the dynamic evolution characteristics of keyword association through semantic network analysis and achieves accurate quantitative association between demand themes and sentiment tendencies through the deep integration of LDA and VADER [19]. Li et al. combined LDA and sentiment analysis to identify that foreign tourists’ dissatisfaction with Chinese attractions is concentrated in service management and infrastructure. This study further found that the negative sentiment of science popularization venues mainly stems from local service details rather than core functions, and the correlation strength between positive sentiment and environmental landscape and interactive experience is significantly higher than that of traditional scenic spots [21]. The Macau Science Center’s unique location and landscape advantages, along with its high-quality core service supply, significantly contribute to this difference.
In terms of demand dimension mining, Fan et al. pointed out that museum visitor participation is affected by generative demand (care and guidance for the next generation). This study verified this conclusion and further refined it, confirming that parent–child demand is the core carrier of generative demand in science popularization venues. High-frequency words such as “play,” “fun,” and “suitable” fully demonstrate the direct driving effect of children’s sense of participation and entertainment on the satisfaction of family visitors [12]. Luo et al. emphasized that the completeness of facilities and the professionalism of services are the key to improving venue satisfaction. This study, through word frequency analysis and LDA topic modeling, clarified that the cost-effectiveness of ticketing and the special functional areas, such as the planetarium and the surrounding catering and accommodation facilities, are key component dimensions of the core services of science popularization venues and supplemented the specific connotation of the core service system of science popularization venues [13].
Compared with existing UGC research in the tourism field, this study focuses more on technology integration and knowledge transformation of social data analysis. Through multi-method collaboration, it achieves in-depth value mining of UGC data instead of mere demand description. Meanwhile, it directly transforms the analysis results into actionable service optimization strategies, clarifies the application path of social data in the decision-making process, and fills the research gap in data-decision transformation.

4.2. Interpretation of the Influencing Mechanism of Factors

The mechanism influencing visitor satisfaction at the Macau Science Center exhibits a multi-dimensional collaborative characteristic of “core driving force, basic guarantee, key value-added, and optimization extension”. The various factors interconnect and progressively enhance each other, forming a complete driving system for user satisfaction.
Parent–child needs are the core driving factor, and their influence runs through the entire visit experience cycle. In the word frequency analysis, “child” ranked first with an absolute high frequency of 754 times. Related words such as “family” and “adult” appeared frequently. Combined with the core node status of “child” in the semantic network, it is confirmed that the family group is the core audience of the Macau Science Center. This mechanism is deeply bound to the science museum’s function of popular science education and entertainment [12]. The strong correlation between words such as “play,” “fun”, “suitable”, and “child” further illustrates that children’s sense of participation, fun, and suitability are direct influencing factors of parent–child family satisfaction. Words such as “first” and “year” reflect the phenomenon of first visits and revisits, indicating that a high-quality parent–child experience can create a continuous attraction and serve as a key link in transforming satisfaction into the willingness to revisit.
Core services and supporting facilities are the basic guarantee of satisfaction. In LDA Topic 3 (core venue facilities and ticketing services), “hall” has the highest weight (0.028). The high frequency of keywords such as “ticket”, “free”, and “planetarium” confirms that the cost-effectiveness of tickets, the operation of special functional areas, and the surrounding supporting facilities are the core demands of users [13]. Semantic network analysis shows that there are strong links between core service nodes and other dimensions. This means that the completeness of core services has a direct effect on how users perceive the quality of interactive experiences and environmental landscapes. High-quality basic services positively enhance other experience dimensions, while the absence of core services weakens the positive effects of value-added factors such as landscapes and interactions. VADER sentiment analysis further validates this, showing that core service-related words such as “hall”, “exhibition”, and “free” have the strongest association with positive sentiments, while negative sentiments are concentrated only on localized details such as service attitude and food quality, confirming that the stability of core services is a key foundation for maintaining high overall satisfaction.
The surrounding environment and landscape are important value-added factors for satisfaction. The location advantage of the Macau Science Center facing the sea to the south makes its landscape experience transcend the simple activities inside the venue. High-frequency words such as “sea”, “view”, and “firework” are reflected in a separate topic, Topic 5 (transportation and surrounding landscapes), confirming that the environment and landscape have become a core component of the visitor experience [19]. The gradual strengthening of the association between landscape-related words and travel-related words such as “trip” and “travel” in the semantic network indicates that users have come to regard the Macau Science Center as an important node in the “urban travel scene.” The perceived linkage between the venue and the surrounding geographical environment and characteristic scenes, such as Zhuhai and the fireworks landscape, significantly enhances the comprehensive value of the visit. The high positive emotional association of landscape-related words in VADER sentiment analysis further illustrates that the beautiful natural and urban landscape is a unique value-added advantage of science popularization venues that distinguishes them from traditional museums and can effectively enhance the overall experience perception of users.
Interactive experience and tour experience are optimization extension factors for satisfaction. The presence of keywords such as “interactive”, “planetarium”, and “show” in LDA Topic 2, i.e., Parent–Child Interaction and Characteristic Scene Experience, combined with the trend of strengthening association between interactive words and core nodes in the semantic network, confirms that the fun, participation, and professionalism of the exhibition are key optimization points to improve satisfaction [21]. The high frequency of time-related words such as “time”, “hour”, and “minute” reflects the indirect impact of the richness of exhibition content and the arrangement of the tour rhythm on user experience—a reasonable exhibition layout and rhythm design can improve tour efficiency and enhance the immersiveness of the interactive experience. The high frequency of co-occurrence of positive emotion words such as “great”, “good”, and “perfect” with interactive words confirms that a high-quality interactive experience can further amplify the positive effects of core services and environmental landscape, forming a superimposed improvement in satisfaction.

4.3. Optimization Strategies for Venue Services

Based on the above analysis of the influencing mechanisms, and combined with the dimensions of core needs and emotional association patterns identified in the research results, the following targeted strategies are proposed for optimizing the service content and improving visitor satisfaction at the Macau Science Center:
Upgrade the family-friendly service system. In response to the high-frequency core needs of children, add age-appropriate interactive zones and optimize the family-friendly suitability of the exhibition content. Specifically, increase sensory experience exhibits for younger children and strengthen interactive science experiment projects for school-aged children. Improve family facilities by adding dedicated facilities such as family restrooms, children’s rest areas, and baby care rooms. Set up child safety devices and fun guide signs in the exhibition hall. Optimize ticketing and activity design by launching preferential policies such as family packages and family annual cards. Combined with the high-frequency attention of free activities, launch free special activities such as family science workshops and children’s science dramas to enhance the willingness of family visitors to revisit [12].
Improve the quality of exhibitions and interactive experiences. Focus on core keywords such as “interactive” and “planetarium”. Upgrade the technological content and fun of existing interactive exhibitions, and regularly update exhibition themes and exhibit combinations to avoid user fatigue. Rationally plan the layout of exhibition halls and visitor routes, set up clear guide signs and suggested visit durations, manage visitor flow at popular exhibits, and improve visit efficiency. Strengthen the operation of special functional areas such as the planetarium by increasing the number of screenings, launching themed astronomy popularization activities, and creating core experience IPs [21]. In response to the high frequency of time-related terms such as “time” and “hour”, design differentiated visit schemes, such as “basic tour routes” and “in-depth experience routes”, to meet users’ varying time budgets.
Improve ticketing and supporting facilities. Based on the core consumer focus on “tickets” and “free”, optimize the ticket pricing strategy and launch diversified ticketing products such as student tickets, group tickets, and off-peak season discounts. Set up free open days or free exhibition areas to improve the cost-effectiveness of tickets [13]. Improve the surrounding supporting services and strengthen cooperation with nearby catering and accommodation providers. Launch linked products such as “venue visit and catering discounts” and “venue visit and accommodation packages” to improve the experience associated with supporting words such as “McDonald’s” and “hotel”. Optimize transportation by adding shuttle buses between the venue and major transportation hubs and popular attractions, and improve parking lot management to reduce negative emotional feedback related to words such as “driver” and “railway” [55].
Improve the quality of detailed services. Strengthen the training of service personnel to enhance the professionalism and friendliness of popular science explanations and consultation guidance, focusing on service attitude and catering experience, which are the focus of negative emotions. Optimize the catering supply in and around the venue, introduce more diversified catering options, improve the quality and cost performance of catering, and meet the dietary needs of different users [55]. Improve the after-sales guarantee mechanism, simplify the ticket refund process, establish a rapid response channel for user feedback, and reduce negative feedback related to refunds. Strengthen the construction of the signage system and barrier-free facilities in the venue to improve the convenience of visiting for special groups, such as elderly and disabled tourists.
General optimization suggestions for science museum-type venues: (1) Construction of a family-friendly system: Regardless of differences in venue positioning, efforts should be made to strengthen the design of age-stratified interactive exhibition areas and family supporting facilities, such as children’s rest areas and parent–child guided tour services. Launch family packages and popular science-themed parent–child activities to meet the core needs of family visitors. (2) Upgrade of exhibition interactivity: Regularly update interactive exhibits and thematic exhibitions, optimize the circulation design of exhibition areas and guide signs, and create core experience IPs based on venue characteristics (e.g., planetariums, science laboratories) to enhance the fun and professionalism of exhibitions. (3) Optimization of ticketing and supporting services: Establish a diversified ticketing system, including student tickets, group tickets, and off-peak season discounts. Set up free open days or free exhibition areas, and strengthen linkage cooperation with surrounding catering, accommodation, and transportation facilities to improve tourists’ comprehensive experience. (4) Improvement in detailed services: Enhance the popular science explanation and consultation service capabilities of staff, optimize catering supply and after-sales guarantee mechanisms, improve barrier-free facilities and signage systems, and reduce negative feedback caused by local service details.

4.4. Limitations

Although this study produced a series of findings through a multi-technology fusion framework, it has the following limitations. First, the data sources are limited to five major platforms: Google Maps reviews, TripAdvisor, Sina Weibo, Xiaohongshu (Rednote), and Ctrip. Feedback from other channels, such as niche social platforms, offline guestbooks, and travel forums, may have been missed, so data coverage is incomplete [19]. Second, the time range is limited to January 2023 through November 2025. This window reflects recent user demand but lacks longer-term longitudinal data, making it difficult to capture the long-term evolution of user demand or the impact of major events such as large-scale exhibitions and policy adjustments. Third, the analysis methods leave room for optimization. The LDA model sets the number of topics to five; although the topic coherence test supported this choice, additional latent topics may remain unexplored. VADER sentiment analysis relies mainly on the surface sentiment features of the text and does not fully capture the complex emotional tendencies implied by users [19]. Fourth, this study did not collect demographic characteristics such as user age, gender, region, and educational background, making it impossible to analyze differences in needs and preferences among groups or to explore the moderating effect of these characteristics on the satisfaction mechanism, which limits the precision of the service optimization suggestions. Fifth, the study did not analyze the correlation between satisfaction and revisit intention or word-of-mouth dissemination, so the transmission path of the influencing factors is not fully presented. Sixth, this study uniformly translated Chinese UGC texts into English via Google Translate for analysis. Machine translation may introduce semantic deviations, with limited accuracy for emotional expressions and dialectal vocabulary in particular, which may affect the accuracy of sentiment analysis and topic modeling. Seventh, the research object is only the Macau Science Center. Its unique geographical location, functional positioning, and visitor structure may restrict the generalizability of the results, making them difficult to extend directly to science popularization venues of other types and in other regions. Future work needs to expand the scope of research objects to improve the generality of the conclusions. Eighth, the credibility screening adopted in this study has limitations, as it did not integrate advanced methods from the field of fake information detection. For instance, it did not adopt the LSTM-based topic modeling identification model proposed by Sarraf et al. [56], nor did it reference the multi-feature detection method combining LDA topic modeling with VADER sentiment analysis developed by Thakur et al. [57]. In future studies, natural language processing (NLP) technology, account credit scoring models, and multi-source cross-validation mechanisms can be introduced to further enhance the rigor and accuracy of data credibility assessment.

5. Conclusions

5.1. Key Findings

This study, using the Macau Science Center as its research subject, based on 788 valid user-generated content (UGC) data points from January 2023 to November 2025, integrated four major technologies: word frequency analysis, semantic network analysis, LDA topic modeling, and VADER sentiment computing. This constructed a systematic research framework to accurately identify the influencing factors and correlation patterns of visitor satisfaction at the Macau Science Center (Table 5). The core findings are as follows:
First, users’ core concerns regarding the Macau Science Center exhibit multi-layered and comprehensive characteristics, covering five categories: parent–child and family visit needs, exhibitions and interactive experiences, ticketing and consumption services, surrounding environment and landscape, and emotional evaluation and recommendation intentions. Among these, parent–child needs are the absolute core focus, with “child” ranking first in the word frequency list with 754 occurrences. Related words such as “family” and “adult” also frequently co-occurred, confirming that families are the core audience of the venue.
Second, the keyword association network exhibits a clear dynamic evolution trajectory: from an early, loose network focusing on basic venue services and initial family needs, it gradually developed into a denser network, expanding to scenario-based experiences and integrating travel and supporting facilities in the mid-term, ultimately forming a comprehensive association network focusing on venue services, landscape viewing, and travel experiences in the recent stage. The core associations consistently revolve around family needs and core venue facilities, while continuously incorporating extended dimensions such as landscape, supporting facilities, and travel, reflecting the progression of users’ perception of the venue from a single service carrier to a comprehensive travel scenario.
Third, LDA thematic modeling successfully extracted five potential demand themes: comprehensive visiting experience and scenario integration, parent–child interaction and characteristic scenario experience, core venue facilities and ticketing services, visiting value and emotional evaluation, and transportation and surrounding landscapes. The thematic distribution reflects a hierarchical and diversified user demand system: core venue facilities and ticketing services serve as the fundamental guarantee, with high-weight keywords such as hall and tickets indicating that exhibition layout, content quality, and ticket cost–performance ratio are key user decision-making factors. Collectively, these five themes fully cover users’ explicit and implicit needs for the Macau Science Center, constructing a complete demand system from core services to extended experiences and functional needs to emotional evaluation, laying a structured foundation for in-depth emotional tendency analysis and targeted service optimization strategy formulation.
Fourth, VADER sentiment analysis shows that user sentiment is predominantly positive, accounting for 82.7%; neutral sentiment accounts for 9.9%, and negative sentiment for only 7.4%. Positive sentiment is strongly correlated with core services, environment, and interactive experiences; high-frequency words such as “hall”, “exhibition”, “free”, “sea” and “view” are key triggers for positive sentiment. Negative sentiment is concentrated in localized details such as service attitude, dining, and transportation, and its impact is limited. In terms of time trends, user sentiment scores fluctuated from a low level in 2023 to a stable high level in 2025, indicating the positive guiding effect of on-site service optimization on user sentiment.
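As a hedged illustration of how shares like those above can be derived, the sketch below maps VADER compound scores to three sentiment classes and reports each class as a percentage. The cutoffs of ±0.05 are the conventional VADER thresholds and are assumed here, since this section does not restate the thresholds used; the ten scores are hypothetical and the function names are illustrative.

```python
from collections import Counter

# Conventional VADER cutoffs on the compound score (an assumption here;
# this section does not restate the thresholds the study applied).
POS_CUTOFF = 0.05
NEG_CUTOFF = -0.05


def label(compound):
    """Map a compound score in [-1, 1] to a sentiment class."""
    if compound >= POS_CUTOFF:
        return "positive"
    if compound <= NEG_CUTOFF:
        return "negative"
    return "neutral"


def sentiment_shares(compounds):
    """Return the percentage of reviews in each class, rounded to 0.1."""
    if not compounds:
        return {"positive": 0.0, "neutral": 0.0, "negative": 0.0}
    counts = Counter(label(c) for c in compounds)
    total = len(compounds)
    return {
        name: round(100 * counts[name] / total, 1)
        for name in ("positive", "neutral", "negative")
    }


# Hypothetical compound scores for ten reviews (not the study's data).
scores = [0.84, 0.62, 0.0, 0.91, -0.45, 0.33, 0.02, 0.77, 0.58, 0.12]
shares = sentiment_shares(scores)
print(shares)
```

In practice the compound scores would come from a VADER implementation applied to each translated review; only the thresholding and aggregation step is sketched here.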
In summary, the core contributions of this study in the field of social data analysis and UGC knowledge mining lie in the following: verifying the effectiveness of the integrated analysis of multi-source UGC data; constructing a multi-dimensional knowledge mining framework applicable to cultural venues; and achieving the precise transformation of mining results into management decisions, thereby providing data-driven solutions for the smart operation of cultural venues in the information age.

5.2. Future Prospects

In view of the limitations of this study, and in light of the findings and industry development needs, the following aspects can be explored in the future. First, expand the data sources by extending the survey scope to niche social platforms, offline comments, travel forums, and other channels, incorporating more dimensions of user feedback and adding on-site survey data so that online UGC and offline surveys can verify each other, thereby improving the generality of the research conclusions [19]. Second, enrich the research perspective by introducing user demographic characteristics and analyzing how needs and satisfaction mechanisms differ among users of different ages, genders, regions, and educational backgrounds, thus providing support for precise services. Third, optimize the analysis methods by adjusting the parameters of the LDA model to explore more finely subdivided latent topics, and by combining deep learning models such as BERT to improve the depth and accuracy of sentiment analysis and capture the complex emotional tendencies implicit in user texts [19]. At the same time, introduce quantitative methods such as structural equation modeling to test the relationships between the influencing factors and satisfaction and revisit intention. Fourth, extend the research period to conduct longer-term longitudinal studies, tracking the long-term evolution of the factors influencing satisfaction and analyzing the impact of external variables such as large-scale exhibitions, policy adjustments, and unforeseen events, to provide a dynamic decision-making basis for the long-term development of venues. Fifth, expand the application scenarios by applying the multi-technology collaborative analysis framework constructed in this study to other science popularization venues, museums, science and technology museums, and similar cultural venues, providing methodological references and practical guidance for common problems in the industry and promoting the standardization and systematization of cultural venue satisfaction research.

Author Contributions

Conceptualization, J.L. (Jingwei Liang), L.Z. and Y.C.; methodology, J.L. (Jingwei Liang); software, J.L. (Jingwei Liang); validation, J.L. (Jingwei Liang), Q.D., C.W. and Y.Z.; formal analysis, J.L. (Jingwei Liang), Q.D. and Y.Z.; investigation, J.L. (Jingwei Liang), Q.D., Y.Z. and J.L. (Jiahai Liang); resources, J.L. (Jiahai Liang); data curation, J.L. (Jingwei Liang) and J.L. (Jiahai Liang); writing—original draft preparation, J.L. (Jingwei Liang), Q.D. and Y.Z.; writing—review and editing, J.L. (Jingwei Liang), Q.D., Y.Z., J.L. (Jiahai Liang), C.W., L.Z. and Y.C.; visualization, J.L. (Jiahai Liang); supervision, Y.C.; project administration, J.L. (Jiahai Liang); funding acquisition, Y.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the (1) Faculty Research Grants funded by Macau University of Science and Technology (FRG-MUST) (grant number: FRG-25-041-FA; FRG-25-067-FA); (2) Guangdong Provincial Department of Education’s key scientific research platforms and projects for general universities in 2023: The Guangdong, Hong Kong, and Macau Cultural Heritage Protection and Innovation Design Team (grant number: 2023WCXTD042); (3) Guangdong Provincial Philosophy and Social Sciences Planning 2025 Lingnan Cultural Project (grant number: GD25LN30). The funders had no role in study conceptualization, data curation, formal analysis, methodology, software, decision to publish, or preparation of the manuscript. There was no additional external funding received for this study.

Institutional Review Board Statement

The study focuses on the collection and analysis of public web data, excluding specific user information and privacy details. Therefore, this study is exempt from Institutional Review Board ethical approval.

Informed Consent Statement

Informed consent was not required, as the study involves scraping large amounts of public web data and excludes specific user information and privacy details.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Stopwords.txt Files

Table A1. Detailed content of the Stopwords.txt File.
bcomIt’ours(themintowhensbutbuydobelowsincewhilewhere
ctheit’theirIwhenevenwheretwasthroughdoesbesidewhyhowvery
dandastheirsatwillwhyuonAfterdidbehindwellnowthen
ea.thesehowday21:00mevyoujusthavebeforeoncetwicealways
fissothoseminealldon’minewwhich14hasafteroftensometimesnever
gtoyoursuchyoursbothwantyoursxnotMacau’hadbecauseeveralreadyyet
hofAsomeheabout2heythisdohavingsincestillalmostnearly
iiforThereanyhimby15himzthathadwillwhilehardlyscarcelybarely
iiiitonlyallhisor!hisingbe*would;:)
jare,eachsheupM.sheedfromyou’shall[]{}
kthereThiseveryhertooI.her-ingwereI’should/\- 
ls0nohersoff?hersesthavetwocan_@#$
minoneusout10:00usermorellcould%^&+
ninItoneourontooffourlymyitsmay=
ocanon-onesacrossacrossoutontonesshasYoumight‘’“”th
pwithWeambetweenbetweenamongabovefultthanmuststdonmopavenida
qalsoeachbeenamongbelowbesidebehindanifherenextmacauMacauthere
rmanyThere’beingabovebeforeafterbecauseTheweIf     
Data source: Compiled by the authors.
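To show how a stopword file of this kind is typically applied before word frequency analysis or topic modeling, here is a minimal Python sketch. The one-word-per-line file layout, the function names, and the small inline stopword subset are assumptions made for illustration; the complete list is the content of Table A1.

```python
import re

# Illustrative subset only; the complete list is the Stopwords.txt content.
STOPWORDS = {"the", "and", "is", "to", "of", "very", "macau"}


def load_stopwords(path):
    """Read one stopword per line (assumed file layout), lowercased."""
    with open(path, encoding="utf-8") as fh:
        return {line.strip().lower() for line in fh if line.strip()}


def content_tokens(text, stopwords):
    """Tokenize, lowercase, and drop any token found in the stopword set."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in stopwords]


sample = "The planetarium is very popular and the view of Macau is great"
tokens = content_tokens(sample, STOPWORDS)
print(tokens)
```

Lowercasing both the stopwords and the tokens means that case variants listed in the file, such as “The” and “the”, collapse to a single entry.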

References

  1. Yu, J.; Zhang, X.; Kim, H.S. Using Online Customer Reviews to Understand Customers’ Experience and Satisfaction with Integrated Resorts. Sustainability 2023, 15, 13049.
  2. Leong, A.M.W.; Yeh, S.S.; Zhou, Y.; Hung, C.W.; Huan, T.C. Exploring the influence of historical storytelling on cultural heritage tourists’ value co-creation using tour guide interaction and authentic place as mediators. Tour. Manag. Perspect. 2024, 50, 101198.
  3. Pai, C.-K.; Liu, Y.; Kang, S.; Dai, A. The Role of Perceived Smart Tourism Technology Experience for Tourist Satisfaction, Happiness and Revisit Intention. Sustainability 2020, 12, 6592.
  4. Chen, Z.; Suntikul, W.; King, B. Constructing an intangible cultural heritage experiencescape: The case of the Feast of the Drunken Dragon (Macau). Tour. Manag. Perspect. 2020, 34, 100659.
  5. Zhang, W.; Lin, L. Study on application of Ieoh Ming Pei’s design concept in landscape architecture design. Fresenius Environ. Bull. 2021, 30, 981–987.
  6. Xue, C.Q. World Architecture in China; Joint Publishing: Hong Kong, China, 2010.
  7. Yin, J.; Feng, J.; Wu, R.; Jia, M. Tourists’ Perception of Macau’s City Image: Based on the Analysis of User-Generated Content (UGC) Text Data. Buildings 2023, 13, 1721.
  8. Wang, P.; Li, C.; Liu, J. Research on Tourist Satisfaction Evaluation of Macau’s Built Heritage Space Under the Genius Loci. Buildings 2025, 15, 1701.
  9. Liu, J.; Zhu, Y.; Liu, J.; Wang, P. Unraveling Tourist Behavioral Intentions in Historic Urban Built Environment: The Mediating Role of Perceived Value via SOR Model in Macau’s Heritage Sites. Buildings 2025, 15, 2316.
  10. Pai, C.K.; Chen, T.; Lee, T.J.; Wu, X.D. Hotel brand signature, brand attitude, subject norm, and perceived behavior control. J. Vacat. Mark. 2025, 31, 904–922.
  11. Leong, A.M.W.; Yeh, S.S.; Chen, H.B.; Lee, C.L.; Huan, T.C. Does gender make a difference in heritage tourism experience? Searching for answers through multi-group analysis. Tour. Manag. Perspect. 2024, 52, 101250.
  12. Fan, Y.; Luo, J.M. Impact of generativity on museum visitors’ engagement, experience, and psychological well-being. Tour. Manag. Perspect. 2022, 42, 100958.
  13. Luo, J.M.; Ye, B.H. Role of generativity on tourists’ experience expectation, motivation and visit intention in museums. J. Hosp. Tour. Manag. 2020, 43, 120–126.
  14. Xi, F.; Tang, Y.K.; Xi, L.; Xu, H.F. Analysis of tourist perception and emotion in ethnic minority village tourist destinations based on network text: A case study of Xijiang Qianhu Miao Village. Resour. Dev. Mark. 2025, 1–12. Available online: https://link.cnki.net/urlid/51.1448.N.20250729.1650.002 (accessed on 30 December 2025). (In Chinese).
  15. Wei, Y.Z.; Li, F.L. Study on the perception of Guangxi’s tourism image based on web text analysis: A case study of Xingping Ancient Town. J. Hezhou Univ. 2024, 40, 116–124. Available online: https://kns.cnki.net/kcms2/article/abstract?v=OfZxIIxxsvDB9LQp9vXX8QeWFvSdWit934Q0qwgEgIjSH9cDP3OjiwxAWxvyy-3eJ_LDNC7_GqtwcQAtHtRmpXiq3F8dgnVPmMuRDTE5461yJW3ZVOUyjJZgZXjN-FKzWaCjY8xosTbD83yTiK78jItBS3B6CwEmkq2tEO-avr82PwqwdbVSp7hoMnbqu3cc&uniplatform=NZKPT&language=CHS (accessed on 30 December 2025). (In Chinese).
  16. Zhang, H.C.; Wang, S.S.; Fan, Q.; Tang, W.Z. A study on audience satisfaction in science and technology museums based on service-dominant logic: A case study of the audience satisfaction survey of the China Science and Technology Museum. Sci. Pop. Res. 2022, 17, 57–64+95+104.
  17. Ding, L. Analysis of tourist reviews in movie theme parks based on the LDA model: A case study of Huayi Brothers Movie World (Suzhou). North. Econ. Trade 2025, 10, 136–141. Available online: https://kns.cnki.net/kcms2/article/abstract?v=OfZxIIxxsvAFjrqY5j6xi5-rP7EuxRcP55D-1NDVsb0WznbmQ_ZBy3e4tnyW5yzPDU9tkEVm7o3dSbpF-KWLDM1W1dwLAr8oqH02Ldoef2UWMnV7biDFhUE2zimrzoDnovs1AaIXy_sKEO-WM885HzIFzbA68tV77oOvNjdtkwKMcOcul1xX85gFRgiVbMG-&uniplatform=NZKPT&language=CHS (accessed on 30 December 2025). (In Chinese).
  18. Tian, S.M.; Ding, Y. Study on the improvement of tourism service quality in famous historical and cultural towns based on the LDA topic model: A case study of Qingyan Ancient Town in Guiyang City. Inn. Mong. Sci. Technol. Econ. 2025, 19, 54–59+94. Available online: https://kns.cnki.net/kcms2/article/abstract?v=OfZxIIxxsvDrXXD5bCvk_Qx90gdlgBELl866Pi6QO4lmlSep-1OocgiNA5nrZQWJ18rbvqeu-TFBdD-N6hKQ4NaMEAayZFf9OB6LFNbKuzYyKzLieNCxNGIBWLbATMFbADOgFIlqT3HN2HLjomMV_tk68XCyryq4sXdOYYXC8bBpFXiAvab0sQ==&uniplatform=NZKPT&language=CHS (accessed on 30 December 2025). (In Chinese).
  19. Saoualih, A.; Safaa, L.; Bouhatous, A.; Bidan, M.; Perkumienė, D.; Aleinikovas, M.; Šilinskas, B.; Perkumas, A. Exploring the Tourist Experience of the Majorelle Garden Using VADER-Based Sentiment Analysis and the Latent Dirichlet Allocation Algorithm: The Case of TripAdvisor Reviews. Sustainability 2024, 16, 6378.
  20. Jia, M.; Feng, J.; Chen, Y.; Zhao, C. Visual Analysis of Social Media Data on Experiences at a World Heritage Tourist Destination: Historic Centre of Macau. Buildings 2024, 14, 2188.
  21. Li, X.; Zhang, Y.; Mei, L. Analyzing online reviews of foreign tourists to destination attractions in China: A novel text mining approach. Asia Pac. J. Tour. Res. 2023, 28, 647–666.
  22. Guo, S.B. Study on the perception of the image of mountain tourism destinations based on the LDA model: A case study of Maling River Canyon Scenic Area in Guizhou Province. Commer. Exhib. Econ. 2025, 21, 115–119.
  23. Dai, Q.C.; Guo, X.J.; Yin, R.Q.; Rong, X. Emotional mining and analysis of online reviews in Xingwen Stone Forest Scenic Area based on the LDA topic model. China Mark. 2025, 21, 106–109.
  24. Li, N.; Xie, Z.Y.; Zhang, G.P.; Hao, Z.C.; Xiang, Z. Topic classification of tourist online reviews based on LDA: A case study of the Forbidden City. J. Inf. Eng. 2017, 3, 55–63. Available online: https://kns.cnki.net/kcms2/article/abstract?v=Ow72tX7v2w325cZxPfs-j0jhaOUqwEaA7UWh-5agyYBIX33AFCqTHT7YLhxqraHf-qACBGZfH5Xb6nTwW30OAbpREt5j0up4ZnjFMBkmHjrjLHILLT1Im47NufsYK5Q3aiyBdUP-xb3zjveL_7QpE1lF6VpXKHLzacneZsFNrI5Yv3JSEXLM4nNPEJCnaH3w&uniplatform=NZKPT&language=CHS (accessed on 30 December 2025). (In Chinese).
  25. Lili, I.; Xhina, E.; Kosta, A.; Ceni, A.; Ulqinaku, M. Comparing VADER and BERT for Short-Text Sentiment Analysis: Challenges and Observed Variations. In Pioneer and Innovative Studies in Computer Sciences and Engineering; Chapter 94; All Sciences Academy: Konya, Turkey, 2024.
  26. Youvan, D.C. Understanding Sentiment Analysis with VADER: A Comprehensive Overview and Application. AI Data Sci. J. 2024.
  27. Macau Science Center. Available online: https://www.Macautourism.gov.mo/en/events/whatson/9920/ (accessed on 12 December 2025).
  28. National Science Popularization and Education Base. Available online: https://www.msc.org.mo/en/article-detail/140/1b27706a08be2dae0142cf7c0226eba9 (accessed on 12 December 2025).
  29. Borrego, Á.; Comalat Navarra, M. What users say about public libraries: An analysis of Google Maps reviews. Online Inf. Rev. 2021, 45, 84–98.
  30. Khan, A.M.; Loan, F.A. Exploring the reviews of Google Maps to assess the user opinions about public libraries. Libr. Manag. 2022, 43, 601–615.
  31. Compagnone, M.R.; Fiorentino, G. TripAdvisor and tourism: The linguistic behaviour of consumers in the tourism industry 2.0. In Strategies of Adaptation in Tourist Communication; Brill: Leiden, The Netherlands, 2018; pp. 270–294.
  32. Liu, Q.; Niu, K.; He, Z.; He, X. Microblog user interest modeling based on feature propagation. In Proceedings of the 2013 Sixth International Symposium on Computational Intelligence and Design, Hangzhou, China, 28–29 October 2013; IEEE: New York, NY, USA, 2013; Volume 1, pp. 383–386.
  33. Weibo Released Its Q4 and Full-Year 2024 Financial Report, with Total Revenue of 12.61 Billion Yuan. Available online: https://finance.sina.com.cn/tech/2025-03-13/doc-ineppeui0597904.shtml?finpagefr=p_108 (accessed on 14 December 2025).
  34. Ning, K.X. Research on Brand Marketing Strategies on the Xiaohongshu Platform. In Proceedings of the 2024 15th International Conference on E-Business, Management and Economics, Beijing, China, 19–21 July 2024; pp. 264–270.
  35. Wan, R.; Tong, L.; Knearem, T.; Li, T.J.J.; Huang, T.H.K.; Wu, Q. Hashtag re-appropriation for audience control on recommendation-driven social media Xiaohongshu (rednote). In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, 26 April–1 May 2025; pp. 1–25.
  36. 2025 “Active Users” Research Report (Xiaohongshu Platform). Available online: https://www.qian-gua.com/information/detail/3149 (accessed on 14 December 2025).
  37. Chen, J. Research on the Ctrip Tourism Market Analysis and Marketing Strategy Optimization. In SHS Web of Conferences; EDP Sciences: Les Ulis, France, 2024; Volume 207, p. 03003.
  38. Wang, Z.; Zhou, Q.; Man, T.; He, L.; He, Y.; Qian, Y. Delineating Landscape Features Perception in Tourism-Based Traditional Villages: A Case Study of Xijiang Thousand Households Miao Village, Guizhou. Sustainability 2024, 16, 5287.
  39. Weng, F.; Li, X.; Xie, Y.; Xu, Z.; Ding, F.; Ding, Z.; Zheng, Y. Study on Multidimensional Perception of National Forest Village Landscape Based on Digital Footprint Support—Anhui Xidi Village as an Example. Forests 2023, 14, 2345.
  40. Chen, H.; Wu, X.; Zhang, Y. Impact of Short Video Marketing on Tourist Destination Perception in the Post-pandemic Era. Sustainability 2023, 15, 10220.
  41. Liu, Y.; Lai, L.; Yuan, J. Research on Zhanjiang’s Leisure Sports Tourism Development Strategy in Coastal Recreational Areas. J. Coast. Res. 2020, 111, 248–252.
  42. Guo, Y.; Barnes, S.J.; Jia, Q. Mining meaning from online ratings and reviews: Tourist satisfaction analysis using latent dirichlet allocation. Tour. Manag. 2017, 59, 467–483.
  43. Bi, J.W.; Liu, Y.; Fan, Z.P.; Zhang, J. Wisdom of crowds: Conducting importance-performance analysis (IPA) through online reviews. Tour. Manag. 2019, 70, 460–478.
  44. Tang, H.; Liu, X.; Li, J.; Wang, H. Study on the conservation and renewal of traditional rural tourism spaces: A perspective based on tourists’ revisit intention. J. Clean. Prod. 2025, 499, 145184.
  45. Cai, S.; Lin, D.; Xiao, H. The collision of tradition and fashion: How anthropomorphizing museum exhibits influences cultural inheritance. Tour. Manag. 2025, 109, 105133.
  46. Wang, J.; Fan, W.; You, J. Evaluation of tourism elements in historical and cultural blocks using machine learning: A case study of Taiping Street in Hunan Province. NPJ Herit. Sci. 2025, 13, 30.
  47. Jiang, B.; Zhang, C.; Cui, Y.; Zhu, J.; Liu, Z. Enhancement of Harbin ice and snow tourism destination competitiveness: A large-scale data study based on sentiment analysis and Latent Dirichlet Allocation. PLoS ONE 2025, 20, e0319435.
  48. Sun, X.; Wang, Z.; Zhou, M.; Wang, T.; Li, H. Segmenting tourists’ motivations via online reviews: An exploration of the service strategies for enhancing tourist satisfaction. Heliyon 2024, 10, e23539.
  49. Wei, Y.; Chen, M. Tracing the Evolution of Tourist Perception of Destination Image: A Multi-Method Analysis of a Cultural Heritage Tourist Site. Sustainability 2025, 17, 5476.
  50. Chen, Q.; Liu, R.; Jiang, Q.; Xu, S. Exploring cross-cultural disparities in tourists’ perceived images: A text mining and sentiment analysis study using LDA and BERT-BILSTM models. Data Technol. Appl. 2024, 58, 669–690.
  51. Fu, L.; Fu, H.; Xiong, C. Evaluating Perceived Cultural Ecosystem Services in Urban Green Spaces Using Big Data and Machine Learning: Insights from Fragrance Hill Park in Beijing, China. Sustainability 2025, 17, 1725.
  52. Zhao, X.; Sharudin, S.A.; Lv, H.L. A novel product shape design method integrating Kansei engineering and whale optimization algorithm. Adv. Eng. Inform. 2024, 62, 102847.
  53. Hao, Y.; Hasna, M.F.; Aziz, F.A. Integrating LDA thematic model, FCE, and QFD methods for consumer-centered visual planning of the creative tourism destination: A macrosystem decision approach. Appl. Soft Comput. 2025, 177, 113299.
  54. Tosun, P.; Cagliyor, S.I.; Gurce, M.Y. Reducing Consumer-Brand Incongruity Through Corporate Social Responsibility and Brand Trust: Exploring Negative Word-of-Mouth (NWOM). Int. J. Consum. Stud. 2024, 48, e13099.
  55. Hussain, M.S.; Rahman, M.F. Customer Sentiment Analysis and Prediction of Insurance Products’ Reviews Using Machine Learning Approaches. FIIB Bus. Rev. 2023, 12, 386–402.
  56. Sarraf, S.; Kushwaha, A.K.; Kar, A.K.; Dwivedi, Y.K.; Giannakis, M. How did online misinformation impact stockouts in the e-commerce supply chain during COVID-19—A mixed methods study. Int. J. Prod. Econ. 2024, 267, 109064.
  57. Thakur, N.; Cui, S.; Knieling, V.; Khanna, K.; Shao, M. Investigation of the Misinformation about COVID-19 on YouTube Using Topic Modeling, Sentiment Analysis, and Language Analysis. Computation 2024, 12, 28.
Figure 1. Location distribution of the Macau Science Center. (a) Geographical location of Macau in China; (b) specific location of the venue within Macau; (c) overview of the venue’s surrounding environment; (d) actual scene of the Macau Science Center. (Image source: drawn and photographed by the author).
Figure 2. Flowchart of data sources and processing procedures. (1) Data sources: five mainstream platforms (Google Maps reviews, TripAdvisor, Sina Weibo, Xiaohongshu, Ctrip). (2) Data processing: field cleaning, date filtering, noise removal, deduplication, and data merging. (3) Data statistics: translation integration and valid data screening (final 788 valid samples). (Image source: drawn by the author).
Figure 3. Coherence and perplexity scores for different numbers of topics. (a) Coherence score plot: the abscissa is the number of topics and the ordinate is the coherence score (c_v metric), which measures the semantic coherence of the topics. (b) Perplexity score plot: the abscissa is the number of topics and the ordinate is the perplexity, which measures the prediction difficulty of the language model.
Figure 4. Word cloud of high-frequency keywords in UGC related to the Macau Science Center. (Note: The font size of keywords is positively correlated with their occurrence frequency; high-frequency core keywords reflect users’ key concerns. Image source: Drawn by the authors).
Figure 5. Semantic network analysis of core keywords. (a) Overall semantic network (node size = word frequency, node color depth = centrality, edge width/color = co-occurrence weight); (b) early-stage semantic network (focus on basic venue services and family needs); (c) recent semantic network (focus on comprehensive travel experience); (d) mid-term 1 semantic network (added scenario-based experience); (e) mid-term 2 semantic network (added travel and supporting facilities). (Image source: Drawn by the authors).
Figure 6. Frequency distribution of top unique keywords in positive and negative sentiment reviews. (Note: Positive review keywords are mainly related to core services and experiences; negative review keywords focus on service details. Image source: Drawn by the authors).
Figure 7. Monthly average sentiment score trend (January 2023–November 2025). (Note: The horizontal axis represents time, the vertical axis represents average sentiment score; the neutral threshold is 0.0. Image source: Drawn by the authors).
Figure 8. Overall sentiment distribution of UGC related to the Macau Science Center. (Note: Positive sentiment accounts for 82.7%, neutral for 9.9%, negative for 7.4%. Image source: Drawn by the authors).
Table 1. UGC original texts and translation matching example.
| Platforms | Comments | Translation |
|---|---|---|
| Google Maps | A great place for children, youngsters. They refurbished new topics such as Data Science, Network Security, AI and so on. By the way, the MacDonald inside the exhibition hall also provides the great sea view. | (The content itself is already in English and does not require translation.) |
| TripAdvisor | 我们今天下午去了这里,以后去澳门一定会回来的。孩子们可以做的事情太多了,他们很喜欢。我们看到了一些画廊,但还没有尝试游乐区或天文馆,这两个看起来惊人。太值钱了! | We went here this afternoon and will definitely come back when we go to Macau in the future. There are so many things kids can do and they love it. We saw some galleries but have not tried the play area or planetarium, these two look amazing. It is worth a lot! |
| Sina Weibo | 澳门科学馆真是个溜娃的好地方!特别在台风过后,没什么游客,空调也够足,设施也非常好,工作人员的服务态度也是很好,完美!!! | The Macau Science Center is a great place to take the kids! Especially after the typhoon, there were not many tourists. The air conditioning was great, the facilities were excellent, and the staff were very friendly and helpful—perfect!!! |
| Xiaohongshu (rednote) | 光是看着贝聿铭设计的澳门科学馆都是视觉享受 内部空间设计非常值得体验(麦当劳有种泡沫经济时代的感觉)。 | Just looking at I.M. Pei’s design for the Macau Science Center is a visual treat, and the interior design is definitely worth experiencing (McDonald’s has a bubble economy vibe). |
| Ctrip | 位于澳门半岛的澳门科学馆值得花时间一游,旁边就是观音莲花苑休息区,有大型的免费儿童游乐场,两者体验都体验非常好,澳门科学馆逛下来起码三四小时,推荐。 | The Macau Science Museum, located on the Macau Peninsula, is worth a visit. It is right next to the leisure area of Kun Iam Statue Waterfront, which has a large, free children’s playground. Both offer excellent experiences. A visit to the Macau Science Museum will take at least three to four hours. This is highly recommended. |
Note: This table only presents partial translation examples. Data source: Compiled by the authors.
Table 2. Synonym merging list.
| Core Word After Merging | Synonyms | Explanation |
|---|---|---|
| child | “kid” “children” “kids” | All represent “child” and are unified into the basic word form “child” |
| museum | “science museum” “Macau Science Center” “MSC” | All refer to the research object, unified as “museum” |
| exhibition | “exhibit” “exhibits” “exhibition hall” | All indicate “exhibition/exhibit/exhibition hall”, unified as “exhibition” |
| view | “scenery” “scene” “seaside view” “landscape” | All represent “scenery/landscape”, unified as “view” |
| play | “playing” “played” “fun play” “enjoy playing” | All represent “play/entertainment”, unified as “play” |
| family | “family trip” “parent–child” “family visit” “family group” | All related to family travel, unified as “family” |
| experience | “visit experience” “interactive experience” “tour experience” | All represent “experience”, unified as “experience” |
| ticket | “tickets” “ticket price” “admission ticket” | All related to tickets, unified as “ticket” |
Data source: Compiled by the authors.
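The merging rule in Table 2 can be illustrated with a short sketch. This is a minimal illustration, not the authors' pipeline: the synonym lists are copied from Table 2, while the function names (`build_pattern`, `merge_synonyms`, `keyword_counts`), the longest-phrase-first regular expression, and the simple letter-based tokenizer are assumptions made for this example. Stop-word removal and lemmatization, which a full word frequency analysis would normally include, are omitted.

```python
import re
from collections import Counter

# Synonym lists copied from Table 2; each variant is folded into its core word.
SYNONYMS = {
    "child": ["kid", "children", "kids"],
    "museum": ["science museum", "Macau Science Center", "MSC"],
    "exhibition": ["exhibit", "exhibits", "exhibition hall"],
    "view": ["scenery", "scene", "seaside view", "landscape"],
    "play": ["playing", "played", "fun play", "enjoy playing"],
    "family": ["family trip", "parent-child", "family visit", "family group"],
    "experience": ["visit experience", "interactive experience", "tour experience"],
    "ticket": ["tickets", "ticket price", "admission ticket"],
}


def build_pattern(synonyms):
    """Return a compiled regex and a lowercase variant -> core word lookup."""
    lookup = {}
    for core, variants in synonyms.items():
        for variant in variants:
            lookup[variant.lower()] = core
    # Longest phrases first, so "exhibition hall" is tried before "exhibit".
    phrases = sorted(lookup, key=len, reverse=True)
    alternation = "|".join(re.escape(p) for p in phrases)
    pattern = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    return pattern, lookup


def merge_synonyms(text, pattern, lookup):
    """Replace every listed synonym in text with its core word (single pass)."""
    text = text.replace("\u2013", "-")  # treat the en dash in "parent–child" as a hyphen
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


def keyword_counts(reviews, pattern, lookup):
    """Count lowercase alphabetic tokens after synonym merging."""
    counts = Counter()
    for review in reviews:
        merged = merge_synonyms(review, pattern, lookup).lower()
        counts.update(re.findall(r"[a-z]+", merged))
    return counts


PATTERN, LOOKUP = build_pattern(SYNONYMS)

if __name__ == "__main__":
    # Hypothetical review text, used only to demonstrate the merging step.
    sample = "The kids loved the exhibits at the Macau Science Center"
    print(merge_synonyms(sample, PATTERN, LOOKUP))
    print(keyword_counts([sample], PATTERN, LOOKUP).most_common(3))
```

Because the substitution runs in a single pass over one combined pattern, a core word that has just been inserted is not matched again, and word boundaries keep "exhibit" from matching inside "exhibition".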
Table 3. Statistics of the top 60 high-frequency keywords in UGC.
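| No. | Word | Word Frequency | No. | Word | Word Frequency |
|---|---|---|---|---|---|
| 1 | child | 754 | 31 | show | 113 |
| 2 | hall | 414 | 32 | interactive | 112 |
| 3 | museum | 389 | 33 | photo | 110 |
| 4 | exhibition | 347 | 34 | McDonald | 107 |
| 5 | ticket | 346 | 35 | adult | 106 |
| 6 | take | 237 | 36 | family | 106 |
| 7 | time | 228 | 37 | minute | 105 |
| 8 | visit | 224 | 38 | exhibit | 105 |
| 9 | experience | 212 | 39 | see | 101 |
| 10 | view | 202 | 40 | light | 99 |
| 11 | firework | 182 | 41 | new | 97 |
| 12 | free | 176 | 42 | worth | 95 |
| 13 | area | 168 | 43 | site | 95 |
| 14 | play | 163 | 44 | display | 94 |
| 15 | sea | 161 | 45 | close | 93 |
| 16 | great | 160 | 46 | trip | 93 |
| 17 | go | 152 | 47 | learn | 92 |
| 18 | good | 150 | 48 | hotel | 92 |
| 19 | recommend | 146 | 49 | open | 90 |
| 20 | fun | 142 | 50 | perfect | 90 |
| 21 | planetarium | 138 | 51 | enjoy | 90 |
| 22 | year | 130 | 52 | world | 90 |
| 23 | first | 128 | 53 | get | 89 |
| 24 | floor | 127 | 54 | Zhuhai | 89 |
| 25 | make | 127 | 55 | art | 87 |
| 26 | place | 125 | 56 | build | 87 |
| 27 | pm | 123 | 57 | movie | 82 |
| 28 | walk | 118 | 58 | suitable | 79 |
| 29 | hour | 116 | 59 | space | 76 |
| 30 | design | 115 | 60 | activity | 75 |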
Note: The table lists the top 60 keywords by occurrence frequency, reflecting users’ core attention points. Data source: Compiled by the authors.
Table 4. Top 10 core keywords of five potential demand topics.
| Type | Topic 1 | Topic 2 | Topic 3 | Topic 4 | Topic 5 |
|---|---|---|---|---|---|
| Keyword 1 (Weight 1) | museum (0.009) | children (0.013) | hall (0.028) | museum (0.013) | bus (0.011) |
| Keyword 2 (Weight 2) | sea (0.006) | fireworks (0.013) | children (0.022) | children (0.011) | children (0.009) |
| Keyword 3 (Weight 3) | first (0.005) | exhibition (0.012) | exhibition (0.016) | kids (0.01) | tickets (0.009) |
| Keyword 4 (Weight 4) | art (0.005) | hall (0.008) | tickets (0.015) | great (0.008) | free (0.008) |
| Keyword 5 (Weight 5) | coffee (0.005) | museum (0.007) | museum (0.011) | experience (0.008) | zhuhai (0.008) |
| Keyword 6 (Weight 6) | year (0.005) | experience (0.005) | halls (0.008) | fireworks (0.007) | take (0.007) |
| Keyword 7 (Weight 7) | time (0.005) | time (0.005) | kids (0.008) | worth (0.006) | museum (0.007) |
| Keyword 8 (Weight 8) | exhibition (0.005) | october (0.005) | planetarium (0.007) | visit (0.006) | sea (0.007) |
| Keyword 9 (Weight 9) | city (0.004) | interactive (0.004) | fun (0.006) | time (0.005) | view (0.006) |
| Keyword 10 (Weight 10) | children (0.004) | halls (0.004) | ticket (0.006) | free (0.005) | walk (0.006) |
Note: Topic 1 = comprehensive tour experience and scene integration; Topic 2 = parent–child interaction and characteristic scene experience; Topic 3 = core venue facilities and ticketing services; Topic 4 = direct emotional feedback and evaluation of tour value; Topic 5 = transportation and surrounding landscapes. Data source: Compiled by the authors.
Table 5. Research questions and their corresponding core findings, analysis methods, and supporting evidence.
Research question (1): What are the core dimensions of user attention to the Macau Science Center?
Core research findings: Five core dimensions:
  • Parent–child and family visit needs (with “child” and “family” as core keywords);
  • Exhibitions and interactive experiences (focusing on “hall”, “exhibition”, “interactive”, etc.);
  • Ticketing and consumption services (including relevant needs such as “ticket”, “free”, “McDonald”);
  • Surrounding environment and landscape (centered on “sea”, “view”, “firework”, “Zhuhai”);
  • Emotional evaluation and recommendation intention (represented by “great”, “good”, “recommend”).
Corresponding analysis methods: Word frequency analysis, word cloud visualization
Supporting evidence: Table 3, Figure 4

Research question (2): What is the correlation strength and network structure of the core attention keywords?
Core research findings:
  • The association network shows a dynamic evolutionary trajectory: from an early loose network focusing on “basic services + initial family needs”, it gradually develops into a dense network of “scenario-based experiences + travel supporting facilities” in the mid-term, and finally forms a comprehensive association network of “venue services + landscape viewing + full travel process” in the recent period;
  • The core nodes have always revolved around “child”, “museum”, “exhibition”, and “hall”, and the correlation strength with extended dimensions such as landscape, transportation, and catering continues to increase.
Corresponding analysis methods: Semantic network analysis
Supporting evidence: Figure 5

Research question (3): What are the underlying needs and themes hidden behind user feedback?
Core research findings: Five major potential demand themes:
  • Comprehensive tour experience and scene integration.
  • Parent–child interaction and characteristic scene experience.
  • Core venue facilities and ticketing services.
  • Direct emotional feedback and evaluation of tour value.
  • Transportation and surrounding landscapes.
Corresponding analysis methods: LDA topic modeling
Supporting evidence: Table 4

Research question (4): What are the correlation patterns between various influencing factors and user emotional tendencies?
Core research findings:
  • Emotional distribution: Positive emotions account for 82.7%, neutral emotions 9.9%, and negative emotions only 7.4%;
  • Correlation characteristics: Positive emotions are strongly correlated with core services, environmental landscape, and interactive experiences (trigger words: hall, exhibition, free, sea, etc.), while negative emotions are concentrated in local details such as service attitude, transportation, and catering (trigger words: attitude, driver, eating, etc.);
  • Temporal trend: From 2023 to 2025, the emotional score showed a fluctuating upward trend, gradually moving from a low level to a stable high level.
Corresponding analysis methods: VADER sentiment analysis
Supporting evidence: Figure 6, Figure 7 and Figure 8
Source: Author statistics.
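The emotional distribution reported for research question (4) is a share of reviews per sentiment class. The sketch below shows one way such shares could be computed from VADER compound scores (values between -1 and 1). It is an illustration under stated assumptions, not the authors' procedure: the cut-off of ±0.05 is the convention commonly recommended for VADER's compound score and may differ from the threshold used in this study, the function names are hypothetical, and the scores in the demo are made-up values rather than data from the article.

```python
from collections import Counter


def label_sentiment(compound, threshold=0.05):
    """Map a compound score to a class label.

    The default threshold of 0.05 is an assumption (the usual VADER
    convention), not a value confirmed by the article.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def sentiment_shares(scores, threshold=0.05):
    """Return the percentage of scores in each class, rounded to one decimal."""
    labels = Counter(label_sentiment(s, threshold) for s in scores)
    total = sum(labels.values())
    if total == 0:
        return {"positive": 0.0, "neutral": 0.0, "negative": 0.0}
    return {
        name: round(100 * labels[name] / total, 1)
        for name in ("positive", "neutral", "negative")
    }


if __name__ == "__main__":
    # Hypothetical compound scores for four reviews (not data from the study).
    demo_scores = [0.8, 0.6, 0.0, -0.4]
    print(sentiment_shares(demo_scores))
```

Keeping the threshold as a parameter makes it easy to check how sensitive the positive, neutral, and negative shares are to the chosen cut-off.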
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liang, J.; Deng, Q.; Zhu, Y.; Liang, J.; Wu, C.; Zheng, L.; Chen, Y. Visitor Satisfaction at the Macau Science Center and Its Influencing Factors Based on Multi-Source Social Media Data. Information 2026, 17, 57. https://doi.org/10.3390/info17010057

