1. Introduction
Festivals are among the representative cultural expressions of a country, nation, or region. They serve multiple functions, such as building social consensus, passing on traditional culture, and enriching spiritual life [1]. With modernization, globalization, and economic growth, civic living conditions have gradually improved. As a result, city residents have gained more autonomy over their lives, and their options for festival activities have multiplied [1,2]. How to give people full access to the diverse culture and functions of traditional Chinese festivals, how to enrich them, and how to revive traditional Chinese culture are issues of general concern in Chinese society today.
In the process of globalization and modernization, conflicts and exchanges between different cultures have gradually increased. There are relatively few studies that apply big data to the perception of festival culture. The research frameworks more commonly used to investigate folk customs and social activities analyze the actual situation on the ground, combine this with theoretical analysis, and then suggest management options [3,4]. Many studies have examined the inheritance and development of Chinese festivals from various perspectives and have suggested improvements. Zhang proposed that Chinese festivals are in an era of development and made suggestions on the design of Chinese festivals from the perspective of history and folklore [1]. Wang briefly reviewed the inheritance and development of Chinese traditional festivals in Hong Kong, Macau, and Taiwan [5]. Li used methods such as field investigations, questionnaire surveys, and literature studies to analyze the status of traditional Chinese festivals and to propose further developments [6]. However, collecting information through field trips, questionnaire surveys, or interviews is costly in time and money and is subject to restrictions on questionnaire design, interview rules, and personal subjective factors [7]. These restrictions greatly affect the accuracy of the data, and because the temporal and spatial coverage of the samples is small, the reliability of the data and conclusions is at some risk. Most research on festivals and culture has nevertheless relied on field trips, questionnaire surveys, literature research, and similar methods [8,9].
With the widespread adoption of mobile devices and location-based services, social media data have increasingly attracted the attention of scholars due to their large user base, rich spatiotemporal and semantic information, and low cost of access [10,11]. Meanwhile, understanding how conversational discourse on online social networks changes semantically and geographically over time will help reveal the dynamic changes of interpersonal relationships and digital traces of social events [12]. Xie and others used sign-in data of the Sina Weibo social media platform in Beijing in 2016. They used the TF-IDF (term frequency-inverse document frequency) algorithm based on geographic location information and spatial clustering to locate hot spots in Beijing in order to study social and cultural differences and crowd behaviors between different areas of Beijing [13].
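To make the TF-IDF weighting mentioned above concrete, the following is a minimal sketch, not the implementation used in the cited study. It assumes that the text of all posts in one spatial cluster is pooled into a single "document" and that tokenization has already been done; the function name `tf_idf` and the toy clusters are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document.

    docs: list of token lists. Each "document" here stands for the pooled
    text of one spatial cluster (an assumption made for illustration).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical, pre-tokenized text for three made-up clusters.
clusters = [
    ["museum", "palace", "ticket", "palace"],
    ["coffee", "office", "ticket"],
    ["coffee", "hotpot", "hotpot", "friends"],
]
weights = tf_idf(clusters)
top_terms = [max(w, key=w.get) for w in weights]
print(top_terms)
```

Terms that are frequent in one cluster but rare elsewhere receive the highest weight, which is why this kind of score is useful for characterizing a hot spot by its distinctive vocabulary.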
Studying people’s behavior is of great importance to urban planning and design and to the improvement of the living standards of residents [14,15]. Traditional methods of collecting human behavior data, such as surveys, are only suitable for small-sample research projects. Moreover, these methods are time-consuming and costly, and the results obtained are difficult to update. In recent years, people have become willing to disclose useful personal information on social media [16]. How to fully mine social media data to obtain residents’ opinions of festivals has become an important topic of current research. Garay used social media (especially Twitter) to analyze the potential contribution of festivals to the image of festival destinations, but that research focused more on the commercial value of festivals [17]. Zhou selected Sina Weibo data from 2012 to 2014 related to five traditional festivals: the Spring Festival, the Lantern Festival, the Qingming Festival, the Dragon Boat Festival, and the Mid-Autumn Festival. People’s perceptions of traditional Chinese festivals, and regional differences in those perceptions, were investigated using word frequency analysis and LDA (latent Dirichlet allocation) theme analysis [18].
Existing research has achieved important results on festival activities and human perceptions [10,19]. Liu and others used social media data to study the daily activities of residents. Their proposed framework integrates textual semantic analysis, statistical methods, and spatial techniques; it broadens the application areas of social media data, especially text data, and provides a new paradigm for research on residents’ activities and spatiotemporal behavior [20]. However, relatively few studies analyze residents’ festival activities from the two aspects of text mining and spatial analysis. We process the unpredictable, sparse, and irregular data that appear in location-based social networks (LBSNs) and convert these uncertain, noisy geo-tagged data into useful, well-structured, high-level information [21,22] (for example, the spatial distribution of festival events). Minatel proposed that constructing an LBSN from stay points yields much more information, since GPS logs convey more about users’ mobility [23]. Interpreting such data clearly enough to support better decisions about future festival development is a very challenging task, and relatively little research has used big data from this perspective. Therefore, there is still a lot of room for research on festival activities based on social media data.
On a small scale, such as that of a region or city, the comparison of residents’ perceptions of various festivals needs further research.
Using big data analysis and text mining research methods, it is possible to examine the attitudes, activities, and preferences of people in different areas of a city, and reveal social, cultural, and functional characteristics of hot spots [24,25]. Such research methods can also be used to enhance cultural perception, to explore cultural connotations of traditional Chinese festivals in order to revive traditional Chinese festivals, and to provide suggestions and solutions to meet the requirements of the current era [26].
Using social media data from the Sina Weibo platform, this research studies residents’ festival activities from two aspects, text mining and spatial analysis, based on the text and spatiotemporal information of the posts. By integrating natural language processing, spatial analysis, statistical analysis, and other techniques, it provides a new research paradigm for festival culture research. The research focuses on the behavioral characteristics of Beijing residents’ festival activities and their perceptions of various types of festivals. First, festival behaviors are classified by extracting keywords and other information from Weibo text. The spatial patterns of the various behaviors are then mapped. Finally, the sensing and spatial characteristics of residents’ festival activities are discussed.
The rest of this article is organized as follows. In Section 2, data collection and research methods are introduced. In Section 3, the results of sorting and categorizing the information of residents’ festival activities are described, and the semantic characteristics, perceived content, and temporal and spatial patterns of residents’ festival activities are analyzed. In Section 4, the advantages and disadvantages of the research methods used in this article are discussed. Finally, in Section 5, we summarize our study, draw conclusions, and propose future research directions.
3. Results
3.1. Festive Event Word Frequency Statistics
Festivals with more than 10,000 Weibo posts were National Day, Mid-Autumn Festival, New Year’s Day, Christmas Day, Lantern Festival, and Christmas Eve (Table 1). As 2019 was the 70th anniversary of the founding of the People’s Republic of China, most Weibo posts were related to the National Day. The family holds a culturally important place in the Chinese concept of festivals, and hence the Mid-Autumn Festival, with its theme of family reunion, had the second largest number of festival-related Weibo posts in 2019.
Word frequency statistics were computed over all 213,649 festival-related Weibo posts from Beijing in 2019 (Figure 3). As 2019 was the 70th anniversary of the founding of the People’s Republic of China, the frequency of words related to the National Day, such as “motherland”, “happy birthday”, and “70”, was high. Words related to the Mid-Autumn Festival were also frequent. In Chinese festival activities, eating was clearly an essential behavior and a principal way in which people participated in festivals.
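The word frequency step described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: it assumes the posts have already been segmented into tokens (Chinese text would need a word segmenter upstream), and the function name `word_frequencies`, the stop-word list, and the sample posts are all hypothetical.

```python
from collections import Counter

# Hypothetical stop-word list; a real study would use a much fuller one.
STOP_WORDS = {"the", "a", "of", "to", "and"}


def word_frequencies(posts, stop_words=STOP_WORDS):
    """Count tokens across all posts, ignoring case and stop words."""
    counts = Counter()
    for tokens in posts:
        counts.update(
            t.lower() for t in tokens if t.lower() not in stop_words
        )
    return counts


# Hypothetical pre-tokenized posts.
posts = [
    ["Happy", "birthday", "to", "the", "motherland"],
    ["Eat", "moon", "cake", "and", "eat", "crab"],
    ["motherland", "70", "happy", "birthday"],
]
freq = word_frequencies(posts)
print(freq.most_common(3))
```

The resulting counter can feed both a ranked frequency table and a word cloud, since both are views of the same term counts.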
Based on all festival-related Weibo content in 2019, the main content of residents’ perception of festivals and the main ways of participating in festivals were visualized as word clouds (Figure 4). High-frequency words corresponded to the festivals with a large number of Weibo posts in 2019. For example, the word clouds featured words such as “motherland”, “China”, and “happy birthday”, along with words related to National Day, the Mid-Autumn Festival, and New Year. Words such as “eat” and “delicious” reflected the main ways that residents participate in festivals.
The festivals were divided into three categories: traditional festivals, foreign festivals, and modern festivals. They were sorted by the number of related posts from most to least, and the share of each festival type in the total number of posts was calculated. The results are shown in Table 2.
Beijing residents posted the largest number of Weibo posts related to traditional festivals, accounting for 40.46%. Among the traditional festivals, the Mid-Autumn Festival, with its theme of family reunion, was the most frequently mentioned. However, the number of Weibo posts related to the Spring Festival was relatively small. This is because the Spring Festival spans many days, whereas only the Weibo data from the day of the holiday itself were extracted here, which biased the count. In addition, Weibo users tend to be young, and hence the Weibo post data may not reflect the feelings of middle-aged and elderly people.
Traditional festivals are closely related to Chinese history and culture. In order to explore the degree of attention paid to traditional culture on Weibo, it is necessary to analyze some relatively low-frequency words among the characteristic words (Table 3). Residents’ festival activities are greatly influenced by traditional culture. This is reflected in clothing and locations, such as “Hanfu” and “Confucian Temple”, and, more obviously in traditional festivals, in classical verses. “Will live long as he can!”, “From far away you share this moment with me.”, and other phrases corresponding to the Mid-Autumn Festival appear relatively frequently.
Foreign festivals accounted for 20.40% of the Weibo data on the day of the festival, showing that traditional festivals still dominate residents’ perception of festivals. Apart from Christmas and Christmas Eve, which were the focal points of residents’ perception of foreign festivals, foreign festivals did not occupy a central position in residents’ sense of festivals. For modern festivals, the number of posts related to National Day, where residents expressed their patriotic feelings, accounted for about one-third of the total number of posts.
We also found that some festival activities, especially those of some foreign festivals, have a certain connection with religion (Table 4). In the published text, not only are the names of religions clearly mentioned, but the names of religious places also appear relatively frequently on the day of the festival.
3.2. Semantic Sensing of Festival Activities
Figure 5 shows the internal proportions of the various word types within each festival type, allowing a comparison within the same type of festival data. Within a given festival type, the proportions of the different parts of speech and word types differ. In traditional festivals in particular, verbs make up the largest proportion of words, which is significantly different from other types of festivals.
Figure 6 compares the different festival types for the same part of speech. Among nouns, modern festivals have the largest share of holiday features, and among verbs, traditional festivals have “eating” as the most common category.
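The within-type proportions compared in Figures 5 and 6 amount to normalizing part-of-speech counts per festival type. A minimal sketch follows, assuming each high-frequency word has already been tagged upstream and labeled with its festival type; the function name `pos_shares`, the tag labels, and the sample triples are hypothetical.

```python
from collections import Counter, defaultdict


def pos_shares(tagged_words):
    """Return {festival_type: {pos: share}}, with shares summing to 1 per type.

    tagged_words: iterable of (festival_type, word, pos) triples.
    """
    counts = defaultdict(Counter)
    for festival_type, _word, pos in tagged_words:
        counts[festival_type][pos] += 1
    return {
        ftype: {pos: n / sum(c.values()) for pos, n in c.items()}
        for ftype, c in counts.items()
    }


# Hypothetical tagged tokens; real tags would come from a POS tagger.
sample = [
    ("traditional", "eat", "verb"),
    ("traditional", "reunite", "verb"),
    ("traditional", "moon cake", "noun"),
    ("traditional", "peaceful", "adjective"),
    ("foreign", "Santa Claus", "noun"),
    ("foreign", "apple", "noun"),
    ("foreign", "give", "verb"),
    ("foreign", "peaceful", "adjective"),
]
shares = pos_shares(sample)
print(shares["traditional"])
```

Reading one inner dictionary gives the within-type comparison, while reading the same tag across the outer keys gives the cross-type comparison.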
Nouns reflected residents’ perception of festivals, especially the representative symbols and elements of festivals. For example, the nouns “moon cake”, “zongzi”, and “tangyuan”, all traditional Chinese foods, were used in relation to the traditional festivals of Mid-Autumn Festival, Dragon Boat Festival, and Lantern Festival, respectively. Words such as “Santa Claus”, “Christmas gift”, and “apple” were used in relation to foreign festivals, i.e., Christmas and Christmas Eve. For modern festivals, words like “mother country” and “China”, related to National Day, were used frequently.
Regardless of the type of festival, the word “Forbidden City” appeared frequently. This indicates that the Forbidden City, as a local attraction, has become an indispensable part of festivals in the perception of Beijing residents, providing emotional support and serving as a cultural symbol. Finally, the proportion of Weibo terms for each type of festival showed that the share of traditional festivals was the largest, as high as 59%, which shows that residents had the richest perception of traditional festivals.
All high-frequency words were divided into four categories according to part of speech and semantic content. For example, verbs such as “eat” and “drink” were grouped under the label “eating”, and activities that can also be carried out in daily life, such as “check in” and “walking around”, were labeled “leisure activities”. The other word categories are not explained in detail for reasons of space. Verbs reflect the main behaviors of residents participating in festivals. From the frequency of words, the behaviors of Beijing residents participating in festivals appeared relatively uniform across festival types (Figure 6). For example, words such as “eat” and “check in” indicate that the main behaviors of residents participating in festivals were associated with dining. It would appear that “checking in” at online celebrity shops has become an important way for Beijing residents to participate in festivals.
Adjectives mainly reflect residents’ emotional expression towards festivals, and different types of festivals corresponded to different emotional expressions. “Ching Ming” in traditional festivals corresponded to the Ching Ming Festival. Words such as “peaceful”, “smooth”, and “consummate” were cultural manifestations of traditional festivals. The word “peaceful” appeared most frequently in foreign festivals, which corresponded to people’s wish for peace on Christmas Eve. High-frequency adjectives used for modern festivals reflected residents’ focus on the National Day, expressing pride in the motherland and giving positive comments on its current state, with adjectives such as “safe”, “strong”, and “prosperity”.
3.3. Spatial Distribution Characteristics of Festival Activities
Figure 7 shows the kernel density distribution map of Beijing residents’ Weibo posts related to festivals in 2019, overall and by festival type. The density distribution of traditional festivals was not much different from that of modern festivals, although the central density of residents’ posts related to traditional festivals was higher than that of modern festivals. The density of foreign festivals appeared much lower than that of either traditional or modern festivals, and there appeared to be many areas with no posts, suggesting that traditional festivals still occupy the main position in Chinese residents’ holiday behavior and culture. This contradicts the perception that traditional festivals are being significantly impacted and influenced by foreign festivals.
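For readers unfamiliar with kernel density maps, the sketch below evaluates a two-dimensional kernel density estimate on a regular grid. It is illustrative only: the Gaussian kernel is chosen here for simplicity and may differ from the kernel used by the authors' GIS software, and the function name `gaussian_kde_grid`, the bandwidth, and the sample coordinates are hypothetical.

```python
import math


def gaussian_kde_grid(points, bandwidth, xs, ys):
    """Evaluate a 2-D Gaussian kernel density estimate on a grid.

    points: list of (x, y) in projected coordinates (for example, metres).
    Returns grid[i][j] = estimated density at (xs[i], ys[j]).
    """
    # Normalizing constant of an isotropic 2-D Gaussian, averaged over points.
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    grid = []
    for gx in xs:
        row = []
        for gy in ys:
            total = 0.0
            for px, py in points:
                d2 = (gx - px) ** 2 + (gy - py) ** 2
                total += math.exp(-d2 / (2.0 * bandwidth ** 2))
            row.append(norm * total)
        grid.append(row)
    return grid


# Hypothetical post locations: a tight cluster plus one isolated post.
posts_xy = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (5.0, 5.0)]
axis = [float(v) for v in range(-1, 7)]  # grid coordinates -1.0 .. 6.0
density = gaussian_kde_grid(posts_xy, bandwidth=1.0, xs=axis, ys=axis)
peak = max(
    (density[i][j], axis[i], axis[j])
    for i in range(len(axis))
    for j in range(len(axis))
)
print(peak[1:])
```

The bandwidth controls how far each post's influence spreads, so a larger value produces a smoother surface in which small clusters merge.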
3.4. Theme Sensing of Festival Activities
Across the 29 festivals in 2019, the LDA theme model divided the content of festival-related posts into three types: the emotional expression of the posts; the specific behavior of residents; and the representative culture of the related festival. Residents’ festival activities fell roughly into two categories: eating with relatives and friends and checking in at various restaurants; and visiting tourist attractions and festival events. The LDA model analysis was then applied separately to the three festival types (modern, traditional, and foreign), and the results were imported into ArcGIS for thematic spatial analysis.
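The idea behind LDA can be illustrated with a small collapsed Gibbs sampler. This is a teaching sketch under stated assumptions, not the implementation or the settings used in this study: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the toy documents are all hypothetical, and a production analysis would normally rely on an established library.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return (theta, phi, vocab)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_vocab = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]              # tokens per doc/topic
    topic_word = [[0] * n_vocab for _ in range(n_topics)]   # tokens per topic/word
    topic_total = [0] * n_topics                            # tokens per topic
    assignments = []
    # Give every token a random initial topic.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_vocab * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    # Convert counts into smoothed probability distributions.
    theta = [
        [(c + alpha) / (sum(row) + n_topics * alpha) for c in row]
        for row in doc_topic
    ]
    phi = [
        [(c + beta) / (topic_total[k] + n_vocab * beta) for c in row]
        for k, row in enumerate(topic_word)
    ]
    return theta, phi, vocab


# Hypothetical tokenized posts.
docs = [
    ["moon", "cake", "reunion", "family"],
    ["reunion", "family", "moon", "dinner"],
    ["parade", "fireworks", "motherland"],
    ["motherland", "parade", "birthday", "fireworks"],
]
theta, phi, vocab = lda_gibbs(docs, n_topics=2)
for k, row in enumerate(phi):
    top = sorted(zip(row, vocab), reverse=True)[:3]
    print(k, [w for _, w in top])
```

Here `theta` holds the per-post topic mixture and `phi` the per-topic word distribution. Assigning each geotagged post to its highest-probability topic in `theta` is one way to obtain the point layers that a GIS can then map by topic.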
For traditional festivals, each of the 5 topics was fairly evenly distributed in space, with topic 2 the most widely distributed (Figure 8). According to Table 5, the high-frequency words of topic 2 mainly correspond to the Mid-Autumn Festival and the Spring Festival, such as “moon cake”, “reunion”, “year of pig”, and “good luck”.
The spatial distribution of topics for foreign festivals was not as wide as that for traditional festivals, but there were obvious spatial differences between topics (Figure 9). Topic 1 was mainly distributed in the area outside the Fifth Ring Road in Beijing, and topic 4 was mainly distributed in the area inside the Fifth Ring Road. According to the high-frequency words in Table 6, topic 1 was mainly associated with residents’ emotional perception and expression of the festival, with words such as “happiness”, “hope”, and “peace”. Topic 4 was mainly related to specific behaviors of residents participating in festivals, such as “Christmas gifts” and “apples”, which means that residents celebrating Christmas mainly give gifts and apples to express their care for relatives and friends.
Topic 2 and topic 3 for modern festivals also showed significant spatial differences (Figure 10). According to Table 7, the high-frequency words in topic 2 included “Happy New Year”, “Military Parade”, “Hope”, “Fireworks”, “Tiananmen Square”, and other words. Some of these expressed the best wishes of residents during the festival, while the others mainly described the representative symbols and constituent elements of festivals, especially National Day. The high-frequency words of topic 3, such as “delicious”, “check in”, and “taste”, were associated with food and eating.
Combined with the differences in the spatial distribution of the topics of foreign festivals, it can be concluded that the main way residents participated in festivals had a certain relationship with how well developed the local infrastructure was. Residents living in the central city of Beijing can participate in various festival activities, so most of their content on Weibo reflects specific festival behaviors. Residents living in the suburbs of Beijing may have been restricted by limited access to such infrastructure. Their Weibo content therefore more often expressed wishes relating to the festival or to the cultural concept of the festival itself.
4. Discussion
Most current research on festivals and culture is conducted through surveys and field trips and seldom uses big data to analyze related issues. Therefore, many scholars have realized the urgency of using social media data to carry out research on festival activities [4]. For example, Zhou’s research mainly uses word frequency statistics and LDA theme models to identify residents’ perceptions of traditional festivals and regional differences [18]. According to their research results, LDA topic classification is a powerful method for analyzing social media data, text mining, and revealing the spatiotemporal characteristics of related activities. Liu [41] studied the emotional characteristics of Chinese tourists to Australia based on big data text analysis and part-of-speech tagging. These methods all extend the textual analysis of festival activities. However, the above research lacks comprehensive mining of the rich semantic and spatiotemporal information in social media data. Therefore, this research uses NLP technology to identify festival-related Weibo posts, and combines word frequency statistics, text labeling, LDA theme models, and GIS spatial analysis methods to analyze residents’ perception characteristics of festivals and activities.
Judith Mair and Karin Weber [3] pointed out that many studies in the field of festival analysis had adopted a case study approach. As a result, research on individual festivals is relatively plentiful, but comprehensive comparative studies of many festivals are lacking. This could be said to limit the scope and scale of our understanding of festivals. Therefore, by expanding the scope of research to different types of festivals, we hope to improve the understanding of residents’ perceptions of different festivals. Through the comparison of different types of festivals, the research found that Weibo texts reflect differing levels of attention to different festivals, with traditional festivals still receiving the most widespread attention. The thematic analysis shows that different types of festivals share common characteristics. For example, attention to leisure activities and food is very prominent, and it is also universal to express greetings to family and friends through festivals. However, it can also be found that, among the different types of festivals, traditional festivals are more closely related to history and culture, while modern festivals are more closely related to leisure and consumption. Western festivals have become more connected with consumption and entertainment while retaining some religious imprints. Such a comprehensive study is of great significance for an in-depth understanding of the connotations of festivals and of social and economic development.
On the spatial scale, this study found an interesting phenomenon in the spatial pattern of residents’ festival activities in a giant city. Although the gathering areas for different types of festivals are concentrated in densely populated urban centers, the activities of traditional and modern festivals are distributed over significantly larger ranges than those of foreign (Western) festivals. We believe that the way residents participate in festivals is related to the level of infrastructure, especially the number of entertainment facilities such as catering and services. At the same time, the regional differences in festival activities within the city also reflect the imbalance in the urban structure of Beijing, that is, festival activity is more widely distributed in the northern part of the city than in the southern part (Figure 7) [42]. Wilson [4] emphasized the important role of festivals for local communities; by increasing festival-related facilities in underdeveloped urban areas, it may also be possible to promote the balanced development of the city. However, this research also raises a new question: the difference between the east and the west of the city is also pronounced, and its causes need to be studied in depth.
The results of this research show that we can understand residents’ perception of festivals by using social media big data. However, according to the 2020 Weibo User Development Report, people aged 20–30 account for close to 80% of Weibo users [23]. Social media data therefore mostly represent a relatively young group, and the data have problems of sample bias and representativeness. To address this problem, further research can use traditional questionnaire surveys and other methods to supplement the research samples, combining multiple sources of data to compensate for the sample bias of social media data.
5. Conclusions
This study uses social media data to study residents’ perceptions of festivals and the spatial characteristics of activities. Using a text classification model based on BERT and the Transformers framework, together with word frequency statistics, part-of-speech analysis, and LDA topic model analysis, we analyzed Weibo social media data related to festivals in Beijing in 2019. We obtained Beijing residents’ perceptions of festivals and the ways they participated in festivals, and explored the spatial differences of residents’ participation in festival activities.
Traditional culture had a huge influence on festivals, which was reflected not only in residents’ motivation to participate in festivals, but also in the ways they participated and the feelings they expressed. Traditional festivals occupied the central position in residents’ perception of festivals. This differs from current concerns that traditional festivals are being greatly affected by foreign festivals. The feelings of family and motherland occupied a central position in modern festivals. This was clearly manifested in word frequency and topic spatial distribution. For traditional festivals, residents expressed their feelings through ancient poems from traditional Chinese culture. For example, verses such as “On festive occasions more than ever one thinks of one’s dear ones far away (每逢佳节倍思亲)” and “Will live long as he can! (但愿人长久)” were used frequently for traditional festivals but were not used in relation to other types of festivals. The way residents participate in festivals is related to the level of infrastructure, especially the number of entertainment facilities such as catering and services. Most of the Weibo posters from inner-city areas expressed specific festival-related behaviors, showing that they were directly participating in the festival activities, while posters in the outer city area often expressed holiday wishes. Additionally, some of the residents’ festival activities were related to religious beliefs, reflecting the cultural traditions and connotations behind the different types of festivals.
The analysis of the spatial distribution pattern of festival-related microblogs shows that the temporal and spatial information of social media data can help in understanding the characteristics of urban spatial structure. Residents’ festival activities are concentrated in densely populated and economically developed urban centers. The differences in festival activities between the north and the south of the city are also in line with the characteristics of Beijing’s urban spatial structure. However, this study found that the difference between the eastern and western parts of the city is also very obvious. This discovery presents a new challenge: the reasons for the east-west differences in residents’ activities need to be studied in depth.
By combining natural language processing technology, statistical analysis, part-of-speech tagging, topic analysis, and spatial analysis, this study provides a new paradigm for research in the field of festivals. However, the LDA topic model has certain shortcomings in processing sparse social media data, and overcoming them requires subsequent advances in data processing technology. Social media data also suffer from sample bias and cannot adequately reflect the situation of middle-aged and elderly people, who use social media less. In follow-up research, traditional questionnaire survey methods can be used to supplement the samples with multi-source data. The spatial differences in residents’ festival activities found in this study can only be described qualitatively at present. In the future, we hope that further studies can explain the reasons for the spatial differences from a quantitative perspective.