Article

Research on Resident Behavioral Activities Based on Social Media Data: A Case Study of Four Typical Communities in Beijing

1 College of Applied Arts and Sciences, Beijing Union University, Beijing 100191, China
2 School of Management, Zhejiang University of Technology, Hangzhou 310023, China
* Author to whom correspondence should be addressed.
Information 2024, 15(7), 392; https://doi.org/10.3390/info15070392
Submission received: 30 May 2024 / Revised: 28 June 2024 / Accepted: 3 July 2024 / Published: 5 July 2024
(This article belongs to the Special Issue Big Data Analytics in Smart Cities)

Abstract

Supported by big data mining techniques, social media data that contain location information and rich semantic text can be used to construct large-scale origin-destination (OD) flows of the daily activities of urban populations, providing new data resources and research perspectives for studying urban spatiotemporal structures. This paper employs the ST-DBSCAN algorithm to identify the residential locations of Weibo users in four communities and then uses the BERT model to classify Weibo texts by activity type. Combined with the TF-IDF method, the results are analyzed from three aspects: temporal features, spatial features, and semantic features. The findings indicate the following: ① Spatially, residents’ daily activities are mainly centered around their residential locations, but there are significant differences in the radius and direction of activity among residents of different communities; ② Temporally, the activity intensities of residents from different communities exhibit uniformity during different time periods on weekdays and weekends; ③ Semantically, the differences in activities and venue choices among residents of different communities are deeply influenced by the comprehensive characteristics of the communities. This study explores methods for OD information mining based on social media data, which is of great significance for expanding the methods used to mine residents’ spatiotemporal behavior characteristics and for enriching research on the configuration of public service facilities based on community residents’ activity spaces and facility demands.

1. Introduction

As of 1 November 2020, the permanent resident population of Beijing was 21.893 million. Compared with 19.612 million in the sixth national census in 2010, the population increased by 2.281 million over ten years, with an average annual increase of 228,000 people and an average annual growth rate of 1.1% [1]. Beijing, with its massive population, faces significant issues related to urban sprawl, such as severe traffic congestion, prominent environmental problems, and irrational resource distribution, often referred to as “big city diseases.” Studying the spatial structure of the city is crucial in addressing these challenges. From a theoretical standpoint, research on urban spatial structure helps reveal the complex interactions between various social, economic, and environmental factors during the city’s development. This deepens our understanding of the dynamic changes in urban space and provides new perspectives and theoretical support for urban geography and related disciplines. From a practical perspective, optimizing the urban spatial structure is key to enhancing the city’s sustainable development and improving residents’ quality of life. A well-designed urban spatial structure can effectively alleviate traffic congestion, improve the ecological environment, and promote the rational allocation and efficient use of resources, thereby enhancing the overall functionality and livability of the city. According to the “Beijing City Master Plan (2016–2035)”, improving urban spatial structure and continuously enhancing the quality of urban development and of the residential environment are essential for ensuring the sustainable development of Beijing [2]. The “14th Five-Year Plan and the Long-Range Objectives Through the Year 2035 for Beijing’s Economic and Social Development” outlines the goal of building a high-quality livable city during the 14th Five-Year Plan period, focusing on optimizing urban space and functional layout [3].
Beijing’s urban spatial structure is currently undergoing a period of adjustment and transformation. The interaction between urban space and residents’ behavior is a core topic in urban geography research. Investigating the spatial-temporal characteristics of residents’ daily activities and their influencing factors from a spatial-temporal integration perspective can reflect the spatial distribution of urban functional areas and the usage of public service facilities. This research provides a unique perspective for understanding the complex relationship between residents’ activities and the urban environment in both spatial and temporal dimensions [4,5], offering significant insights into the characteristics of Beijing’s urban structure.
Behavioral geography seeks to understand geographic space and its changes from the perspective of individual spatial behavior, focusing on the connections and spatial-temporal characteristics of residents’ activities. Gu Jie et al. argue that the urban spatial-temporal structure refers to the characteristics displayed by residents’ behaviors within the urban spatial-temporal framework [6]. Therefore, traditional empirical research on urban spatial-temporal structures typically uses residents’ daily travel behaviors as a starting point. Such studies often rely on travel logs to investigate the spatial-temporal characteristics of residents’ daily commutes or analyze the factors influencing these behaviors [6,7]. With the advancement of information technology, the emergence of OD data has provided a new data source for studying the urban spatiotemporal structure. OD data is a type of data that provides the start and end locations and sparsely describes the moving trajectories of objects [8]. Mining OD data can reveal group movement patterns, analyze spatial anomalies, and uncover hidden relationships. OD data mining is also a research hotspot in areas such as data visualization and geospatial analysis [9,10]. Current research on OD flow patterns tends to use taxi trajectory data [11,12], mobile phone signaling data [13,14], and GPS data [15,16], with a focus on mining the movement patterns of research objects [17], analyzing the spatiotemporal characteristics of group behavior [18], and trajectory prediction or demand prediction based on existing data [15]. Utilizing big data for urban research can address the limitations of statistical data spatial scale and survey sample size in traditional research methods. At the same time, the integration of new data and methods with time geography is also an important development trend in urban spatiotemporal structure research [19]. 
However, the OD data described above are mostly continuous movement data. Such data represent complete trajectories well and are convenient for trajectory analysis, but they lack semantic information, so it is relatively difficult to infer the purposes and meanings behind trajectories from their spatiotemporal characteristics alone. As a result, they have not been sufficiently integrated with spatiotemporal behavior research. In recent years, the rapid development of information and communication technologies has highlighted the significant value of social media data mining and research in the field of geography [5]. Due to their large user base, rich spatial-temporal and semantic information, and low acquisition cost, social media data have garnered increasing attention from scholars [20,21]. However, with social media data, origins are relatively harder to obtain than destinations. Studies that use social media data to investigate OD (origin-destination) information are not as mature as those using continuous movement trajectory data, such as GPS and mobile phone data. Consequently, research on OD data extraction from social media needs further expansion and development.
Communities are the fundamental spatial units where people conduct their daily activities and serve as crucial elements in urban planning and research [22]. Particularly, large-scale residential communities, which are major focal points of development, face significant issues due to their singular functional nature, resulting in a spatial separation of residents from the public service facilities they need. Therefore, studying the spatial-temporal behavioral characteristics of residents in these large residential communities can provide valuable insights into their actual demands for public services, offering data support and decision-making references for urban planning, and ultimately promoting the optimization and enhancement of urban functions. In light of this, our study leverages urban spatial-temporal theory, focusing on four large residential communities in Beijing: Tiantongyuan, Huilongguan, Wangjing, and Shangdi. We fully utilize the advantages of social media data, which offers large samples and individual-level details, to address the traditional OD data’s lack of semantic information. By employing the ST-DBSCAN algorithm, we identify residents’ living locations and tackle the challenge of extracting origin-destination flow information using social media data. Our study examines residents’ behaviors from their homes to various activity locations, categorizing these behaviors into seven types: socializing, dining, leisure, shopping, studying, exercising, and working. We analyze the spatial-temporal characteristics and semantic aspects of these activities to understand the basic behavioral patterns and inter-group differences among residents of different large communities.
The specific research objectives of this study are as follows:
(1)
Utilize the ST-DBSCAN algorithm to identify users’ residential locations from social media data and construct OD data of residents’ daily activities. Apply the BERT model to classify these activities into seven categories.
(2)
Explore the spatial-temporal characteristics of the daily activities of residents in the four large residential communities in Beijing through kernel density analysis and statistical analysis. Conduct semantic analysis of activity data using the TF-IDF algorithm.
Through these objectives, our study aims to refine the methods for measuring residents’ spatial-temporal behaviors, enhance the understanding of differentiated behavioral activities among residents in various types of communities, and offer practical insights into the spatial and facility needs based on residents’ behavioral differences. This research provides valuable references for the configuration of urban public service facilities and contributes to the enhancement of urban spatial quality.

2. Data and Methods

2.1. Study Area

This study empirically examines four large residential communities: Tiantongyuan, Huilongguan, Wangjing, and Shangdi. Refer to Figure 1 and Table 1 for detailed information.
Tiantongyuan, located in the Changping District of Beijing, is one of the most densely populated areas for migrants in Beijing. According to data from the Seventh National Census, Tiantongyuan has approximately 250,000 permanent residents. The area is well-served by multiple subway and bus lines, forming a robust transportation network. As a large residential community, Tiantongyuan offers comprehensive amenities, including several hospitals, schools, and large shopping centers.
Similarly, Huilongguan, also in Changping District, is a large residential area with a dense population and vast area, making it the community with the largest population and broadest research scope in this study. It is also the area where identifying residents through their Weibo activity has proven most effective, with 1303 residents identified by clustering the time and location of their Weibo posts.
Wangjing, located in the northeastern part of Chaoyang District, is the large residential area closest to the city center among the study cases. Wangjing has a higher concentration of commercial outlets and large shopping centers compared to other communities. Wangjing SOHO, as a significant business district, also brings numerous employment opportunities to the area.
Shangdi, situated in the mid-eastern part of Haidian District, was approved for development by the State Science and Technology Commission and the Beijing Municipal Government in 1991. It has now become a comprehensive high-tech industrial zone. Compared to other communities, Shangdi has a shorter development history, thus having relatively fewer residents and a smaller area.
Overall, the communities within the study area can be categorized into two main types based on housing nature and community positioning: large residential communities and large mixed residential–commercial communities [23]. These categories are representative of the current status of large communities in Beijing. To enhance the comparability of the samples, this paper compares resident activities not only across different types of communities but also among communities of the same type.

2.2. Data

Social Media Data

According to the “2020 Weibo User Development Report” released by the Weibo Data Center, as of September 2020, the monthly active users on Weibo had increased to 511 million, with 224 million daily active users [24]. These figures underscore Weibo’s leading position among social media platforms in China. By using web crawler tools, we collected 2019 social media data from the Weibo platform, obtaining over 11.5 million Weibo posts from more than 330,000 users in Beijing. The dataset includes attributes such as user ID, text, time, and coordinates.
However, social media data inherently suffers from sample bias issues. The user base on social media platforms tends to be relatively young, and individual behavior may vary due to socioeconomic attributes such as age, gender, and occupation [25,26,27]. Despite these limitations, numerous studies have demonstrated the significant role of social media data in extracting human activity information. Social media data is rich in contextual content related to activities, emotions, experiences, and geographic information [28,29,30]. Leveraging this data, we can identify residential locations and study the activity patterns and differences among residents in various communities, providing a unique perspective for understanding urban spatial structure and public service allocation.
Figure 2 depicts the research framework of this study. In this study, Weibo data from Beijing was obtained through web scraping. We first applied the ST-DBSCAN algorithm to cluster the data temporally and spatially, extracting users who had a single cluster center located within residential areas. We then identified users whose residences were in the four communities of Huilongguan, Tiantongyuan, Wangjing, and Shangdi, and retrieved all Weibo posts made by these users in 2019. Subsequently, we used the BERT model to classify the Weibo posts into seven activity categories: socializing, dining, leisure, shopping, studying, exercising, and working. Posts that did not fit into any of these activity categories were filtered out. The final dataset consisted of activity-related Weibo posts made by community residents throughout 2019, which served as the data source for this study.

2.3. Method

2.3.1. Extraction of Daily Activities Using the BERT Model

Bidirectional Encoder Representations from Transformers (BERT) is a word embedding method based on a deep bidirectional Transformer model, utilizing the encoder part of the Transformer architecture [31]. The BERT model undergoes unsupervised training on a large corpus of text, aiming to jointly learn context in all layers. Its training tasks include masked language modeling and next-sentence prediction. Compared to earlier word embedding methods such as Word2Vec or GloVe, BERT can generate deep, rich, and bidirectional contextual embeddings for words. Traditional word vectors often provide static representations, neglecting the dynamic meanings of words in different contexts. BERT’s contextual embeddings can more accurately capture word polysemy and complex semantic relationships, resulting in significant performance improvements across various NLP tasks. The model can be used for tasks like question answering, sentiment analysis, spam filtering, named entity recognition, and document clustering [32]. In the study of residents’ daily activities, scholars have used the BERT model to classify Weibo texts, finding that the types and distribution of residents’ activities are closely related to the characteristics of the activities and holiday schedules [33].
In this study, we constructed a text classification model based on BERT to classify the collected Weibo texts. Specifically, we selected 70,000 data entries as the training sample. Each entry that belonged to one of the seven activity categories (socializing, dining, leisure, shopping, studying, exercising, or working) was manually labeled from 1 to 7, respectively; all other entries were labeled 0. Next, we trained a classifier on the labeled data using the pre-trained BERT model and verified its classification accuracy. By adjusting relevant parameters and conducting multiple iterative experiments, we obtained a trained multi-class text classification model with an overall accuracy exceeding 87%. Subsequently, all Weibo entries were input into the trained classifier to identify and classify entries related to the various resident activities. Finally, an additional 5000 random samples were drawn from the seven activity categories and the data in each category were manually verified, yielding an average accuracy of 94.12% and thus validating the effectiveness of the classification model. Based on this work, we classified the activities in the obtained social media data to further study the characteristics of different types of resident activities.
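The labeling scheme and the two accuracy figures reported above can be illustrated with a small sketch. It does not reproduce the BERT training itself; it only shows how labels 0 to 7 might be encoded and how an overall accuracy and a per-category average could be computed from manually checked samples. The function names are hypothetical, and reading the 94.12% figure as the mean share of correct predictions within each predicted category is an assumption, since the text does not define the average precisely.

```python
from collections import defaultdict

# Label scheme described in the text: 0 = not an activity, 1-7 = activity types.
ACTIVITY_LABELS = {
    0: "other",
    1: "socializing",
    2: "dining",
    3: "leisure",
    4: "shopping",
    5: "studying",
    6: "exercising",
    7: "working",
}


def overall_accuracy(y_true, y_pred):
    """Share of posts whose predicted label equals the manual label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_category_accuracy(y_true, y_pred):
    """For each predicted activity category, the share confirmed by manual checking.

    Posts predicted as 0 are skipped because they are filtered out of the dataset.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if p == 0:
            continue
        totals[p] += 1
        hits[p] += int(t == p)
    return {ACTIVITY_LABELS[c]: hits[c] / totals[c] for c in sorted(totals)}


def average_accuracy(scores):
    """Unweighted mean over categories (assumed meaning of 'average accuracy')."""
    return sum(scores.values()) / len(scores)
```

For example, with manual labels `[1, 2, 2, 0, 3, 3]` and predictions `[1, 2, 3, 0, 3, 3]`, five of six posts match overall, while the leisure category (label 3) is confirmed for two of its three predicted posts.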

2.3.2. Extraction of Residential Locations Using the ST-DBSCAN Algorithm

The study of early clustering analysis methods began in the early 20th century and has since found extensive applications across various fields. Harold E. Driver and Alfred L. Kroeber, in their research during the 1930s, utilized statistical methods to analyze Polynesian cultural data, exploring how to measure similarities between cultures and how to classify cultures into different groups based on these similarities. In 1932, they published the book “Quantitative Expression of Cultural Relationships” [34], which introduced their clustering algorithm, marking the inception of the field of clustering analysis. While their work was primarily focused on anthropology, it laid the foundation for subsequent developments in clustering analysis. Following this, Robert Tryon’s 1939 monograph “Cluster Analysis” was one of the earliest works to introduce clustering methods [35]. In the mid-20th century, renowned psychologist Joseph Zubin brought clustering analysis into psychology and clinical medicine, using it to classify behavioral disorders. The development of modern clustering algorithms has further enriched this field. For instance, in 2022, Cambe et al. proposed an innovative clustering method for exploring the dynamics of research communities. This method utilized time series analysis tools to offer new perspectives on the evolution of research communities [36]. Similarly, in 2022, Lukauskas et al. introduced a new clustering method based on the inversion formula. This method identifies clustering structures within data through the inversion of mathematical formulas, offering high accuracy and robustness [37]. Despite the impressive performance of these modern clustering algorithms in their respective domains, they still face challenges when dealing with complex spatiotemporal data. The ST-DBSCAN (Spatial Temporal-DBSCAN) algorithm addresses this by considering both spatial and temporal attributes, making it more effective for handling such complex data. 
Specifically, ST-DBSCAN is suitable for applications that require simultaneous consideration of geographic location and time dimensions, such as identifying the residential locations of Weibo users in this study. Compared to purely spatial or temporal clustering methods, ST-DBSCAN more accurately reflects the spatiotemporal patterns of user behavior.
The ST-DBSCAN (Spatial Temporal-DBSCAN) algorithm is an extension of the DBSCAN algorithm, incorporating a temporal dimension into the clustering process. DBSCAN, proposed by Martin Ester et al., is a density-based clustering method [38]. In DBSCAN, the density around a point is determined by the number of points within a specified radius. Points with a density higher than a set threshold are grouped into clusters. DBSCAN can identify clusters of arbitrary shapes, such as linear, concave, and elliptical shapes. Additionally, unlike some other clustering algorithms, DBSCAN does not require the number of clusters to be specified in advance and has proven effective in handling very large databases [39,40]. ST-DBSCAN enhances DBSCAN by clustering spatiotemporal data based on their non-spatial, spatial, and temporal attributes. When dealing with clusters of varying densities, ST-DBSCAN addresses DBSCAN’s limitation of failing to detect certain noise points by assigning a density factor to each cluster. If the non-spatial values of neighboring objects differ slightly and the clusters are adjacent, the boundary objects in a cluster might have significantly different values from those on the opposite side of the boundary. ST-DBSCAN resolves this by comparing the cluster’s average values with new values [41].
In this study, the ST-DBSCAN algorithm was employed to cluster data spatially and temporally, identifying the residential locations of Weibo users through multiple layers of data filtering and processing. First, we filtered Weibo posts made within Areas of Interest (AOIs) corresponding to residential areas and extracted the corresponding users, along with all their posts. Next, we filtered for posts made during nighttime (from 8 p.m. to 4 a.m.), as posts made during this period are more likely to be from the user’s residence. We then applied the ST-DBSCAN algorithm to cluster the data both spatially and temporally, focusing on users who had only one cluster center. Users with multiple cluster centers were excluded, as it is difficult to determine which cluster represents the residential location. For users with a single cluster center, we assumed this center to be their residential location. Finally, we retrieved all Weibo posts from these users, which served as the data source for this study.
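The filtering and clustering steps above can be sketched as follows. This is a simplified illustration rather than the exact procedure used in the study: it keeps only the spatial and temporal neighborhood tests of ST-DBSCAN and omits the density factor and cluster-average checks. The thresholds (`eps_m`, `eps_s`, `min_pts`), the use of the absolute timestamp difference as the temporal distance, and the mean coordinate as the cluster center are all assumptions, because the text does not report them.

```python
import math
from datetime import datetime


def is_night(ts):
    """True for posts between 8 p.m. and 4 a.m., the window used in the text."""
    return ts.hour >= 20 or ts.hour < 4


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dl = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * 6371000.0 * math.asin(math.sqrt(a))


def st_dbscan(points, eps_m, eps_s, min_pts):
    """Cluster (lat, lon, datetime) tuples; returns labels (-1 = noise)."""
    n = len(points)

    def neighbors(i):
        lat, lon, t = points[i]
        return [j for j in range(n)
                if abs((points[j][2] - t).total_seconds()) <= eps_s
                and haversine_m(lat, lon, points[j][0], points[j][1]) <= eps_m]

    labels = [None] * n  # None = not yet visited
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nb = neighbors(i)
        if len(nb) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: join, but do not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbj = neighbors(j)
            if len(nbj) >= min_pts:  # core point: keep expanding the cluster
                queue.extend(nbj)
    return labels


def infer_home(posts, eps_m=200.0, eps_s=7 * 24 * 3600, min_pts=3):
    """Return (lat, lon) of the single night-time cluster, or None otherwise."""
    night = [p for p in posts if is_night(p[2])]
    labels = st_dbscan(night, eps_m, eps_s, min_pts)
    if len({l for l in labels if l != -1}) != 1:
        return None  # zero or several clusters: residence is ambiguous
    members = [p for p, l in zip(night, labels) if l != -1]
    return (sum(p[0] for p in members) / len(members),
            sum(p[1] for p in members) / len(members))
```

A user with three late-evening posts within a few metres of one another on consecutive days would receive a single cluster and hence an inferred residence, whereas a user with two separate night-time clusters would be excluded, mirroring the rule described above.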

2.3.3. Semantic Analysis of Weibo Text Using the TF-IDF Algorithm

The TF-IDF (Term Frequency-Inverse Document Frequency) algorithm is composed of two main components: term frequency (TF) and inverse document frequency (IDF). Term frequency refers to the number of times a particular term appears in a specific document, while inverse document frequency measures the importance of the term across the entire document collection. In 1971, Gerard Salton, a professor at Cornell University, published “The SMART Retrieval System—Experiments in Automatic Document Processing,” which introduced the concept of converting query keywords and documents into “vectors” and assigning different values to the elements within these vectors [42]. The SMART retrieval system described in this paper, particularly its discussion of TF-IDF and its variants, has since become a crucial reference for many industrial-grade systems. In 1972, British computer scientist Karen Spärck Jones elaborated on the application of IDF [43]. She later discussed the combination of TF and IDF [44].
The TF-IDF algorithm is a widely used weighting technique in text classification. It evaluates the importance of words in a document by considering their frequency within the document and their rarity across a corpus. A term that appears frequently in a specific document but rarely in the entire document set is considered highly distinctive and useful for categorization [45]. The process begins with tokenizing the training and testing corpora. The TF-IDF algorithm is then applied to calculate the weights of each term, extracting feature sets accordingly. After identifying the feature words, a custom dictionary and stop-word list are used to filter out prepositions, symbols, and other non-essential elements to ensure the extraction of accurate semantic information. Finally, terms are ranked based on their frequency, in descending order. The calculation formula is as follows:
$$\mathrm{TFIDF}(t, d) = \mathrm{TF}(t, d) \times \mathrm{IDF}(t)$$
where TF(t,d) represents the frequency of term t in document d, while IDF(t) denotes the inverse document frequency of term t across the entire document collection. The definitions are as follows:
$$\mathrm{IDF}(t) = \log\left(\frac{N}{N(t) + 1}\right)$$
where N is the total number of documents in the collection, and N(t) is the number of documents containing the term t.
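As a minimal sketch of the weighting defined above, the following code computes TF-IDF with TF taken as the raw count of a term in a document and IDF as log(N / (N(t) + 1)). The input is assumed to be already tokenized (for example, by a word segmenter); the function names and the sample tokens are illustrative only.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    # IDF(t) = log(N / (N(t) + 1)), as defined in the text.
    idf = {t: math.log(n_docs / (df + 1)) for t, df in doc_freq.items()}
    weights = []
    for tokens in docs:
        tf = Counter(tokens)  # TF(t, d) = raw count of t in d
        weights.append({t: count * idf[t] for t, count in tf.items()})
    return weights


def top_terms(doc_weights, k):
    """The k highest-weighted terms of one document, in descending order."""
    return sorted(doc_weights.items(), key=lambda kv: kv[1], reverse=True)[:k]


docs = [
    ["hotpot", "tasty", "hotpot"],
    ["gym", "running"],
    ["hotpot", "movie"],
    ["movie", "weekend"],
]
weights = tf_idf(docs)
```

Note that with this smoothed form of IDF, a term that appears in every document receives a slightly negative weight, and a term that appears in all but one document receives a weight of zero, so very common terms are effectively suppressed.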
In this study, the Jieba library was employed to segment the Weibo text content. After tokenizing the content, the text was converted into a sequence of words. The TF-IDF algorithm was then used to mine the textual information from social media data. We categorized the Weibo content posted by residents of the four large communities—Tiantongyuan, Huilongguan, Wangjing, and Shangdi—according to activity types. Subsequently, we conducted a word frequency analysis on the Weibo text and created word clouds to visualize the results.

3. Results and Analysis


3.1. Overall Characteristics of Resident Activities

(1)
Predominance of Dining and Leisure Activities
Based on social media data posted by users at various locations, we applied the ST-DBSCAN algorithm for spatiotemporal clustering to identify the users’ residential locations. Using these locations as anchor points, we visualized the various types of activities performed by residents (Figure 3). A resident’s activities, as posted on the Weibo platform, are centered around their residence and radiate outward, encompassing various types of activities. Figure 3 illustrates the total activities originating from each community in 2019, showing the flow, distance, and quantity of different types of activities.
The number of resident activities varies significantly across communities, as shown in Table 1. Huilongguan, with its large area and dense population, has the highest number of resident activities. In contrast, Shangdi, being a mixed-use community with a shorter development history, has fewer resident activities. To facilitate the comparison of different types of activities within each community, we converted the number of activities in each area into percentages, with the total activity proportion for each area being 1. The results are presented in Table 2.
In every community except Shangdi, leisure activities account for the largest proportion, followed by dining activities. In Shangdi, the proportions of dining and studying activities are nearly equal. Overall, dining and leisure activities dominate the daily behaviors of residents. Socializing, shopping, and working activities have relatively low proportions, generally not exceeding 6%. The proportions of studying and exercising activities are higher in Shangdi than in the other communities, which show little variation in these activities.
The radius of gyration of resident activities, an indicator of the range of residents’ activities, is shown in Figure 4, which presents the average radius of gyration for different types of activities in each community.
(2)
Activity Range by Activity Type
When comparing the median radius of gyration for different types of activities across the four communities, socializing activities typically have the largest activity range, while studying activities have the smallest.
(3)
Community-Level Comparison
A horizontal comparison of the radius of gyration for various activities within the same community reveals that Wangjing has the smallest range difference, with activity radii between 10.5 and 12 km. Tiantongyuan and Shangdi show relatively larger differences, with variations around 5 km in the radius of gyration for different activities.
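The radius of gyration used in this subsection can be illustrated with a short sketch. The text does not state the formula, so the common definition is assumed here: the root mean square distance of activity locations from a reference point, r_g = sqrt((1/n) Σ d(p_i, c)²), where c is either the centroid of the locations or, as an alternative assumption, the resident’s home. The function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def radius_of_gyration_km(points, center=None):
    """Root mean square distance of points from center (default: their centroid)."""
    if not points:
        raise ValueError("at least one activity location is required")
    if center is None:
        center = (sum(p[0] for p in points) / len(points),
                  sum(p[1] for p in points) / len(points))
    mean_sq = sum(haversine_km(p, center) ** 2 for p in points) / len(points)
    return math.sqrt(mean_sq)


def radius_by_activity(records, home):
    """records: iterable of (activity, (lat, lon)). Returns {activity: r_g in km},
    measured around the home location (an assumed choice of reference point)."""
    grouped = {}
    for activity, loc in records:
        grouped.setdefault(activity, []).append(loc)
    return {a: radius_of_gyration_km(locs, center=home) for a, locs in grouped.items()}
```

Computing the value per activity type and per community in this way would give the kind of quantities compared above, for example a larger radius for socializing than for studying.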

3.2. Comparative Analysis of Spatiotemporal Characteristics of Resident Activities

3.2.1. Spatial Distribution Differences of Various Activity Types

By performing kernel density analysis on the residents’ activity locations for different types of activities in the four communities, we can accurately describe the spatial distribution characteristics of these activities.
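Kernel density analysis estimates a smooth intensity surface from point locations. The sketch below uses a two-dimensional Gaussian kernel on planar coordinates (for example, kilometres in a projected coordinate system). The kernel type and the bandwidth are assumptions, since the text does not specify them, and GIS software often uses a different kernel.

```python
import math


def gaussian_kde_2d(points, bandwidth):
    """Return a function density(x, y) estimated from planar (x, y) points.

    Uses an isotropic Gaussian kernel with the given bandwidth (same units
    as the coordinates); the estimate integrates to 1 over the plane.
    """
    if not points:
        raise ValueError("at least one point is required")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))

    def density(x, y):
        total = 0.0
        for px, py in points:
            d2 = (x - px) ** 2 + (y - py) ** 2
            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        return norm * total

    return density


def density_grid(points, bandwidth, xs, ys):
    """Evaluate the density on a grid; rows follow ys and columns follow xs."""
    f = gaussian_kde_2d(points, bandwidth)
    return [[f(x, y) for x in xs] for y in ys]
```

Evaluating such a surface separately for each activity type and community yields maps in which concentrated activities appear as a few sharp peaks and dispersed activities as a flatter surface.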
Due to the differing nature of large residential communities and large mixed residential–commercial communities, there are notable similarities and differences in resident activity behaviors worth exploring.
For all communities, the spatial distribution patterns of dining and leisure activities are consistent. Although socializing activities do not exhibit as clear a consistency as dining and leisure activities, they do show some spatial similarities. The locations for shopping and working activities are spatially correlated, with shopping locations generally being more concentrated than working locations. In large mixed residential–commercial communities, the spatial distribution of shopping and working locations is more concentrated compared to purely residential communities.
However, unlike in the other case communities, the spatial distributions of studying and exercising activities in Wangjing are notably more concentrated. This distinctive characteristic merits further investigation.
The spatial distribution characteristics of activities in large residential communities are as follows:
(1)
Spatial Distribution and Overlap of Activities
The spatial distribution of dining and leisure activities among residents of large residential communities is dispersed, with significant overlap in activity locations. Conversely, while studying and exercising activities are also dispersed, they exhibit less overlap in activity locations.
The kernel density maps of seven types of activities (socializing, dining, leisure, shopping, studying, exercising, and working) for Tiantongyuan residents are shown in Figure 5. The spatial distributions of socializing, dining, and leisure activities are highly similar, indicating that residents frequently visit the same locations for these activities. Shopping activities are the most concentrated, while working activities show a relatively concentrated distribution. Studying and exercising activities, however, are more dispersed.
In Huilongguan, which covers a large area with a dense population, the dining and leisure activities of residents have similar spatial distributions. Studying and exercising activities are more dispersed, and socializing activities exhibit a roughly uniform spatial distribution. Shopping and working activities are relatively concentrated, with shopping activities forming clusters. Although working activities are more numerous and widely distributed than shopping activities, there is a degree of spatial overlap between the locations for these two types of activities.
(2)
Consistency and Clustering in Activity Locations
The spatial distributions of socializing, dining, and leisure activities for Tiantongyuan residents are highly consistent, while Huilongguan residents’ shopping and working activities exhibit a clustered spatial pattern.
Both Tiantongyuan and Huilongguan are large residential communities, and there are several similarities in the spatial distribution of resident activities. The spatial distribution of dining and leisure activities is notably similar, with socializing and shopping activities being relatively concentrated. Studying and exercising activities are more dispersed, and there is some spatial similarity between the distributions of shopping and working activities. However, despite the similar community type, there are distinct differences in the spatial distribution of resident activities.
In Tiantongyuan, the spatial distribution of socializing activities is highly consistent with dining and leisure activities. In contrast, in Huilongguan, socializing activities do not show significant spatial consistency with dining and leisure activities. Additionally, the locations for shopping and working activities in Huilongguan exhibit a clustered spatial pattern, whereas in Tiantongyuan, these activities are more uniformly dispersed.
The spatial distribution characteristics of activities in large mixed residential–commercial communities are as follows:
In large mixed residential–commercial communities, there is a notable commonality and high concentration in the spatial distribution of shopping and working activities.
For Wangjing residents, the spatial distribution patterns of socializing, dining, and leisure activities are similar and relatively dispersed. Shopping, studying, exercising, and working activities, however, are more concentrated, with shopping activities being the most concentrated.
For Shangdi residents, the spatial distribution of dining and leisure activities is similar and relatively dispersed. The distribution of studying and exercising activities is also similar and dispersed. In contrast, socializing, shopping, and working activities are concentrated, with shopping and working activities showing similar spatial distribution patterns.
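The concentrated versus dispersed patterns described above are read from kernel density maps. As a minimal sketch of that idea, assuming a Gaussian kernel, projected coordinates in kilometres, and an arbitrarily chosen bandwidth (none of which are specified in this section), the following computes a density grid and a simple concentration indicator. The function names, the indicator, and the toy points are hypothetical and for illustration only.

```python
import math

def gaussian_kde_grid(points, bandwidth, xs, ys):
    """Return a grid (list of rows) of Gaussian kernel density estimates.

    points    -- list of (x, y) activity locations in projected units (e.g. km)
    bandwidth -- kernel bandwidth in the same units (an assumed value)
    xs, ys    -- coordinates of the grid columns and rows
    """
    n = len(points)
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth ** 2)
    grid = []
    for y in ys:
        row = []
        for x in xs:
            total = 0.0
            for px, py in points:
                d2 = (x - px) ** 2 + (y - py) ** 2
                total += math.exp(-d2 / (2.0 * bandwidth ** 2))
            row.append(norm * total)
        grid.append(row)
    return grid

def concentration_share(grid, top_fraction=0.1):
    """Share of the total density held by the densest `top_fraction` of cells."""
    cells = sorted((v for row in grid for v in row), reverse=True)
    k = max(1, int(len(cells) * top_fraction))
    total = sum(cells)
    return sum(cells[:k]) / total if total > 0 else 0.0

# Toy data (invented for illustration): a tight cluster vs. a spread-out set.
shopping = [(5.0, 5.0), (5.1, 4.9), (4.9, 5.1), (5.0, 5.2)]
exercising = [(1.0, 1.0), (9.0, 2.0), (3.0, 8.0), (8.0, 8.0)]
axis = [i * 0.5 for i in range(21)]  # grid from 0 to 10 km in 0.5 km steps

shopping_share = concentration_share(gaussian_kde_grid(shopping, 0.5, axis, axis))
exercising_share = concentration_share(gaussian_kde_grid(exercising, 0.5, axis, axis))
```

A higher share means more of the density sits in a few cells, which is one possible way to put a number on "more concentrated"; in practice a GIS package or a library KDE routine would likely replace the explicit loops.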

3.2.2. Temporal Characteristics of Resident Activities

To analyze the temporal variations in resident activities, this study divides the activities into weekdays and weekends. Figure 6 illustrates the temporal changes in activities for community residents during different time periods on weekdays and weekends. To facilitate the identification of temporal variation characteristics, the Y-axis for different activities in different communities is adjusted based on their quantities.
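The weekday and weekend split described above could be sketched as follows. This is only an illustration: it assumes each classified post carries a timestamp string and an activity label, uses the calendar month as the time period, and rescales each series by its own maximum to mimic per-panel Y-axes. The record format, function names, and dates are hypothetical.

```python
from collections import Counter
from datetime import datetime

def count_by_period(records):
    """Count activities per (day type, month, activity type).

    records -- iterable of (timestamp 'YYYY-MM-DD HH:MM', activity type)
    """
    counts = Counter()
    for stamp, activity in records:
        dt = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        # datetime.weekday(): Monday is 0, Sunday is 6
        day_type = "weekend" if dt.weekday() >= 5 else "weekday"
        counts[(day_type, dt.month, activity)] += 1
    return counts

def rescale(series):
    """Scale a series to [0, 1] by its own maximum (one axis per panel)."""
    peak = max(series.values(), default=0)
    return {key: (value / peak if peak else 0.0) for key, value in series.items()}

# Toy records (invented): 2019-05-01 was a Wednesday; 2019-05-04 and 2019-10-05 were Saturdays.
records = [
    ("2019-05-01 12:30", "dining"),
    ("2019-05-04 19:00", "dining"),
    ("2019-05-04 20:15", "leisure"),
    ("2019-10-05 14:00", "leisure"),
]
counts = count_by_period(records)
```

Note that this split treats public holidays that fall on weekdays as weekdays; whether the study handles holidays separately is not stated here.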
The activities of residents in different communities show minor temporal differences, indicating that resident activities are temporally uniform across different communities. The differences in resident activities are primarily influenced by the functional differences of the areas, as reflected in the spatial dimension. The temporal characteristics of resident activities are as follows:
(1)
Based on the frequency of residents’ activities, these activities can be categorized into three tiers. Dining and leisure activities have the highest frequency and show significant temporal variation. Studying and exercising activities have moderate frequencies with minor fluctuations over time. Socializing, shopping, and working activities occur less frequently and exhibit relatively stable temporal patterns.
(2)
In all four communities, dining and leisure activities predominate and align temporally with holidays. Dining and leisure activities are the most frequent on both weekdays and weekends. Owing to their time-specific nature, leisure activities predominate on weekends. Moreover, dining and leisure activities peak around May and October, coinciding with the “May Day” and “National Day” holidays.
(3)
The temporal variations in resident activities also exhibit community-specific characteristics. On weekdays, working activities in Huilongguan peak in July. In contrast, the Shangdi community experiences significant fluctuations in activity times, with a notably higher frequency of studying and exercising activities, distinguishing it from other communities. On weekends, dining activities among Tiantongyuan residents show a declining trend in November, contrary to the rising trend observed in other communities.

3.3. Semantic Analysis of Residents’ Activity Weibo Posts

To delve deeper into the OD flows of residents’ activities and analyze the semantic information underlying these activities, this study performs a semantic analysis of residents’ activities. Based on the word frequency statistics, the results are organized into Table 3, and corresponding word cloud diagrams are generated.
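To illustrate how distinctive words per community and activity type might be ranked, the sketch below computes TF-IDF scores over pre-tokenized posts. It assumes the unsmoothed variant tf × log(N/df), treats each community and activity group as one document, and uses invented tokens; real Weibo text would first need Chinese word segmentation. The function names and group labels are hypothetical.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Compute TF-IDF scores per document.

    documents -- dict mapping a group label to a list of tokens
    Returns a dict: label -> {term: score}.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents.values():
        doc_freq.update(set(tokens))  # count each term once per document
    scores = {}
    for label, tokens in documents.items():
        if not tokens:
            scores[label] = {}
            continue
        term_counts = Counter(tokens)
        scores[label] = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        }
    return scores

def top_terms(score_map, k=3):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    ranked = sorted(score_map.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

# Toy, pre-tokenized posts (invented for illustration).
docs = {
    "Tiantongyuan/exercising": ["fitness", "running", "forest park", "fitness"],
    "Wangjing/exercising": ["fitness", "golf", "golf", "running"],
    "Shangdi/exercising": ["fitness", "gym", "legs", "abs"],
}
scores = tf_idf(docs)
```

With this weighting, a word that appears in every group (here "fitness") scores zero, so terms specific to one community rise to the top, which is the property that makes TF-IDF a useful complement to raw word frequency.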
(1)
Strong Correlation Between Community Residents’ Activity Types and Surrounding Built Environment.
The built environment surrounding a community shows a strong correlation with the types and locations of activities performed by community residents. The activity choices of community residents are consistent with the characteristics of the residents themselves, and each community has its unique traits. For instance, in terms of fitness activities, residents of Tiantongyuan and Huilongguan prefer nearby and low-cost fitness venues such as forest parks. In contrast, Wangjing, with its numerous golf courses, sees a higher frequency of the term “golf” in fitness-related Weibo posts compared to other communities. Residents of Shangdi often mention Beijing Sport University in their fitness posts, reflecting the prevalence of gyms in the Software Park where many residents work. These posts are more detailed and specific, mentioning workouts targeting legs, abs, shoulders, etc.
(2)
The diversity and differences in residents’ daily activities are significantly influenced by the comprehensive characteristics of their communities and are strongly associated with the attributes of the residents.
Figure 7 shows some typical cases. Tiantongyuan residents show a high enthusiasm for star-chasing activities. In shopping activities, the names of celebrities such as “李现” (Li Xian, a Chinese celebrity) and “周震南” (Zhou Zhennan, a Chinese celebrity) are common, indicating that Tiantongyuan residents consider celebrity endorsements in their shopping choices. This also reflects the relatively younger demographic of this community.
Wangjing residents’ social activities are closely linked to work. Wangjing features the iconic commercial CBD building Wangjing SOHO, where work activities are prominent. Unlike other communities where social interactions are mostly with friends, Wangjing residents frequently socialize with colleagues. Social activities here are less about dining and more about team-building events.
For residents of the Huilongguan and Shangdi communities, proximity to the Zhongguancun Software Park significantly influences their work activities. Many residents are employed within the software park, which is reflected in the high-frequency words found in their work-related activities. In addition to common terms like “工作” (work) and “加班” (overtime), keywords such as “中关村” (Zhongguancun, a high-tech industrial development zone), “软件园” (software park), and “代码” (code) frequently appear, indicating a strong association with their professional environment.

4. Discussion

This study utilizes social media data rich in semantic information to address the limitations of traditional OD data, which typically lack semantic context. By employing the ST-DBSCAN algorithm to identify users’ residential locations, we address the challenge of missing origin information in OD flow analysis when social media data are used as a source. This approach extends research on OD data mining based on social media data. By classifying residents’ activities using the BERT model, we explore the basic characteristics and inter-group differences in daily activities across different communities. Through semantic analysis, this study examines the characteristics and differences in the content of residents’ activities, thereby identifying factors that influence behavioral differences among residents of different communities. This enriches the methods for measuring residents’ spatiotemporal behavior and enhances our understanding of differentiated behavioral activities among residents in different types of communities. The findings of this research are valuable for understanding the differentiated spatial and facility needs based on residents’ behaviors, thereby aiding in the improvement of urban public service facility planning.
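The residence identification step can be pictured with a simplified density-based clustering over space and time. The sketch below is not the study's implementation: it keeps only the two neighbourhood thresholds of ST-DBSCAN (spatial and temporal) plus a minimum point count, omits the algorithm's additional cluster-average check, and assumes, purely for illustration, that the centroid of the largest cluster stands in for the residence. Coordinates, time values, and parameter values are invented.

```python
import math

def st_dbscan(points, eps_space, eps_time, min_pts):
    """Simplified ST-DBSCAN: return one cluster id per point (-1 = noise).

    points -- list of (x, y, t): planar coordinates and a time value
    A neighbour must lie within eps_space in space AND eps_time in time.
    """
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        xi, yi, ti = points[i]
        return [
            j for j, (xj, yj, tj) in enumerate(points)
            if math.hypot(xi - xj, yi - yj) <= eps_space and abs(ti - tj) <= eps_time
        ]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)  # includes i itself
        if len(seeds) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            more = neighbours(j)
            if len(more) >= min_pts:  # j is a core point: keep expanding
                queue.extend(more)
    return labels

def largest_cluster_centroid(points, labels):
    """Centroid of the most populated cluster, or None if every point is noise."""
    ids = [label for label in labels if label >= 0]
    if not ids:
        return None
    best = max(set(ids), key=ids.count)
    members = [p for p, label in zip(points, labels) if label == best]
    return (sum(p[0] for p in members) / len(members),
            sum(p[1] for p in members) / len(members))

# Toy posts (invented): four evening posts near one spot, two isolated daytime posts.
posts = [
    (0.0, 0.0, 21.0), (0.1, 0.0, 22.0), (0.0, 0.1, 22.5), (0.1, 0.1, 23.0),
    (5.0, 5.0, 10.0), (9.0, 1.0, 14.0),
]
labels = st_dbscan(posts, eps_space=0.5, eps_time=2.0, min_pts=3)
home = largest_cluster_centroid(posts, labels)
```

The pairwise neighbour search is quadratic in the number of posts, which is acceptable for a single user's posts but would call for a spatial index on larger inputs.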

4.1. The Reasons for Differences in Community Residents’ Activities

4.1.1. Differences in Community Positioning and Built Environment

The planning and positioning of different communities lead to variations in the activities of their residents. Tiantongyuan and Huilongguan have been designated by the Beijing municipal government as key areas for the construction of affordable housing. Both communities primarily serve residential functions, with significant separation between work and residence. Despite multiple subway lines and bus routes, transport capacity cannot meet demand, and traffic congestion remains severe. Prominent issues include long commuting distances and times, work locations dispersed throughout the main urban area, community management difficulties due to population density, and relatively poor living environments. Affordable housing is a type of guaranteed housing that is relatively far from the city center and has lower housing prices, mainly targeting urban low- and middle-income families. This leads to a development direction for large residential communities that is distinctly different from that of mixed-use communities. Therefore, apart from social activities that require larger spaces and more options, possibly necessitating travel to farther locations, the radius of daily activities for Tiantongyuan and Huilongguan residents is relatively small. It is inferred that most residents, due to the long commuting distances, prefer to engage in other activities within the community or nearby after returning home.
Wangjing, as one of the earliest planned areas in Beijing, has comprehensive supporting facilities, forming a business core area that drives the overall development of the region. It is also one of the few large communities in Beijing that balances industry and residence. In terms of transportation, Wangjing is surrounded by major roads including the Fourth Ring Road, Fifth Ring Road, Airport Expressway, and Jingcheng Expressway, providing convenient transportation. Beijing’s Twelfth Five-Year Plan clearly states that Wangjing is Beijing’s second CBD, integrating business, residential, and entertainment functions. Due to the concentration of business and entertainment facilities and convenient transportation in Wangjing, residents have a wide range of choices for most activities in nearby areas, resulting in a relatively moderate average activity radius.
Shangdi was developed as China’s first high-tech industrial base, approved by the National Science Development Committee and Beijing Municipal Government in 1991, and its primary task is information industry development rather than residential use. In its initial planning, Shangdi reserved a certain proportion of residential areas to avoid the issue of job-housing separation and paid attention to the diversity of functional zones, including concentrated commercial areas aside from corporate offices and residences. However, the actual situation reveals a significant insufficiency in residential resources and commercial supply within the area. Compared to Wangjing, Shangdi still requires improved supporting facilities. Consequently, despite having certain commercial facilities, the concentration of high-tech industries and office areas necessitates that residents travel to other areas for daily activities, leading to a larger radius of various types of daily activities for Shangdi residents.
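The activity radius compared across communities here corresponds to the radius of gyration reported in Figure 4. A minimal sketch of one common definition follows: the root mean square distance of a resident's activity locations from a center point. It assumes great-circle (haversine) distances and uses the mean coordinate as the default center; whether the study measures from the centroid or from the identified residence is not stated in this section, so the `center` argument is left configurable. All names and values are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def radius_of_gyration_km(locations, center=None):
    """Root mean square distance (km) of locations from a center point.

    locations -- non-empty list of (lat, lon) activity locations in degrees
    center    -- reference point; defaults to the mean coordinate, which is
                 a reasonable approximation only at city scale
    """
    if not locations:
        raise ValueError("at least one location is required")
    if center is None:
        center = (sum(p[0] for p in locations) / len(locations),
                  sum(p[1] for p in locations) / len(locations))
    mean_sq = sum(haversine_km(p, center) ** 2 for p in locations) / len(locations)
    return math.sqrt(mean_sq)

# Two points one degree of longitude apart on the equator: each lies half a
# degree (about 55.6 km) from the mean coordinate.
rg = radius_of_gyration_km([(0.0, 0.0), (0.0, 1.0)])
```

Passing the identified residence as `center` instead would measure how far activities spread from home, which may be closer to the "radius of daily activities" discussed above.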

4.1.2. Differences in Community Resident Attributes

The planning and built environment of a community directly influence its housing prices, which in turn lead to significant differences in the attributes of its residents. Tiantongyuan and Huilongguan are both large residential communities designated as key areas for affordable housing construction in Beijing. These communities have abundant housing resources and relatively low prices, targeting low- to middle-income groups. Comparatively, more young people choose Tiantongyuan to alleviate living pressures, while Huilongguan benefits from its proximity to high-tech industries in areas such as Shangdi, Xierqi, and Zhongguancun, making it more attractive to residents in these industries. As a result, the overall demographic structure of Huilongguan is more stable than that of Tiantongyuan. Although both are large residential communities, Tiantongyuan primarily consists of low- to middle-income families, with a larger proportion of young people whose daily needs are concentrated within the community, resulting in a smaller activity radius. In contrast, Huilongguan, influenced by the high-tech industrial zone, includes many high-tech professionals whose higher living standards and more dispersed workplaces lead to a larger daily activity radius compared to Tiantongyuan.
Wangjing is a large community primarily developed with commercial housing. Due to its early and mature housing market development, it includes a variety of housing types such as villas, apartments, standard residences, and affordable housing, catering to all social strata. However, its advantageous location, developed transportation network, and comprehensive supporting facilities result in high housing prices, attracting many mid- to high-level corporate managers. Therefore, residents of Wangjing generally belong to the middle- to high-income bracket with better economic conditions. Additionally, the mixed-use nature of Wangjing and its transportation convenience result in reasonable commuting times and distances, allowing residents sufficient economic means and time to meet high-quality living needs, thus having a wider range of activities.
Shangdi, as an information industry base, is a comprehensive high-tech industrial zone dominated by the electronic information industry, primarily targeting professionals in related fields. The varying housing prices across these large residential areas reflect significant economic disparities among residents, leading to distinct resident attributes in different regions. Shangdi residents are mostly high-tech professionals with high demands for quality of life and work-related social activities. They may need to travel farther for high-quality daily activities, resulting in a more dispersed daily activity distribution and larger activity distances.

4.2. Policy Implications

Firstly, improve the supporting facilities of large residential communities. For large residential communities like Tiantongyuan and Huilongguan, the government should further enhance commercial and service facilities within these communities, particularly in areas such as dining, leisure, shopping, and fitness, to reduce the need for residents to travel long distances to meet their daily needs. Additionally, the public transportation system should be strengthened. Despite the presence of multiple subway lines and bus routes, public transportation services need to be optimized and increased, especially during peak hours, to alleviate traffic congestion, shorten commute times, and improve residents’ quality of life.
Secondly, optimize resource allocation in mixed-use communities to improve the balance between work and residence. In mixed-use communities like Wangjing and Shangdi, further optimization of the balance between work and residence is necessary. Increasing the proportion of residential and commercial service land can allow residents to transition between work and life over shorter distances. Providing diverse, high-quality commercial and entertainment services can meet the varied needs of residents.
Thirdly, implement differentiated planning based on the characteristics of different communities. The government should plan according to the unique features of each community. For example, Tiantongyuan and Huilongguan can focus on developing affordable entertainment and leisure facilities. Residents in these areas tend to choose forest parks for fitness activities; hence, protecting and expanding these natural resources can provide more low-cost, nearby fitness and recreational venues. In contrast, Wangjing and Shangdi can introduce more high-end business and cultural entertainment facilities.
Finally, strengthening community interaction and social activities is essential. Increasing social activities and public spaces within communities can promote interaction and communication among residents. As shown by the research results, residents of Tiantongyuan are enthusiastic about celebrity-related activities, while Wangjing community residents primarily engage in social activities with colleagues, placing a higher emphasis on team-building events. Therefore, in communities with a larger proportion of young people, such as Tiantongyuan and Huilongguan, organizing celebrity meet-and-greets and community events can enhance community cohesion. In mixed-use communities like Wangjing, encouraging businesses to cooperate with the community in organizing diverse team-building activities can improve employee job satisfaction and overall happiness.

5. Conclusions

Using four large communities as examples, this study primarily investigates the spatiotemporal characteristics and differences in the daily activities of residents in large communities in Beijing. Through semantic analysis, the similarities and differences in resident activities among different communities are explored, enhancing the understanding of the activity spaces of Beijing residents. The main findings of this study are as follows:
(1)
In the spatial dimension, residents’ daily activities consist primarily of dining and leisure activities. These activities are centered around residential areas and radiate towards the northern part of the central urban area. Additionally, there is spatial overlap between residents’ shopping and working locations. Based on the type of residential community, mixed-use large communities exhibit more concentrated spatial distributions of shopping and working locations compared to purely residential large communities.
(2)
In the temporal dimension, resident activities exhibit a notable uniformity, largely unaffected by community type or resident attributes, resulting in minimal differences between different communities. The temporal variations in resident activities within the same type of community show significant similarities based on the nature of the community. While there are substantial monthly variations in the quantity of resident activities, the periods of highest activity intensity correspond with major holidays.
(3)
In the semantic dimension, firstly, the types of activities that community residents engage in and their choice of locations are closely related to the surrounding built environment. For example, in the case of fitness activities, residents of Tiantongyuan and Huilongguan tend to choose nearby and cost-effective options such as forest parks. In contrast, Wangjing, with its numerous golf courses, sees a higher frequency of golf-related mentions in fitness activities. Secondly, the diversity and variation in residents’ daily activities are influenced by the comprehensive characteristics of their communities. Residents of Tiantongyuan exhibit a strong enthusiasm for fan activities, showing a notable interest in celebrity-endorsed products during social and shopping activities, which reflects their relatively younger demographic. In contrast, residents of Wangjing are closely linked to their workplaces, with social activities often centered around Wangjing SOHO and involving colleagues, emphasizing team-building activities.
However, due to the unequal number of samples between communities and the limited scope of the study to only four communities, it is challenging to perform attribution analysis using statistical methods. Future studies should consider increasing the sample size for further research. This study explores the activity characteristics of residents in large Beijing communities from temporal, spatial, and semantic perspectives. Nevertheless, with the residence as the anchor point, the spatial distribution of different types of activities by each resident should be radial. This study only visualizes the one-origin-to-multiple-destinations behavior pattern of residents without conducting an in-depth analysis.
In future research, effective evaluation metrics for resident activity patterns should be established to enhance the understanding of spatial differences in resident activities. Additionally, due to the age bias in social media data, with relatively fewer data points for middle-aged and elderly groups, the analysis of daily activities for these groups is limited. Future studies may need to supplement research data with traditional survey questionnaires and other auxiliary survey methods.

Author Contributions

Conceptualization, Z.O. and B.W.; methodology, B.M.; software, Z.O. and C.S.; validation, C.S., B.W. and D.Z.; formal analysis, Z.O.; investigation, Z.O.; resources, B.M.; data curation, C.S., Z.O. and B.W.; writing—original draft preparation, Z.O.; writing review and editing, B.W.; visualization, Z.O. and B.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Academic Research Projects of Beijing Union University (No. ZKZD202305) and sponsored by the team-building subsidy of the “Xuezhi Professorship” of the College of Applied Arts and Science of Beijing Union University (BUUCAS-XZJSTD-2024005).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Yu, M. Beijing Seventh National Population Census Bulletin (No. 3); Beijing Statistics: Beijing, China, 2021. Available online: http://www.beijing.gov.cn/gongkai/shuju/sjjd/202105/t20210519_2392888.html (accessed on 20 December 2023).
  2. Zhu, Y. Beijing Urban Master Plan (2016–2035); Beijing Daily: Beijing, China, 2017. Available online: http://www.gov.cn/xinwen/2017-09/30/content_5228705.html (accessed on 5 April 2023).
  3. Planning Department. The Fourteenth Five-Year Plan for National Economic and Social Development and the Long-Term Goals for 2035 of Beijing; National Development and Reform Commission of the People’s Republic of China: Beijing, China, 2021. Available online: https://www.ndrc.gov.cn/fggz/fzzlgh/dffzgh/202103/t20210331_1271321.html?code=&state=123 (accessed on 6 June 2023).
  4. Weijing, Z.; De, W. Urban space study based on the temporal characteristics of residents’ behavior. Prog. Geogr. 2018, 37, 1106–1118. [Google Scholar]
  5. Clark, C. Urban population densities. J. R. Stat. Soc. Ser. A 1951, 114, 490–496. [Google Scholar] [CrossRef]
  6. Gu, J.; Qi, L.; Zhou, S.; Yan, X. Origins and review of urban time-space structure studies. World Reg. Stud. 2016, 25, 69–79. [Google Scholar]
  7. Bin, M.; Huili, Y.; Limin, Z. A comparative study on the commuting behavior of residents in large residential areas in Beijing—Take Wangjing and Tiantongyuan residential area as examples. Geogr. Res. 2012, 31, 2069–2079. [Google Scholar]
  8. Rui, X.; Tinghua, A.; Wei, Y.; Tao, F. Spatial Voronoi partitioning algorithm and OD flow visualization analysis considering the distribution density of taxi OD points. J. Geo-Inf. Sci. 2015, 17, 1187–1195. [Google Scholar]
  9. Le, T.; Binjie, C.; Zhiguang, Z. Overview of visual analysis of OD data. J. Comput. Aided Des. Comput. Graph. 2021, 33, 1160–1171. [Google Scholar]
  10. Jiansu, P.; Huamin, J.; Mingxuan, N. Visualization of movement trajectory data. J. Comput. Aided Des. Comput. Graph. 2012, 24, 1273–1282. [Google Scholar]
  11. Tao, F.; Wu, J.; Lin, S.; Lv, Y.; Wang, Y.; Zhou, T. Revealing the impact of COVID-19 on urban residential travel structure based on floating Car trajectory data: A case study of nantong, China. ISPRS Int. J. Geo-Inf. 2023, 12, 55. [Google Scholar] [CrossRef]
  12. Guo, X.; Xu, Z.; Zhang, J.; Lu, J.; Zhang, H. An OD flow clustering method based on vector constraints: A case study for Beijing taxi origin-destination data. ISPRS Int. J. Geo-Inf. 2020, 9, 128. [Google Scholar] [CrossRef]
  13. Zhang, Y.; Sun, K.; Wen, D.; Chen, D.; Lv, H.; Zhang, Q. Deep Learning for Metro Short-Term Origin-Destination Passenger Flow Forecasting Considering Section Capacity Utilization Ratio. IEEE Trans. Intell. Transp. Syst. 2023, 24, 7943–7960. [Google Scholar] [CrossRef]
  14. Luo, C.; Cai, R.; Guo, H.; Luo, S.; Mao, R.; Jiang, L.; Zhang, D. MG-ASTN: Multi-Graph Framework with Attentive Spatial-Temporal Networks for Crowd Mobility Prediction. IEEE Internet Things J. 2023, 10, 19054–19061. [Google Scholar] [CrossRef]
  15. Lishan, S.; Lin, J.; Zhonghua, W.; Junfeng, L. Demand forecasting of taxi travel based on GPS data. J. Transp. Inf. Saf. 2021, 39, 128–136. [Google Scholar]
  16. Wang, H.; Zhang, Z.; Fan, Z.; Chen, J.; Zhang, L.; Shibasaki, R.; Song, X. Multi-Task Weakly Supervised Learning for Origin-Destination Travel Time Estimation. IEEE Trans. Knowl. Data Eng. 2023, 35, 11628–11641. [Google Scholar] [CrossRef]
  17. Wenda, H.; Yubo, T.; Ke, Q.; Hai, L. Visual Analysis of Group Behavior Based on Origin-Destination Data. J. Comput. Aided Des. Comput. Graph. 2018, 30, 1023–1033. [Google Scholar]
  18. Qiong, L.; Hong, S.; Yajin, X.; Wen, L. Citizen Commuting Analysis Using Mobile Trajectory Data. Geomat. Inf. Sci. Wuhan Univ. 2021, 46, 718–725. [Google Scholar]
  19. Xiao, Q.; Feng, Z.; Lifang, X.; Shoujia, Z. Research methods of urban spatiotemporal behavior in the era of big data. Prog. Geogr. 2013, 32, 1352–1361. [Google Scholar]
  20. Zhang, F.; Zhou, B.; Liu, L.; Liu, Y.; Fung, H.H.; Lin, H.; Ratti, C. Measuring human perceptions of a large-scale urban region using machine learning. Landsc. Urban Plan. 2018, 180, 148–160. [Google Scholar] [CrossRef]
  21. Wang, B.; Meng, B.; Wang, J.; Chen, S.; Liu, J. Perceiving Residents’ Festival Activities Based on Social Media Data: A Case Study in Beijing, China. ISPRS Int. J. Geo-Inf. 2021, 10, 474. [Google Scholar] [CrossRef]
  22. Sicong, Z.; Shanqi, Z.; Feng, Z. Measurement of community daily activity space and influencing factors of vitality based on residents’ spatiotemporal behavior: Taking Shazhou and Nanyuan streets in Nanjing as examples. Prog. Geogr. 2021, 40, 580–596. [Google Scholar]
  23. Beijing Infinite Forward Technology Co., Ltd. Talking Data: Observation Report on Travel in Large Beijing Communitie; Beijing Infinite Forward Technology Co., Ltd.: Beijing, China, 2017. [Google Scholar]
  24. Sina Weibo Data Center. 2020 Weibo User Development Report; Weibo Corporation: Beijing, China, 2021. [Google Scholar]
  25. Marti, P.; Serrano-Estrada, L.; Nolasco-Cirugeda, A. Social Media data: Challenges, opportunities and limitations in urban studies. Comput. Environ. Urban Syst. 2019, 74, 161–174. [Google Scholar] [CrossRef]
  26. Liu, Y.; Yuan, Y.H.; Zhang, F. Mining urban perceptions from social media data. J. Spat. Int. Sci. 2020, 20, 51–55. [Google Scholar] [CrossRef]
  27. Yandong, W.; Hao, L.; Teng, W.; Zhu, J. The Mining and Analysis of Emergency Information Sudden Events Based on Social Media. Geomat. Inf. Sci. Wuhan Univ. 2016, 41, 290–297. [Google Scholar]
  28. Cao, G.; Wang, S.; Hwang, M.; Padmanabhan, A.; Zhang, Z.; Soltani, K. A scalable framework for spatiotemporal analysis of location-based social media data. Comput. Environ. Urban Syst. 2015, 51, 70–82. [Google Scholar] [CrossRef]
  29. Liu, X.; Huang, Q.; Gao, S.; Xia, J. Activity knowledge discovery: Detecting collective and individual activities with digital footprints and open source geographic data. Comput. Environ. Urban Syst. 2021, 85, 101551. [Google Scholar] [CrossRef]
  30. Zipei, X. Big Data and Its Cause of Formation. Sci. Society. 2014, 4, 14–26. [Google Scholar]
  31. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
  32. Kai, J. Social Media Mining and Application with Geographic Location Information; University of Science and Technology of China: Hefei, China, 2014. [Google Scholar]
  33. Liu, J.; Meng, B.; Wang, J.; Chen, S.; Tian, B.; Zhi, G. Exploring the Spatiotemporal Patterns of Residents’ Daily Activities Using Text-Based Social Media Data: A Case Study of Beijing, China. ISPRS Int. J. Geo-Inf. 2021, 10, 389. [Google Scholar] [CrossRef]
  34. Driver, H.E.; Kroeber, A.L. Quantitative Expression of Cultural Relationships; University of California Press: Berkeley, CA, USA, 1932. [Google Scholar]
  35. Tryon, R.C. Cluster Analysis; Edwards Brothers: Ann Arbor, MI, USA, 1939. [Google Scholar]
  36. Cambe, J.; Grauwin, S.; Flandrin, P.; Jensen, P. A new clustering method to explore the dynamics of research communities. Scientometrics 2022, 127, 4459–4482. [Google Scholar] [CrossRef]
  37. Lukauskas, M.; Ruzgas, T. A New Clustering Method Based on the Inversion Formula. Mathematics 2022, 10, 2559. [Google Scholar] [CrossRef]
  38. Ester, M.; Kriegel, H.P.; Sander, J.; Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, OR, USA, 2–4 August 1996; Volume 96, pp. 226–231. [Google Scholar]
  39. Aoying, Z.; Shuigeng, Z. Approaches for scaling DBSCAN algorithm to large spatial database. J. Comput. Sci. Technol. 2000, 15, 509–526. [Google Scholar]
  40. Bo, C.; Berchtold, S.; Kriegel, H.-P.; Michel, U. Multidimensional index structures in relational databases. J. Intell. Inf. Syst. 2000, 15, 51–70. [Google Scholar]
  41. Salton, G. The SMART Retrieval System—Experiments in Automatic Document Processing; Prentice-Hall Inc.: Saddle River, NJ, USA, 1971. [Google Scholar]
  42. Sparck Jones, K. A statistical interpretation of term specificity and its application in retrieval. J. Doc. 1972, 28, 11–21. [Google Scholar] [CrossRef]
  43. Jones, K.S. Index term weighting. Inf. Storage Retr. 1973, 9, 619–633. [Google Scholar] [CrossRef]
  44. Birant, D.; Kut, A. ST-DBSCAN: An algorithm for clustering spatial–temporal data. Data Knowl. Eng. 2007, 60, 208–221. [Google Scholar] [CrossRef]
  45. Xu, P.; Li, X.; Hui, Y.; Zhang, G. Research and implementation of Chinese text classification related algorithms. J. Jinlin Univ. 2009, 47, 790–794. [Google Scholar]
Figure 1. Study area.
Figure 2. Research Framework.
Figure 3. OD map of community residents’ activities.
Figure 4. Distribution map of the mean value of the radius of gyration of community residents’ activities.
Figure 5. Spatial distribution map of residents’ activities.
Figure 6. Time sequence change chart of community residents’ activities.
Figure 7. Word cloud map of Weibo posts about community residents’ activities.
Table 1. Factor statistics of the research area.

| Community | Population (Ten Thousand People) | Area (km²) | Residents (Persons) | Number of Activities (Items) |
| --- | --- | --- | --- | --- |
| Tiantongyuan | 25.92 | 7.47 | 600 | 3741 |
| Huilongguan | 50.64 | 21.69 | 1303 | 12,095 |
| Wangjing | 14.62 | 14.40 | 853 | 5987 |
| Shangdi | 6.71 | 9.52 | 288 | 2288 |
Table 2. Statistics on the proportion of activity types of community residents.

| Community | Socializing | Dining | Leisure | Shopping | Studying | Exercising | Working |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Huilongguan | 4.04% | 28.29% | 34.07% | 4.38% | 13.07% | 10.53% | 5.61% |
| Tiantongyuan | 4.04% | 29.89% | 36.92% | 3.77% | 12.94% | 8.07% | 4.38% |
| Wangjing | 4.89% | 28.78% | 39.24% | 2.96% | 11.24% | 9.47% | 3.42% |
| Shangdi | 3.67% | 21.33% | 31.08% | 2.27% | 22.42% | 16.08% | 3.15% |
Table 3. High-frequency words of community residents’ activities.

| Community | Activity Type | High-Frequency Words |
| --- | --- | --- |
| Tiantongyuan | Socializing | Wedding, eating, attending, gathering, thank, small gathering, get together, friends |
| | Dining | Eating, Tiantongyuan, delicious, restaurant, eating, check-in, taste, hot pot |
| | Leisure | Tiantongyuan, movie, check-in, weekend, play, Dongyuan, eating, take photos |
| | Shopping | Buying, eating, supermarket, Tiantongyuan, clothes, shopping, splurge |
| | Studying | Postgraduate entrance exam, Beijing Institute of Fashion Technology, art, study, Tiantongyuan, 2020, exam, exam questions |
| | Exercising | Fitness, Tiantongyuan, exercise, check-in, running, ACE, jogging, losing weight |
| | Working | Work, overtime, going to work, Tiantongyuan, effort, interview, weekend |
| Huilongguan | Socializing | Eating, Huilongguan, dinner, friends, classmates, drinking, received, wedding |
| | Dining | Eating, delicious, Huilongguan, taste, breakfast, hot pot, meal, restaurant |
| | Leisure | Huilongguan, eating, weekend, movie, check-in, play, drink, holiday |
| | Shopping | Buying, eating, Huilongguan, supermarket, clothes, shopping, store |
| | Studying | Study, exam, Huilongguan, class, North China Electric Power University, IELTS, class, write |
| | Exercising | Check-in, running, Huilongguan, lap, swimming, jogging, fitness, exercise |
| | Working | Overtime, work, working, Huilongguan, writing, interview, Zhongguancun |
| Wangjing | Socializing | Team building, Wangjing, get together, friends, received, gift, eating, dinner |
| | Dining | Eating, Wangjing, delicious, gourmet, restaurant, hot pot, taste, breakfast |
| | Leisure | Wangjing, eating, Guoan, drinking, weekend, official, taking photos, movie |
| | Shopping | Buying, eating, Wangjing, shopping, drinking, delicious, buying, cheap |
| | Studying | Study, Central Academy of Fine Arts, Wangjing, graduation, exam, writing, attend class |
| | Exercising | Exercise, Wangjing, running, check-in, effort, change, golf, desire |
| | Working | Work, overtime, Wangjing, tattoo, working, interview, off work, weekend |
| Shangdi | Socializing | Chenxing, theater club, eating, gathering, Beijing Sport University, received, gift, friends |
| | Dining | Eating, delicious, Beijing Sport University, check-in, taste, breakfast, canteen, sticker |
| | Leisure | Beijing Sport University, taking photos, eating, check-in, weekend, second, holiday, play |
| | Shopping | Buying, shopping, eating, store, bought, every day, Beijing Sport University, BHGMall |
| | Studying | Beijing Sport University, study, exam, library, graduation, bar exam, class, attend class |
| | Exercising | Beijing Sport University, running, fitness, check-in, training, jogging, leg, swimming |
| | Working | Work, overtime, effort, working, interview, writing, code |

MDPI and ACS Style

Ou, Z.; Wang, B.; Meng, B.; Shi, C.; Zhan, D. Research on Resident Behavioral Activities Based on Social Media Data: A Case Study of Four Typical Communities in Beijing. Information 2024, 15, 392. https://doi.org/10.3390/info15070392
