Exploring the Spatiotemporal Patterns of Residents’ Daily Activities Using Text-Based Social Media Data: A Case Study of Beijing, China

Abstract: The use of social media data provided powerful data support to reveal the spatiotemporal characteristics and mechanisms of human activity, as it integrated rich spatiotemporal and textual semantic information. However, previous research has not fully utilized its semantic and spatiotemporal information, due to its technical and algorithmic limitations. The efficiency of the deep mining of textual semantic resources was also low. In this research, a multi-classification of text model, based on natural language processing technology and the Bidirectional Encoder Representations from Transformers (BERT) framework is constructed. The residents’ activities in Beijing were then classified using the Sina Weibo data in 2019. The results showed that the accuracy of the classifications was more than 90%. The types and distribution of residents’ activities were closely related to the characteristics of the activities and holiday arrangements. From the perspective of a short timescale, the activity rhythm on weekends was delayed by one hour as compared to that on weekdays. There was a significant agglomeration of residents’ activities that presented a spatial co-location cluster pattern, but the proportion of balanced co-location cluster areas was small. The research demonstrated that location conditions, especially the microlocation condition (the distance to the nearest subway station), were the driving factors that affected the resident activity cluster patterns. In this research, the proposed framework integrates textual semantic analysis, statistical methods, and spatial techniques, broadens the application areas of social media data, especially text data, and provides a new paradigm for the research of residents’ activities and spatiotemporal behavior.


Introduction
The continuous advancements of globalization and informatization have profoundly affected people's daily lives and behavioral activities, causing tremendous changes in the traditional patterns of residents' activities. On the one hand, the emergence of network activities has had numerous effects, including the substitution, complementarity, and enhancement of residents' daily activities [1], thereby affecting the use of urban physical space. On the other hand, the rapid development of information and communication technologies (ICTs) has changed the temporal and spatial relationships of residents' daily activities so that some activities are no longer subject to specific temporal and spatial constraints, thereby allowing better flexibility and coordination [2]. In this context, research into residents' activities has received extensive attention in many disciplines, such as geography, urban planning, transportation, computers, and public health [3][4][5]. By exploring the differences in the distribution scales and types of various resident activities, the urban function and spatial structure can be better understood, the temporal and spatial laws of urban dynamics can be grasped, and the relationships between residents' activities and the objective environment in different temporal and spatial scenarios can be effectively revealed. This is of great significance for improving human health, guiding transportation and planning, and promoting the scientific understanding of human behavior [6][7][8][9].
With the rapid development of ICTs, while people are enjoying the convenience of ICTs, there has been an explosion of information on user activities and access records, either actively posted by users or passively recorded by devices and networks. It provides a wealth of convenient data to support research into human activities, and thus the exploration of new knowledge and methods of human activity patterns [10][11][12]. Some scholars have suggested that a new field is emerging that can utilize the capacity to collect and analyze data at a certain scale to reveal patterns of individual and group behavior [13], with sensible data mining algorithms making practical movement predictions that reveal trends and patterns that are difficult for humans to detect [14]. In this context, there has been tremendous progress in conducting resident activity research based on various types of big data, with an increasing variety of research results. However, the understanding of the interactions between residents' behavioral decisions and activity dynamics based on general big data (GPS data, mobile phone data, smart card data, inter-floating data, etc.) is limited due to the lack of information on the destinations that instigate population movements [15].
Fortunately, this limitation does not exist for text-based social media data based on user-initiated posts. Generally speaking, social media data contain detailed spatiotemporal, textual, image, social, and other multidimensional information about the user. It can be used to infer the user's activities and deeply study the characteristics of residents' activities and the influence mechanism of their choice of activities. However, among the existing studies on residents' activities via the use of social media data, most have primarily focused on using the spatial information in social media data, but less on the textual semantic information that contains rich activity content [16]. Specifically, the textual semantic information not only directly reflects the purpose and type of individual activities at a fine-grained level; furthermore, the quantity of data can also indicate the intensity of an individual's activity, and, combined with the spatial location, can efficiently reveal the behavioral activity characteristics of individual users [17]. Therefore, social media data (especially textual information) deserves more attention in the field of resident activity research. Meanwhile, it is necessary to organically combine spatiotemporal information with semantic information to improve the comprehensive utilization efficiency of social media data, and then to comprehensively and truly uncover the spatiotemporal characteristics of the users' activities and enhance the understanding of the dynamic characteristics of cities and residents.
In view of this, this study aims to introduce the current advanced natural language processing (NLP) technology into the field of resident activity research in order to efficiently extract the rich semantic information from the social media data. Then, the textual semantic information is combined with spatiotemporal information to improve the efficiency of social media data utilization in residential activity research, thus providing a high-quality data base for residential activity research. On this basis, the spatiotemporal characteristics and related patterns of residents' daily activities were explored, and the driving forces were then investigated to better examine and analyze the spatial structure of the city.
The remainder of this paper is organized as follows. In Section 2, existing research related to resident activities is presented from three perspectives. In Section 3, the study area and data collection method are presented. In Section 4, the process of classifying the residents' daily activities information is introduced, and the main methods used in this study are elaborated. In Section 5, the semantic characteristics, spatiotemporal patterns and attribution results of various residents' activities are specifically analyzed. Finally, the conclusions of this paper are drawn and future research directions are proposed in Section 6.

Study of Resident Activity under the Traditional Perspective
The study of residents' activities can be traced back to the time-geography theory proposed by Hägerstrand [18], which was later praised and applied by related scholars [19][20][21][22]. In traditional resident activity research, travel surveys, questionnaires, interviews, and other means are primarily used to construct resident activity-diary surveys, and to carry out various related studies involving residents' activities and travel behaviors [23][24][25]. These efforts demonstrate that the daily activity patterns of residents have great regularity and are closely related to land use and urban built environments [26][27][28]. However, the questionnaire and interview-based approaches to the collection of activity information are costly in terms of time and money. Meanwhile, the reliability of the data and findings is subject to variability and risk owing to the limitations of questionnaire design, interview rules, and spatiotemporal scale, and to the subjective nature of the respondents [29][30][31].

Study of Resident Activity in the Era of Big Data
With the rapid development of ICTs, the information storm triggered by the era of big data is transforming our lives, work, and thinking, and is initiating a major transformation of the era [32]. In this context, scholars have conducted a series of studies on residents' activities with the help of various types of big data. Specifically, big data-based research has mainly focused on using the user activity records collected from various data platforms, such as GPS devices, mobile phones, smart cards, floating vehicles, social media, and wearable devices, to explain the movement patterns of individuals or groups of people, and to reveal the spatiotemporal patterns and dynamic changes of various residents' activities (travel, work, leisure, etc.) or of specific groups [7][33][34][35][36][37]. For example, transit or travel smart card data have been used to reveal residents' daily activity patterns and laws [34,37], and useful mobility information has been extracted from mobile phone location and call data to identify where residents live and work [38] and to investigate individual mobility patterns within cities [39]. Moreover, based on the extension of geographical information system (GIS) spatial models and analysis methods, and combined with data fusion, machine learning, and other means, the spatiotemporal patterns of human behavior can be extracted and the geospatial characteristics of human and socioeconomic elements can be inverted, which has become a hot research topic in recent years [40].

Study of Resident Activity Based on the Social Media Data
With the widespread adoption of mobile devices and location-based services, social media data have increasingly attracted the attention of scholars due to their large user base, rich spatiotemporal and semantic information, and low cost of access [12,17]. Social media data incorporating spatiotemporal and textual semantic multidimensional information has greatly enhanced the role of understanding human behavior and complex social dynamics in geographic space. Some scholars even argue that data generated based on internet communication and interaction may revolutionize our understanding of collective human behavior [41].
However, among the existing studies on residents' activities via the use of social media data, most have primarily focused on using the spatiotemporal information. These investigations include scalable and efficient spatiotemporal analyses via large-scale, location-based social media data [42,43], the modeling and prediction of user behavior and activity patterns [44,45], and the revelation of the functions, dynamics, and spatial structures of cities [46,47]. Specifically, the digital footprints collected from social media platforms are clustered through various spatiotemporal analysis methods and their variants to identify various types of residents' daily activities (e.g., living, working, entertainment, and eating) [48]. Alternatively, activity types are inferred based on the geographical location of each data record linked to the type of place in combination with other types of data, such as land-use data, points of interest (POI), and street-view imagery [49][50][51]. Nevertheless, most clustering methods consider only the temporal or spatial distribution characteristics of travel activity points while ignoring their geographical context. This can result in different types of activity data being grouped into the same cluster. The same problem exists when place types are used to infer activity types. For example, a check-in at a residential building may label the location as "home"; however, the place may not be the user's home but rather a friend's home, and the user's behavior at this location should instead be labeled as "social" or "party". Similarly, a place marked as a place of entertainment may also be the user's workplace. In such cases, the results should be interpreted with caution [52].
In addition, as mentioned earlier, using textual information from social media data to conduct research on residents' activities is a very effective approach, but due to previous technical and algorithmic limitations, it is difficult to fully extract semantic information from text, and thereby the relevant literature is lacking. In the only relevant studies, activity information or activity topics were mainly identified and extracted via feature word extraction (e.g., Word2vec model) or some clustering methods (e.g., density-based spatial clustering of applications with noise (DBSCAN), Latent Dirichlet Allocation (LDA) model, etc.) [16,49,53,54]. However, these studies lack a comprehensive consideration of the semantic content, resulting in some bias in the authenticity of the obtained activity data. Especially for social media data with word limits like Twitter and Weibo, the sparse nature of short textual features makes it riskier to rely on feature words alone for semantic classification (e.g., "apple is ripe" and "Apple Inc." have two completely different meanings). With the significant breakthroughs in natural language processing in recent years [55], existing technologies have been able to support the efficient classification of large-scale text data. Therefore, it is necessary to introduce advanced natural language processing techniques into the study of residents' activities, by conducting comprehensive and in-depth mining of textual resources in social media data to obtain a high-quality dataset of residents' activities, which is important for improving the understanding of residents' activities and urban dynamics [48,56,57].

Study Area
Beijing, the capital of China, is also the political and cultural center of the country. As of the end of 2019, the city had a total area of approximately 16,410 km², including 16 districts, and a resident population of 21,536,000 [58]. To fit well with the other datasets used in this research, the study area was divided into more than 16,000 grids of 1 km; thus, the 1 km grid was the basic research unit for this study.

Data
The social media data in 2019 were obtained from Sina Weibo API using web crawler tools. In total there were 11,500,105 pieces of Weibo data involving more than one million users covering Beijing City. The attributes included the user ID, text, time, latitude, and longitude. According to the Weibo User Trends Report in 2020 by Weibo Data Center (http://data.weibo.com/datacenter/recommendapp, accessed on 30 September 2020), as of September 2020, the number of monthly active users on Weibo had increased to 511 million, with an average of 224 million daily active users. These data indicate the further strengthening of Sina Weibo's position as the leading social media platform in China. However, we are also aware of the problems of sample bias and representativeness in social media data. Specifically, the social media platforms are used by a relatively young group, and the users on social media may be varied by socio-economic attributes (ages, gender, occupation, etc.) and individual behavior differences [59]. However, many studies have shown that social media data still play an important role for the extraction of human activities, emotions and experiences associated with a place with the advantage of rich contextual content and geographical location information [17,60]. On this basis, large amounts of social media data can be integrated in order to profile groups of users and their activity patterns, thereby providing insight into the dynamics of cities and people on a larger scale [16,61]. Therefore, it is reasonable and effective to select the massive Weibo data to identify the spatiotemporal patterns of residents' daily activities at the group scale.
2019 Beijing POIs data were also sourced from the Gaode Map platform, and included 12 main categories (restaurants, shopping, accommodation, science, education, culture, etc.). The 2019 WorldPop dataset was also used as one of the base datasets for this study.

Methodology
This study constructs a framework for studying residents' activities by integrating natural language processing, statistical analysis and spatial analysis (Figure 1), and innovatively introduces textual multiclassification techniques, combined with machine learning methods into the study field of residents' daily activities in order to fully exploit the rich textual and spatiotemporal information in social media data. Specifically, first of all, the main types of residents' daily activities are identified based on time-geography and behavioral geography theories. Secondly, machine learning algorithms and BERT models are used to perform text multiclassification on the collected large-scale social media datasets to identify the specific types of users' activities, and then a high-quality spatiotemporal dataset of residents' daily activities is formed by combining the posting locations. Based on this, spatiotemporal patterns of residents' daily activities and related laws are explored from three perspectives: semantic, temporal and spatial.

Identification of Activity Categories
The study of residents' daily activities has traditionally been an important part in the fields of time-geography and behavioral geography, involving a range of activities such as commuting, shopping, and leisure [62]. There exists a wealth of existing methods for the classification of daily activities, with the number of activity types ranging from four to hundreds [63,64]. However, residents' daily activities are habitual, stable and highly repetitive, overly broad or trivial classifications of activities are not conducive to research, and the randomness of related activities may make it difficult to extract valuable regular features. Moreover, some scholars have found that social media-based daily activities include those such as at-home activities, working, eating, shopping, learning, leisure, and entertainment, with an activity coverage rate of 94.5% [15,52]. However, given the broad scope of at-home activities, the spatial characteristics are not obvious, and the Weibo data themselves have a certain outdoor feature. Therefore, based on previous research and taking into account the behavioral characteristics of local residents, seven types of activities were selected as the main types of daily activities (Table 1), based on which the subsequent step of text classification was carried out.

Classification of Residents' Daily Activities Based on BERT
Since the research object is the daily activities of Beijing residents, the original datasets need to be screened to exclude the user data of non-local residents. Therefore, according to the specific filtering rules (the time and frequency of user posts), 7,293,190 pieces of Weibo data of local residents were ultimately obtained. At the same time, as this research only focused on the Weibo items about the residents' daily activities, the original Weibo datasets were filtered to select the related items using the Bidirectional Encoder Representations from Transformers (BERT). It is one kind of language encoder, released by Google in 2018, able to translate the input sentences or paragraphs into corresponding semantic features, which has performed amazingly well and become an important recent advancement in NLP [55].
In this research, a text classification model was constructed based on BERT to perform multiple classifications on the crawled Weibo texts. Specifically, first, 70,000 items were randomly selected as the training samples. Each item was manually labeled as 1-7 if it was related to one of the residents' activities, and as 0 otherwise (Table 2) [64,65]. Second, machine learning and the original BERT model were used to train on the 70,000 labeled samples and verify the classification accuracy; after the corresponding parameters and the number of iterations were adjusted over several experiments, the trained text multiclassification model was obtained (the overall accuracy exceeded 87%). Third, based on the derived classifier, all the Weibo items were input to BERT and the items relating to the various residents' activities were classified. In this way, 1,198,600 pieces of data on the daily activities of Beijing residents in 2019 were identified. Finally, a further 5000 randomly selected social media posts from each of the seven categories of activity data were manually validated and an average accuracy of 94.12% was achieved, thereby verifying the excellent classification effect of the model. The detailed data screening process is presented in Figure 2.
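The manual validation step amounts to computing an accuracy per activity category and averaging over the categories. The following is a minimal sketch of that calculation, not the authors' code: the label table mirrors the 0-7 scheme described above, while the function name `average_accuracy` and the toy counts are hypothetical.

```python
# Label scheme described in the text: 0 = irrelevant, 1-7 = activity types.
LABELS = {
    0: "Irrelevant", 1: "Social", 2: "Eating", 3: "Entertainment",
    4: "Shopping", 5: "Studying", 6: "Sports", 7: "Working",
}

def average_accuracy(checked):
    """Return (mean of per-category accuracies, per-category accuracies).

    `checked` maps an activity label to a pair
    (number of posts judged correct, number of posts checked).
    """
    per_class = {label: correct / total for label, (correct, total) in checked.items()}
    return sum(per_class.values()) / len(per_class), per_class

# Toy counts for illustration only (not the study's validation results).
toy = {1: (90, 100), 2: (95, 100), 3: (100, 100)}
mean_acc, by_class = average_accuracy(toy)
```

Averaging per category rather than pooling all posts keeps a large category such as entertainment from masking a weaker result in a small one.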

Weibo Item | Label
It's so boring! | 0 (Irrelevant)
An extraordinarily enjoyable team building~ | 1 (Social)
This hot pot is really delicious | 2 (Eating)
Take a stroll around the Forbidden City | 3 (Entertainment)
Come out to shop! | 4 (Shopping)
There are many things to learn, trying to learn | 5 (Studying)
Five kilometers completed | 6 (Sports)
I'm still struggling in the office at this hour | 7 (Working)

Figure 2. The data filtering process.

The Identification of Residential Activity Clusters
To reveal the clustering patterns of the various residents' activities in geographic space and the ways in which they combine, this study uses the activity density and type ratio methods, with reference to studies of the functional mixing degree of spatial units [66,67], to identify the clustering patterns and activity combinations of residents' daily activities in Beijing.
First, the activity density method was employed to calculate the proportion of each type of resident activity in each grid relative to the total number of activities of that type in the study area:

$$A_{ij} = \frac{P_{ij}}{\sum_{j=1}^{n} P_{ij}} \tag{1}$$

where $A_{ij}$ is the number of type $i$ activities in grid $j$ as a proportion of the total number of type $i$ activities in all spatial units in the study area, $P_{ij}$ indicates the number of type $i$ activities in grid $j$, and $n$ indicates the number of grids in the study area. Next, the relative proportions of the densities of different types of residential activity in each grid were calculated to reflect their type-ratio characteristics:

$$AC_{ij} = \frac{A_{ij}}{\sum_{i=1}^{7} A_{ij}} \tag{2}$$

where $AC_{ij}$ is the ratio of type $i$ residential activities in spatial unit $j$, the sum in the denominator runs over the seven activity types, and $A_{ij}$ has the same meaning as in Equation (1).
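The two quantities defined above can be computed directly from a table of activity counts. The sketch below is illustrative only: it assumes the type ratio is the activity density of one type divided by the sum of densities over all types in the same grid, and the function names and toy counts are hypothetical.

```python
def activity_density(counts):
    """A[i][j]: share of all type-i activities that fall in grid j.

    `counts[i][j]` is the number of type-i activities in grid j.
    """
    density = []
    for row in counts:
        total = sum(row)
        density.append([p / total if total else 0.0 for p in row])
    return density

def type_ratio(density):
    """AC[i][j]: density of type i in grid j relative to all types in grid j."""
    n_grids = len(density[0])
    col_sums = [sum(row[j] for row in density) for j in range(n_grids)]
    return [
        [a / col_sums[j] if col_sums[j] else 0.0 for j, a in enumerate(row)]
        for row in density
    ]

# Toy example: 2 activity types (rows) in 2 grids (columns).
counts = [[30, 10], [5, 5]]
A = activity_density(counts)   # [[0.75, 0.25], [0.5, 0.5]]
AC = type_ratio(A)             # grid 0: 0.75/1.25 and 0.5/1.25
```

Normalising each type by its city-wide total first means a rare activity such as working is not swamped by a frequent one such as entertainment when the ratios within a grid are compared.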
In the identification of residential activity clusters, if the proportion of one type of activity in a grid is ≥50%, the spatial unit is considered to be dominated by a single activity; if the proportions of all types of activity in the grid are below 50%, the unit is considered to be a co-location cluster space of multiple activities. In this study, the co-location clusters were further subdivided: any activity type with a proportion of ≥25% in the grid was considered a dominant activity type within the spatial unit. If the proportions of all types are below 25%, the different types of activities are relatively evenly distributed and there is no clearly dominant activity.
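The thresholds described above translate into a short decision rule per grid. This is a minimal sketch under the stated 50% and 25% cut-offs; the function name and the returned label strings are hypothetical and do not reproduce the paper's cluster names.

```python
def classify_grid(ratios):
    """Classify one grid from its activity type ratios.

    `ratios` maps an activity name to its share in the grid (shares sum to 1).
    """
    if max(ratios.values()) >= 0.5:
        return "single-activity-dominant"
    # Co-location cluster: list every activity that reaches the 25% threshold.
    dominant = sorted(name for name, share in ratios.items() if share >= 0.25)
    if not dominant:
        return "co-location, no dominant activity"
    return "co-location, dominant: " + ", ".join(dominant)
```

For example, `classify_grid({"eating": 0.3, "studying": 0.3, "sports": 0.2, "working": 0.2})` returns `"co-location, dominant: eating, studying"`.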
Furthermore, a vibrant urban space must maintain sufficient diversity to meet the diversity of people's needs [68]. To calculate the mix of daily activities of residents within different types of activity clusters, this study draws on the concept of measuring the land-use mix, which is commonly used in urban research [69], to construct a characteristic indicator for the measurement of activity diversity while taking into account the number and types of activities. The formula is as follows:

$$M_j = -\frac{\sum_{i=1}^{k} q_i \ln q_i}{\ln k} \tag{3}$$

where $M_j$ denotes the activity diversity index of grid $j$, $q_i$ denotes the ratio of the number of type $i$ activities in grid $j$ to the total number of activities in spatial unit $j$, and $k$ denotes the number of activity types in spatial unit $j$. The activity diversity index $M_j$ ranges from 0 to 1, and its size reflects the degree of mixing of different activities; a larger value indicates a more balanced distribution of various types of activities in the spatial unit and a higher activity diversity, while a smaller value indicates a more homogeneous distribution of activity types and a lower diversity.
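A diversity index with the properties described above (bounded between 0 and 1, largest when activity types are evenly mixed) can be sketched with the entropy form commonly used for land-use mix. The code below assumes that form; it is an illustration rather than the authors' implementation, and the function name is hypothetical.

```python
import math

def activity_diversity(counts):
    """Entropy-based diversity of one grid, normalised to the range [0, 1].

    `counts` holds the number of activities of each type in the grid;
    k is taken as the number of types actually present.
    """
    present = [c for c in counts if c > 0]
    k = len(present)
    if k <= 1:
        return 0.0  # zero or one activity type: no mixing
    total = sum(present)
    entropy = -sum((c / total) * math.log(c / total) for c in present)
    return entropy / math.log(k)
```

A grid with counts `[10, 10, 10]` scores 1 (perfectly balanced), whereas `[9, 1]` scores about 0.47, since one activity accounts for most of the posts.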

Analysis of Influencing Factors
The Geodetector was used to explore the causes of the spatial and temporal heterogeneity of the residents' daily activities. It consists of four components: risk detection, factor detection, ecological detection and interaction detection, to detect geo-spatial heterogeneity and reveal the driving forces behind it [70,71]. The method is good at analyzing typological quantities, detecting both numerical and qualitative data, and is unique in its ability to investigate the interaction between two explanatory variables to a response variable. As the decision-making process of the residents' activities is the result of a combination of factors, the spatial differentiation mechanism and diversity of residents' daily activities cannot be separated from the systematic analysis of multiple influencing factors. According to urban diversity theory [72], the formation of diversity is closely related to factors such as population, land, and transportation. Based on this, the Geodetector was employed to analyze the influences of various factors, including the socioeconomic attributes, facility configuration, and location conditions of the grid, on the formation of diversity in residential activity clusters. Specifically, the dependent variable was considered to be the resident activity pattern in the grid, while the explanatory variables included the population density, land price, traffic accessibility, and others (see Table 3). In short, factor detection and interaction detection were used to reveal the influencing mechanisms of the clustering pattern and diversity of residents' daily activities.
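The factor detector of the Geodetector is usually summarised by the q-statistic, q = 1 - Σ_h N_h σ_h² / (N σ²), which measures how much of the variance of the response is explained by stratifying the grids according to one (discretised) explanatory variable. The sketch below illustrates that statistic only; the function name and the toy data are hypothetical, and it does not cover significance testing or interaction detection.

```python
from statistics import pvariance

def factor_detector_q(values, strata):
    """Geodetector q-statistic for one explanatory variable.

    `values` are the response values per grid; `strata` are the class
    labels of the (already discretised) explanatory variable per grid.
    """
    total_var = pvariance(values)
    if total_var == 0:
        return 0.0  # no variance to explain
    groups = {}
    for value, stratum in zip(values, strata):
        groups.setdefault(stratum, []).append(value)
    # Sum of within-stratum variance, weighted by stratum size.
    within = sum(len(group) * pvariance(group) for group in groups.values())
    return 1.0 - within / (len(values) * total_var)

# Toy data: the strata separate low and high values perfectly, so q = 1.
q_perfect = factor_detector_q([1, 1, 5, 5], ["near", "near", "far", "far"])
# Here each stratum reproduces the overall spread, so q = 0.
q_none = factor_detector_q([1, 5, 1, 5], ["near", "near", "far", "far"])
```

Continuous factors such as population density or distance to the nearest subway station would need to be binned into classes before being passed as `strata`.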

Semantic Characteristics of Residents' Daily Activities
According to the results of text classification processing, 1,198,600 pieces of data for seven types of activities were identified, and the percentages of the numbers of various activities and the top 10 highest-frequency words are reported in Table 4. Entertainment, eating, and studying activities were found to account for more than 80% of the total. It indicates that residents are more inclined to share these activities on online communities as opposed to work and social activities, with entertainment and study spaces becoming the main physical spaces corresponding to virtual online spaces. Based on the classified activity data, the top 100 highest-frequency keywords for each type of activity were extracted by word frequency statistics and arranged in reverse order (i.e., for the top 100 words in each type of activity, the No. 1 word was swapped with the No. 100 word, the No. 2 word was swapped with the No. 99 word, etc.), and were displayed in word clouds (Figure 3). For example, social activities often imply positive emotions, eating activities reflect the dishes and taste preferences of the residents' daily diets and some important places and restaurants, and healthy eating is also highly recognized. Entertainment, shopping, and sports reflect the corresponding specific types of activities and main venues. Studying highlights an intense state and atmosphere of learning, in addition to recording the activities themselves, such as exams and assignments. Moreover, working reflects more complex emotions about work itself (e.g., effort, motivated, nervous, too hard, etc.).

Activity Dynamics on a Long Timescale
The Weibo data of residents' daily activities in Beijing in 2019 were counted separately by months and days. First, from the results of the month-to-month statistics (Figure 4a), the activity intensities in terms of months and the types of resident activities were found to have large variations. Specifically, eating and entertainment were found to have the highest intensities and to be closely related to the temporal distribution of holidays, with obvious temporal clustering characteristics (May, August-October). Studying was found to have a strong correlation with China's own education system. The start of the term (March and September) and the end of the term (June and December) were found to have significantly more Weibo posts related to studying than the other periods. The temporal distribution of sports was found to be more seasonally correlated, with the fewest data related to sports in the winter and the most in the summer. In contrast, social, shopping, and working activities were found to be less intense and very evenly distributed between months. Secondly, after residents' daily activities in 2019 were counted by days (Figure 4b), it was found that the intensity of activities was significantly higher on weekends than on weekdays. Particularly, the intensities of entertainment, eating, social, and shopping activities were found to have more pronounced increases. However, working was found to be significantly less intense on weekends than on weekdays. In addition, the intensities of activities on Mondays and Fridays were also found to be more prominent due to the influence of weekend activities.
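Counting classified posts by month and by day of the week is a simple aggregation over timestamps. The following sketch shows one way to do it with the standard library; the timestamp format, the function name, and the sample posts are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime

def count_by_month_and_weekday(posts, fmt="%Y-%m-%d %H:%M:%S"):
    """Count posts per (activity, month) and per (activity, weekday).

    `posts` is an iterable of (timestamp string, activity label) pairs.
    Weekdays are integers with Monday = 0 and Sunday = 6.
    """
    by_month = Counter()
    by_weekday = Counter()
    for stamp, activity in posts:
        t = datetime.strptime(stamp, fmt)
        by_month[(activity, t.month)] += 1
        by_weekday[(activity, t.weekday())] += 1
    return by_month, by_weekday

# Hypothetical sample posts (not taken from the study's dataset).
posts = [
    ("2019-05-01 12:30:00", "Eating"),
    ("2019-05-01 20:05:00", "Entertainment"),
    ("2019-10-05 13:10:00", "Eating"),
]
by_month, by_weekday = count_by_month_and_weekday(posts)
```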

Activity Dynamics on a Short Timescale: The Activity Rhythm on Weekends Is Delayed by One Hour as Compared to That on Weekdays
The intraday distribution characteristics of residents' daily activities on weekdays and weekends were further examined. To reduce the impact of weekend activities on weekday activities, the activity data for Monday and Friday were excluded (see Figure 5). It was found that, overall, there is a clear temporal pattern for the seven types of activities, i.e., the intensity of residential activity is lowest before dawn, it fluctuates while increasing during the day, and it peaks at night. However, as compared to that on weekdays, the intensity of activity on weekends was found to be higher and longer, and the activity rhythm was found to be delayed by one hour, i.e., the nighttime sleep period (the six hours in the day with the lowest activity intensity) was found to be between 2:00 and 7:00 on weekends, compared to 1:00 and 6:00 on weekdays. Moreover, the peak of lunchtime activity was found to be 13:00 on weekends, as compared to 12:00 on weekdays. The intensity of activity at night continued to increase until 22:00 hours on weekends as compared to 21:00 hours on weekdays. This result is very similar to the pattern of mood changes observed by Golder et al. (2011) for Twitter users, i.e., people are happier on weekends, but the morning peak of positive affect is delayed by two hours [73]. This indicates that late bedtimes and late starts are becoming the norm on weekends, due to the increase in discretionary time and activities, resulting in a higher intensity and delayed pace of activity on weekends. However, compared to other countries, the daily activities of Chinese residents are more strongly influenced by the traditional routine, and the relative delay in activities on weekends is shorter.
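The one-hour delay can be made concrete by locating the quietest block of hours in each hourly profile and comparing the start times. The sketch below assumes 24 hourly counts per profile and treats the sleep period as the six consecutive hourly bins with the lowest total; the function name and the toy profiles are hypothetical.

```python
def quietest_window(hourly, width=6):
    """Start hour of the `width`-hour window with the lowest total activity.

    `hourly` holds 24 counts (index = hour of day); the window may wrap
    around midnight.
    """
    if len(hourly) != 24:
        raise ValueError("expected 24 hourly counts")
    totals = [
        sum(hourly[(start + offset) % 24] for offset in range(width))
        for start in range(24)
    ]
    return min(range(24), key=totals.__getitem__)

# Toy profiles: activity is low from 1:00 on weekdays and from 2:00 on weekends.
weekday = [10] * 24
weekend = [10] * 24
for hour in range(1, 7):
    weekday[hour] = 1
for hour in range(2, 8):
    weekend[hour] = 1

delay = (quietest_window(weekend) - quietest_window(weekday)) % 24
```

The same comparison could be applied to the lunchtime and evening peaks by taking the hour with the maximum count within a chosen range of hours.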

Distribution of Hotspot Areas for Residents' Daily Activities
Kernel density estimation (KDE) is a non-parametric method for estimating the density of geographic elements in their surrounding area [74]. KDE was used to delineate hotspot areas for the daily activities of Beijing residents, and the results are presented in Figure 6. Overall, the spatial distribution reveals that the hotspots of the residents' daily activities are mainly concentrated in the central city within the Fifth Ring Road. Moreover, the spatial distribution of residents' daily activities within the Fifth Ring Road is characterized by significant differences between the north and south, with the hotspots of various activities mainly located in the northern areas of the Fifth Ring Road. This reveals that, from the perspective of resident activity, the spatial structure of Beijing is still dominated by the traditional northern city model. In addition, due to the characteristics of various activities and the differences in the distributions of different types of activity facilities, the spatial characteristics of various activities were also found to be highly variable, as shown in Table 5. In general, the hotspot distributions of various activities and activity facilities exhibit spatial co-location patterns, especially some of the comprehensive activity facility clusters, which often become mixed clustering areas for multiple types of activities.

The nearest neighbor index (NNI) is an indicator that characterizes the proximity and mutual relationship between point-like geographic elements in a particular region [74]. The NNI was used to quantitatively measure the degree of agglomeration of various daily activities. The results indicate that the NNI for each type of activity is much less than 1 (Table 6), and there is significant spatial agglomeration for all types of activities.
However, there is some variability in the degree of agglomeration of the various types of activity, with eating, entertainment, and studying being more spatially agglomerated, followed by shopping and sports, and social and working activities being relatively less agglomerated.
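The two measures used in this subsection can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: the kernel is assumed to be Gaussian (the kernel and bandwidth are not specified here), the NNI is computed in the common Clark-Evans form (observed mean nearest-neighbor distance divided by the distance expected under complete spatial randomness, 0.5 / sqrt(n / A)), and the function names and point sets are hypothetical.

```python
import math


def kernel_density(points, query, bandwidth):
    """Gaussian kernel density estimate at one 2-D query location."""
    qx, qy = query
    total = 0.0
    for x, y in points:
        d2 = (x - qx) ** 2 + (y - qy) ** 2
        total += math.exp(-d2 / (2 * bandwidth ** 2))
    # Normalizing constant of a 2-D Gaussian kernel with this bandwidth.
    return total / (len(points) * 2 * math.pi * bandwidth ** 2)


def nearest_neighbor_index(points, area):
    """Ratio of the observed mean nearest-neighbor distance to the one
    expected for a random pattern; values below 1 suggest clustering."""
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(min(math.hypot(xi - xj, yi - yj)
                           for j, (xj, yj) in enumerate(points) if j != i))
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected


# Hypothetical point sets inside a 10 x 10 study area (area = 100).
clustered = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (0.1, 0.1)]
dispersed = [(2.5, 2.5), (2.5, 7.5), (7.5, 2.5), (7.5, 7.5)]

nni_clustered = nearest_neighbor_index(clustered, 100.0)
nni_dispersed = nearest_neighbor_index(dispersed, 100.0)
density_near = kernel_density(clustered, (0.05, 0.05), 1.0)
density_far = kernel_density(clustered, (5.0, 5.0), 1.0)
```

For the tightly packed points the index is far below 1, while the evenly spaced points give a value above 1. The pairwise loop is quadratic in the number of points, so a spatial index would be needed for a dataset of realistic size.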

Identifying Residential Activity Clusters
To understand the common agglomeration characteristics and activity combination distribution patterns of different types of activities, the activity density and type ratio methods were utilized to identify the resident activity clusters, which were divided into two major categories: single-activity-dominant areas and multiple activity co-location clusters. The co-location clusters can be further subdivided into four subcategories, namely co-location clusters I to IV, according to the distribution of activities in the grid. Ultimately, the daily activity clusters of Beijing residents are divided into five patterns. The specific definitions and characteristics of the activity patterns are reported in Table 7, and the spatial distributions of activity clusters are presented in Figure 7.

Table 7. The definitions of resident activity patterns and spatial distribution characteristics.

| Residential Activity Cluster | Definition | Major Characteristics | Proportion | Distribution Areas |
| --- | --- | --- | --- | --- |
| Single-activity-dominant areas | The proportion of one activity in the grid is more than 50% | Uneven distribution of activities within the grids, with only one dominant activity | 22.59% | Areas outside the Fifth Ring Road |
| Co-location cluster I | The proportion of each activity in the grid is less than 25% | | | |

Overall, the activity areas in the central urban district within the Fifth Ring Road and the urban district in the outer suburbs and counties were dominated by co-location clusters I and II. This reflects the relatively rich and balanced distribution of activities in the central city and the urban district in the outer suburbs and counties, where various high-density socioeconomic activities take place. However, the internal spatial distributions of the different residential daily activity clusters were found to vary considerably, with the exception of co-location cluster I, for which the activity combination could not be broken down. The specific activity patterns within the four different types of activity clusters are exhibited in Figure 8.
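The type ratio rule can be sketched for a single grid cell as below. Only the two definitions that survive in Table 7 are encoded (one activity above 50%, or every activity below 25%); the thresholds separating co-location clusters II to IV are not reproduced here, so those cells are returned under a single placeholder label. The function name, the labels, and the counts are hypothetical.

```python
def classify_cell(activity_counts):
    """Assign a grid cell to an activity pattern from its activity shares."""
    total = sum(activity_counts.values())
    if total == 0:
        return "no activity"
    top_share = max(count / total for count in activity_counts.values())
    if top_share > 0.50:
        return "single-activity-dominant"
    if top_share < 0.25:
        return "co-location cluster I"
    # Clusters II-IV are not distinguished: their definitions are
    # not available in this section.
    return "co-location cluster II-IV"


# Hypothetical post counts per activity type in three grid cells.
cell_a = {"eating": 80, "entertainment": 10, "shopping": 10}
cell_b = {"eating": 15, "entertainment": 15, "studying": 14, "shopping": 14,
          "sports": 14, "social": 14, "working": 14}
cell_c = {"eating": 40, "entertainment": 40, "shopping": 20}

labels = [classify_cell(c) for c in (cell_a, cell_b, cell_c)]
```

In practice an activity density threshold would be applied first, so that only grid cells with enough posts are classified at all.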
In addition, information entropy was used to calculate the diversity of residents' daily activities and thus explore the balance of the activity distributions within different clusters. For comparison, the same method was employed to calculate the POI diversity, which characterizes the balance of activity facilities within the activity clusters (Figure 9). The mean activity diversity of the activity clusters decreased in the following order: co-location cluster I > co-location cluster II > co-location cluster III > co-location cluster IV > single-activity-dominant areas. Moreover, the distributions of POI and activity diversities were found to match closely. This indicates that the balanced co-location clusters have high activity diversity, which corresponds to the high accessibility of activity facilities and a variety of facility types. The monolithic co-location clusters were found to have the second-highest activity diversity, and single-activity-dominant areas were found to have low activity diversity, which corresponded to the low accessibility of facilities and a relatively homogeneous composition of activity types.
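A minimal sketch of an entropy-based diversity measure is given below, assuming the standard Shannon form H = -Σ p_i ln p_i over the shares of each activity (or POI) type in a grid cell. Whether the study normalizes the value is not stated, so normalization by ln(k) is offered only as an option; the function name and counts are hypothetical.

```python
import math


def shannon_entropy(counts, normalize=False):
    """Shannon entropy of a list of category counts (natural logarithm).

    With normalize=True the value is divided by ln(k), where k is the
    number of categories, so that a perfectly even mix scores 1.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in shares)
    if normalize and len(counts) > 1:
        entropy /= math.log(len(counts))
    return entropy


# Hypothetical counts for the seven activity types in two grid cells.
balanced = [10, 10, 10, 10, 10, 10, 10]
dominated = [70, 0, 0, 0, 0, 0, 0]

h_balanced = shannon_entropy(balanced, normalize=True)
h_dominated = shannon_entropy(dominated, normalize=True)
```

An evenly mixed cell reaches the maximum and a cell with a single activity type scores zero, which matches the ordering from balanced co-location clusters down to single-activity-dominant areas. The same function can be applied to POI counts per facility type.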

Main Influencing Factors and Explanatory Power
Factor detection was used to measure the explanatory power of each variable for the residential activity pattern. The results show that all explanatory variables passed the significance test at the 0.001 level, indicating that these variables are important factors influencing the co-location clustering of residential activities in Beijing. Figure 10 shows the explanatory power of the specific influencing factors. The order of explanatory power of the factors is "X4 (distance to the nearest subway station) > X5 (distance to the city center) > X6 (urban planning positioning) > X2 (land price) > X1 (population density) > X3 (POI density)". The distance to the nearest subway station has the largest explanatory power, reaching 0.73, indicating that the residential activity pattern of Beijing is most strongly influenced by the accessibility of transport. The more convenient the transport conditions, the more pronounced the co-location clustering of the residents' daily activities. The distance to the city center is the next most important factor, indicating that macro-location conditions play an important role in the resident activity pattern. At the same time, the urban planning positioning of the area in which the research unit is located also has an important influence on the formation of its activity clustering pattern. In addition, the explanatory powers of land price and population density are also greater than 0.1, so these factors also have a certain influence on the activity pattern. However, the POI density in the research unit has a weaker influence on the type of activity cluster pattern, and its explanatory power is only 0.04, reflecting that the facility configuration in the research area is not highly correlated with resident activity.
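Assuming that factor detection here refers to the q statistic of the geographical detector, the explanatory power of a factor can be sketched as one minus the ratio of the within-stratum sum of squares to the total sum of squares, where the strata are the classes of the (discretized) factor. The function name `q_statistic` and the data are hypothetical, and the significance test is omitted.

```python
from collections import defaultdict


def q_statistic(values, strata):
    """Geographical-detector q: share of the variance of `values`
    explained by the stratification `strata` (0 = none, 1 = all)."""
    n = len(values)
    mean = sum(values) / n
    total_ss = sum((v - mean) ** 2 for v in values)
    if total_ss == 0:
        raise ValueError("dependent variable has no variance")
    groups = defaultdict(list)
    for value, label in zip(values, strata):
        groups[label].append(value)
    within_ss = 0.0
    for members in groups.values():
        group_mean = sum(members) / len(members)
        within_ss += sum((v - group_mean) ** 2 for v in members)
    return 1 - within_ss / total_ss


# Hypothetical dependent variable for four research units.
y = [1.0, 1.0, 5.0, 5.0]
near_far = ["near", "near", "far", "far"]   # separates the units perfectly
unrelated = ["a", "b", "a", "b"]            # carries no information

q_strong = q_statistic(y, near_far)
q_none = q_statistic(y, unrelated)
```

A factor whose classes separate the dependent variable perfectly gives q = 1, and an uninformative factor gives q = 0. Continuous factors such as distance to the nearest subway station would first need to be discretized into classes, and the choice of breaks affects q.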

Analysis of Factor Interactions
Interaction detection was used to analyze whether the interaction of two different influencing factors enhances or weakens their explanatory power for the dependent variable; it can therefore reveal the impact of the joint action of two explanatory variables on the resident activity pattern (Table 8). The results demonstrate that the explanatory power of any two influencing factors increases after pairwise interaction, which indicates that the resident activity pattern is jointly constrained by the sub-factors of each dimension. Specifically, the type of factor interaction is "Enhance bi-", i.e., the explanatory power of the factors after interacting is stronger than that of either single factor, but not higher than the sum of the explanatory powers of the two factors acting independently. Overall, the top five interactions, in descending order, were found to be as follows: X4 ∩ X5, X2 ∩ X4, X1 ∩ X4, X4 ∩ X6, X3 ∩ X4. These results indicate that the resident activity pattern is most significantly affected by the combination of a micro-location condition, represented by the distance to the nearest subway station, and a macro-location condition, represented by the distance to the city center. Moreover, although the influences of land price and population density are minor when they act independently, they remain important basic factors that cannot be ignored.
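The interaction types can be sketched as a comparison of three q values: the two single-factor values and the value for the interaction, which in the geographical detector is obtained by overlaying the two stratifications. The sketch below follows the commonly used five-way classification; the function name and the numbers are hypothetical and are not the values reported in Table 8.

```python
import math


def interaction_type(q1, q2, q12, tol=1e-9):
    """Classify the interaction of two factors from their q values."""
    low, high, total = min(q1, q2), max(q1, q2), q1 + q2
    if math.isclose(q12, total, abs_tol=tol):
        return "independent"
    if q12 > total:
        return "enhance, nonlinear"
    if q12 > high:
        return "enhance, bi-"
    if q12 < low:
        return "weaken, nonlinear"
    # Remaining case: q12 lies between the two single-factor values.
    return "weaken, uni-"


# Illustrative values only.
example = interaction_type(0.30, 0.20, 0.45)
```

Here the interaction value 0.45 exceeds the larger single-factor value 0.30 but stays below the sum 0.50, so the pair is classified as "enhance, bi-", the type reported for all factor pairs in this study.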

Discussion and Conclusions
User-initiated social media data, based on a social network platform, contain a wealth of information on resident behavior dynamics, which is of great significance for the understanding of the spatiotemporal patterns and dynamic laws of resident activities in the information age [60]. Nevertheless, existing research remains limited in mining the resident activity information contained in social media data. Typically, various spatiotemporal clustering methods and their variants are used to cluster digital footprints from social media platforms, and keyword extraction or topic model clustering is performed to identify residents' daily activities. Most of these methods only consider the location information of resident activity while ignoring the geographic background, which leads to certain problems in the identification of activity types [48]. In this regard, some scholars have pointed out that social media big data contain not only spatial information (e.g., locations and place names), but, more importantly, rich contextual and semantic information. Via NLP technology, spatiotemporal information can be extracted from text, and the semantics of the places, resident activities, and emotional experiences behind the text can also be mined [17].
The present study used an integrated method that combines NLP technology and spatiotemporal analysis to fuse textual and spatiotemporal information. The results revealed that the BERT-based text classification model identified residents' daily activities with an accuracy of more than 90%, which helps to address the low utilization of social media text data and the poor integration of spatiotemporal and semantic information. It also provides a solid data foundation for exploring the spatiotemporal patterns and laws of human activity hidden in social media data, and offers a new research framework for the study of residents' daily activities in the mobile information era. Furthermore, this study treated residents' daily activities and their spatiotemporal information as a complete system and comprehensively explored the diversified and heterogeneous characteristics of resident activities. This addresses the fragmented treatment of activity types in existing research and offers guidance for systematically revealing urban dynamics, urban rhythms, and urban spatial structures from the perspective of residents' daily activities.
The findings of this research can be summarized as follows. First, residents are more inclined to share their entertainment, eating, and studying activities in online communities than their work and social activities, with entertainment and study spaces becoming the main physical spaces corresponding to virtual online spaces. Second, the distribution differences and types of resident activities are closely related to the characteristics of the activities and holiday arrangements. Moreover, compared with weekdays, activity on weekends was found to be more intense and to last longer, and the activity rhythm was found to be delayed by one hour. Third, there was significant spatial clustering of resident daily activities, with the main hotspot areas concentrated in the central city within the Fifth Ring Road and differing between the north and south, with more activities in the north. The cluster patterns of resident daily activities can be divided into five modes, namely single-activity-dominant areas and multiple activity co-location clusters (co-location clusters I-IV). There are certain differences between the spatial distributions and activity combination types of the various cluster patterns. In general, while the co-location cluster pattern has taken shape in Beijing, the proportion of balanced co-location cluster areas remains low. These cluster areas are mainly concentrated in the central urban district within the Sixth Ring Road and some urban districts in the outer suburbs and counties. Finally, the results indicate that the location conditions, especially the micro-location condition (distance to the nearest subway station), are the main factors that affect the resident activity cluster pattern. However, land price and population density, despite their limited influence when acting independently, remain fundamental influencing factors of the resident activity cluster patterns that cannot be ignored.
However, user data from social media platforms are often affected by spurious correlation problems, and their spatial and temporal dynamics may be partially linked to accidental events [75,76]. Therefore, the relevant data should be cleaned and filtered before being fed into the model for subsequent operations. Meanwhile, because the user groups of social media platforms are biased, the conclusions and patterns obtained from social media data should be limited in scope (e.g., the conclusions in this paper mainly reflect the activity patterns of relatively young groups). To improve the quality and reliability of the conclusions, data from other crowd-sourcing platforms can be combined to corroborate them in a subsequent study [77]. In addition, the in-depth mining of social text data should continue to be strengthened, and multi-label technology should be employed to identify text information types and hidden content more efficiently. Moreover, spatiotemporal correlation data should be combined with information about other urban elements to effectively connect the users' places of residence, work, and activities, construct the daily life chains of residents, and improve the overall perception of residents' daily life space and the dynamic understanding of the urban spatial structure. Such research will provide more effective technical and theoretical support, as well as a governance basis, with which to solve the practical problems of residents' daily activities and construct an efficient, convenient, and livable living pattern.