Article

Location-Based Social Network’s Data Analysis and Spatio-Temporal Modeling for the Mega City of Shanghai, China

1 School of Communication & Information Engineering, Shanghai University, Shanghai 200444, China
2 Institute of Smart City, Shanghai University, Shanghai 200444, China
3 School of Computer Science, University of Technology Sydney, Ultimo NSW 2007, Australia
* Author to whom correspondence should be addressed.
ISPRS Int. J. Geo-Inf. 2020, 9(2), 76; https://doi.org/10.3390/ijgi9020076
Submission received: 15 December 2019 / Revised: 20 January 2020 / Accepted: 27 January 2020 / Published: 29 January 2020
(This article belongs to the Special Issue Geovisualization and Geo Visual Knowledge Discovery)

Abstract

The aim of the current study is to analyze and extract the useful patterns from Location-Based Social Network (LBSN) data in Shanghai, China, using different temporal and spatial analysis techniques, along with specific check-in venue categories. This article explores the applications of LBSN data by examining the association between time, frequency of check-ins, and venue classes, based on users’ check-in behavior and the city’s characteristics. The information regarding venue classes is created and categorized by using the nature of physical locations. We acquired the geo-location information from one of the most famous Chinese microblogs called Sina-Weibo (Weibo). The extracted data are translated into the Geographical Information Systems (GIS) format, and after analysis the results are presented in the form of statistical graphs, tables, and spatial heatmaps. SPSS is used for temporal analysis, and Kernel Density Estimation (KDE) is applied based on users’ check-ins with the help of ArcMap and OpenStreetMap for spatial analysis. The findings show various patterns, including more frequent use of LBSN while visiting entertainment and shopping locations, a substantial number of check-ins from educational institutions, and that the density extends to suburban areas mainly because of educational institutions and residential areas. Through analytical results, the usage patterns based on hours of the day, days of the week, and for an entire six months, including by gender, venue category, and frequency distribution of the classes, as well as check-in density all over Shanghai city, are thoroughly demonstrated.


1. Introduction

The study of extracting valuable information and gaining useful insights from spatio-temporal data has become very important in recent years. Due to the popularity of Location-Based Social Networks (LBSNs) in the modern era, the availability of huge amounts of information generated by users has become invaluable, especially from a practical point of view. The information extracted from this data can be used in many application areas, such as analysis of public transit flows, location recommendations, population density estimation, route planning, disaster management, and many more [1]. Online services encourage different users to share their activities and interests with their social friends, and moreover generate enormous amounts of data, enabling researchers to understand users’ activities, patterns, and preferences more accurately. These online services provide and store the information of users by considering their real-time locations. The data collected through such online services are generally enriched with multimedia, text, geo-location, and metadata, which can be further used to conduct studies about various aspects of human behavior.
In recent years, numerous studies have been conducted to analyze and model human activities based on spatio-temporal data [2]. The most frequently used data in recent studies have included users’ check-ins from different LBSNs, despite the fact that they pose sampling problems, such as biases in gender, age, and social classes. The term “check-in” means that a user confirms his/her location on the LBSN while engaging in an activity at a specific location, or the user automatically shares his/her location with someone when sending a message on such a network [3,4]. With the exponential growth in use of applications that include location-based services, such as Facebook [5], Twitter [6], and Foursquare [7] worldwide, as well as Weibo [8] in China, many studies have been conducted to identify the relationships and extract patterns between users such as men and women, highly educated or less-educated groups, age classifications, and so forth. These patterns reflect functional characteristics within the city, as well as among different cities [9,10,11]. Weibo is not only famous among users but also among researchers, as the check-in data from Weibo can be utilized to carry out various kinds of studies in order to extract useful information based on the geo-location data it provides. For example, some recent studies have analyzed road crashes in Shanghai [12], investigated the growth of urban boundaries in Beijing [13], analyzed tourism venue attraction features by using Weibo data from the period 2012–2014 [14], and spatio-temporally analyzed gender in Beijing [15]. Most of these studies have covered the check-ins of specific users, or specific application fields, such as tourism, city boundaries, road crashes, spring festival rushes, and gender [16]. 
However, to the best of our knowledge, there is no comprehensive study based on the Weibo data of Shanghai combining both temporal and spatial analysis while considering the characteristics and nature of the check-ins from different venues. Therefore, we performed three different kinds of analyses on check-in data from Weibo for six months (January 1st 2017 to June 30th 2017) in Shanghai, covering spatio-temporal analyses with distinct venue classifications and various aspects of the users’ check-in data.
This article makes three key contributions. First, we present a temporal analysis from hourly to weekly patterns, and for the total study period (180 days), of check-in data acquired from the most famous Chinese LBSN, Weibo, and consider more than 220,000 check-ins gathered over six months. Second, we use the dataset to classify and study 10 different types of venue categories, where each venue was visited at least 100 times within the study period. Third, we performed a spatial analysis for the mapping location of each venue, and afterward estimated the density to yield results for a better understanding of the typical concentration of check-ins near the city center with an additional insight into where the educational institutes and residential areas were mainly contributing to the diversity of data all over Shanghai city. The temporal analysis was carried out by using IBM SPSS 25 [17], which is popular among researchers, to discover various patterns in the data with respect to time. We present both graphical charts and statistical results with detailed discussions for a clear understanding of the research findings. Check-in venue classification was done by using geo-data from the dataset and information about physical locations in Shanghai city. The classification was based on the nature, purpose, and characteristics of check-in locations. For example, a check-in with the latitude and longitude of a university belongs to the ‘Educational’ category, whereas a check-in location in different restaurants is in the ‘Food’ category. For the spatial analysis, we used ArcMap, the Kernel Density Estimation (KDE) technique [18], and geospatial data (Shape Files) from OpenStreetMap [19].
The rest of the paper is organized as follows. Section 2 covers related work on the LBSN, big data analysis through research on its applications in various fields, and check-in data-based research articles related to Weibo, China, and Shanghai. Section 3 includes an overview of the dataset and study area, a detailed description of the methodologies used for data collection and preparation, and the temporal and spatial analysis. The results, along with illustrations of our findings, are presented in Section 4. Section 5 concludes our study with a summary of the findings, the limitations of our dataset, and future research directions.

2. Related Work

In the last few decades, the interest of researchers in big data has increased exponentially and research into big data, compared to other fields of computer science, has attracted a tremendous amount of attention. The term itself and articles such as “Big data is opening doors, but maybe too many” [20] and “Big data: the greater good or invasion of privacy?” [21] suggest a perception of volume; however, there are more features to be considered regarding big data, such as complexity, structure, behavior, and the tools, techniques, and technologies used to process and analyze it [22]. Dumbill [23] discussed three different dimensions of big data: volume, velocity, and variety of contents. Mayer-Schönberger and Cukier [24] highlighted the three main challenges of big data: populations instead of samples, messy instead of clean data, and correlation instead of causality. Additionally, Miller and Goodchild [25] defined big data as data that cannot be analyzed using traditional tools. In 2013, Ovadia and Librarian [26] emphasized the importance of big data for social scientists and librarians, and suggested that it is much too important to be ignored, as most social science research is based on huge amounts of data and enormous datasets.
As a central focus of many study fields, including time and space geography, urban functionalities, and human mobility, big data analysis is a vast research field that was initially studied using statistical data from surveys, interviews, travel diaries, questionnaires, and other manual collections of datasets [27,28,29]. Statistical data collection may not be an efficient way to determine patterns in said fields and related studies; therefore, data from mobile devices, smart cards, global positioning system (GPS) navigators, and location-based and online applications containing users’ activities with geo-locations are widely used, and have been found to be more efficient for such studies in recent years [24,30,31,32,33,34]. With the advancement of mobile technologies and widespread use of mobile devices, it is easy to track users’ locations from their devices and activities. For example, Gonzalez et al. [35] introduced a dataset that contained data from 100,000 users over six months. Although the data only contained the nearby location of the mobile phone towers from where the phone call originated, it still proved to be very helpful in estimating the approximate locations of users with a certain margin of time, and was subsequently used in the prediction of human movement [36]. Various properties of Geographical Information System (GIS) functionalities and their potential role in urban mining studies were reviewed by Zhu [37] through a discussion on how GIS data can be utilized to analyze, visualize, report, and mine the temporal or spatial features of recyclable waste and its collection and recovery systems. The modern digitized world allows researchers to conduct quantitative analyses of user activity patterns and related factors, such as living area, social contacts, and personal references [38,39,40]. Fan et al. [41] categorized user activity research into three different classes, namely location prediction, trajectory mining, and location recommendations. The authors also emphasized its role in our understanding of user activity patterns, and how it can be beneficial in many areas, such as traffic control, disaster relief, mobile marketing, city planning, and public health.
One of the most significant sources of big data is online social networks because of their widespread and ever-growing use in almost every part of the world [16]. LBSNs allow users to share their current locations, activities, and interests, and generate data that provide us with the opportunity to conduct different kinds of studies in various fields. The analytical methods for, and the studies conducted on human activities from, mobile data are discussed in various works [42,43,44]. The use of LBSNs was investigated by Lindqvist et al. [45], followed by a number of studies on human activity patterns based on LBSN data [16,46,47,48]. Zhang and Chow [49] presented personalized geo-social recommendations based on LBSNs by using two different datasets (Foursquare and Gowalla), and observed similar patterns in both datasets. Preoţiuc-Pietro and Cohn [16] also investigated 10,000 Foursquare users for a better understanding of human activity patterns across different venue categories. They further divided the users into various clusters based on their behavior, and predicted their movements based on frequency. Colombo et al. [50] used similar data from two different cities in the United Kingdom to improve recommendation systems, by collecting more frequent check-ins at various venues. Li et al. [51] conducted a broader study by using Foursquare data from 14 different counties and 2.4 million venues to uncover the reasons for venue popularity. It was concluded that there are three main reasons influencing the popularity of a venue: (1) venue profile information, as venues with complete profile information are undoubtedly more popular; (2) venue age, as people tend to visit known and famous places; and (3) venue category, as venues under the ‘Food’ category were found to have the highest number of check-ins. Alrumayyan et al. [2] studied people’s patterns related to various venue categories in the capital city of Saudi Arabia, Riyadh, with the support of Foursquare data. The study focused on the ‘Food’ category because people are more interested in sharing their experience and leaving comments while visiting food venues. LBSN data have been used in a number of critical fields. Graham et al. [52] studied the importance of LBSNs in assisting local governments by conducting a survey of more than 300 local government officials from the United States. They discussed the contribution of social media tools to the management of a crisis, resulting in positive relationships with the ability of users to control the crisis situation. Other similar studies highlighting the use and ability of social media in crises include articles on the wildfires in California [53], Hurricane Sandy, and the earthquake in Haiti [54]. A study by Lin et al. [55] in New York City, San Francisco, and Hong Kong, based on check-in data from more than 19,000 users of Swarm (an app by Foursquare), discussed the user preferences and associations between different venue categories at different times of the day. Loo et al. [12] used LBSN data and the kernel density method to study the spatial distribution of road crashes in Shanghai.
Much research has been done in the last few years to uncover different features in LBSN data. Most researchers studied information from LBSNs such as Twitter and Foursquare to investigate a variety of patterns, including users’ activity and mobility, urban planning, and venue categorization. Weibo is a famous LBSN in China, and has been proven to be an efficient source of data for this type of analysis. A case study of Shenzhen, China, introduced an approach that uses LBSN check-in data to analyze the attraction features of tourism venues, based on Weibo data from the period 2012–2014 [14]. Long et al. [13] used human mobility and activity patterns based on Weibo data and proposed a framework to analyze the growth of urban boundaries for the city of Beijing. Another study by Shi et al. [56] used Weibo data for mining the tourism crowd in Shanghai. This study initially analyzed check-in data from Weibo to determine the popularity of tourism venues and afterward used spatial pattern analysis to find the association between these venues, followed by a sentiment analysis of tourist opinions from Weibo contextual information. Rizwan et al. [57] used Weibo data from early 2016 to observe check-in behavior and gender differences.
However, to our knowledge, there is no comprehensive study for the area of Shanghai that mines both the temporal and spatial characteristics of check-ins, and associates the Weibo check-in features with different venue classes within the city. Our goal is to study the volatility and patterns of users’ check-ins at different time scales (e.g., time of the day, day of the week, over six months) in association with the type of venues, examine how these patterns vary across venue categories, and finally show the locations of different venues and the density of check-ins from Weibo for the time period of January 2017 to June 2017.

3. Material and Methods

3.1. Data Source

The data used in the current study are from one of the most popular Chinese microblogs, Weibo. Facebook and Twitter are the most popular LBSNs in the world. In China, Weibo, a hybrid of Facebook and Twitter, is one of the most dominant LBSNs [56]. It has become a major platform, enabling users to share their activities, opinions, preferences, and locations along with audio, images, and videos through check-ins and posts, alongside communicating with their friends. Since Weibo was launched on 14 August 2009, the number of users, check-ins, and activities has increased rapidly. Weibo provides different types of geo-spatial resources; three of the main resources are user-profile locations, places mentioned in posts, and real-time locations shared through check-ins. By the end of 2018, the total number of users had increased to over 500 million, with 462 million monthly active and 200 million daily active users. Weibo launched an international version in March 2017 and claims to have users in more than 190 countries [57,58]. This study mined check-in patterns across different venue classes and estimated the check-in density on a real map, using SPSS, ArcMap, and OpenStreetMap. The socially generated spatio-temporal data from Weibo cover the city of Shanghai, China, for a period of six months, from 1st January 2017 to 30th June 2017.

3.2. Study Area

This study was conducted on Weibo data taken for Shanghai, China, which is situated on the eastern edge of the Yangtze River Delta between 30°40′–31°53′ N and 120°52′–122°12′ E, with a total area of 8359 square kilometers, as shown in Figure 1. In 2016, Shanghai was divided into 15 districts and one county, namely Baoshan, Changning, Fengxian, Hongkou, Huangpu, Jiading, Jingan, Jinshan, Minhang, Pudong New Area, Putuo, Qingpu, Songjiang, Yangpu, Xuhui, and Chongming (which was not included as it is rarely visited by people) [57].
As the economic center of China, Shanghai connects China to the global economy. The total Gross Domestic Product (GDP) of Shanghai in 2016 was 2.7 trillion Chinese Yuan, with an average increase of 7.4% over the past five years, and the per capita GDP reached 15,290 USD (103,100 Yuan). With an average population of 3854 people per square kilometer in urban areas, Shanghai ranks first in China and fifth in the world for population density, and around 0.66 million people move in annually. Its population grew from 16.74 million to 23.02 million between 2000 and 2010, an increase of 37.53%, and exceeded 24 million residents by the end of 2015 [59]. The main reason for this growth is the large number of migrants, who made up 39% of the total population of Shanghai in 2010. The recent master plan greatly emphasizes providing more facilities and administration for the betterment of the residents of Shanghai (Shanghai Master Plan (2016–2035)) [60].

3.3. Methodology

This section describes the data acquisition and preparation of the dataset used in our research, and provides an overview of our descriptive and analytical methods. Our methodology consists of the following: data acquisition and preparation, temporal (descriptive) analysis, and spatial analysis.

3.3.1. Data Acquisition and Preparation

The primary motivation for the use of LBSNs is to share interests and activities and thereby build new and close social relationships, which in turn enables researchers to discover patterns in users’ activities and preferences from the big data generated by the LBSN. The data source for this research is Weibo, which is regarded as one of the most popular microblogs in China. We used a Python-based Weibo API (Application Programming Interface) to collect data in specific regions of China, specifically Shanghai city. We collected our data during 2017, and initially there were approximately 3.5 million check-ins from about 2 million users. The data acquired from Weibo were in the standard API format, JavaScript Object Notation (JSON). The JSON format was converted into CSV (Comma-Separated Values) format using MongoDB for further analysis.
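The JSON-to-CSV step can be illustrated with a minimal sketch. The study performed this conversion with MongoDB; the version below uses only the Python standard library as a stand-in, and the field names (`user_id`, `gender`, `checkin_time`, `location_id`, `lat`, `lon`) are hypothetical, since the exact layout of the Weibo API payload is not given here.

```python
import csv
import io
import json

# Hypothetical field names; the real Weibo API payload may differ.
FIELDS = ["user_id", "gender", "checkin_time", "location_id", "lat", "lon"]


def json_lines_to_csv(json_lines, out_file):
    """Write one CSV row per JSON record, keeping only the columns in FIELDS."""
    writer = csv.DictWriter(out_file, fieldnames=FIELDS)
    writer.writeheader()
    rows = 0
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Missing attributes become empty strings; extra attributes are dropped.
        writer.writerow({key: record.get(key, "") for key in FIELDS})
        rows += 1
    return rows


# Two made-up records, one JSON object per line.
sample = [
    '{"user_id": "u1", "gender": "f", "checkin_time": "2017-01-01 10:15:00", '
    '"location_id": "v1", "lat": 31.23, "lon": 121.47, "text": "hello"}',
    '{"user_id": "u2", "gender": "m", "checkin_time": "2017-01-01 22:40:00", '
    '"location_id": "v2", "lat": 31.2, "lon": 121.44}',
]
buffer = io.StringIO()
n_rows = json_lines_to_csv(sample, buffer)
csv_text = buffer.getvalue()
```

Writing to an in-memory buffer keeps the sketch self-contained; in practice `out_file` would be a file opened with `newline=""`, as the `csv` module recommends.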
The initial dataset included several attributes, such as User_ID, Gender, Check-in Date/Time, account creation Date/Time, Location_ID, and text messages. The dataset was first filtered for anomalies, missing attributes, and attributes irrelevant to our study. To focus the analysis on the most important venues, we included only venues with more than 100 check-ins within the study period of six months. The final dataset included 166,898 users with 222,525 check-ins at 722 different venues. A sample of the final dataset is shown in Table 1.
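The venue filter described above can be sketched as follows. This is an illustration, not the authors' code: the record layout (dictionaries with a `location_id` key) is an assumption, and the threshold is read as strictly more than 100 check-ins.

```python
from collections import Counter


def filter_frequent_venues(checkins, min_checkins=100):
    """Keep only check-ins whose venue has MORE than `min_checkins` check-ins."""
    counts = Counter(c["location_id"] for c in checkins)
    keep = {venue for venue, n in counts.items() if n > min_checkins}
    return [c for c in checkins if c["location_id"] in keep]


# Toy data: venue A has 101 check-ins, venue B exactly 100, venue C only 3.
toy = (
    [{"user_id": f"u{i}", "location_id": "A"} for i in range(101)]
    + [{"user_id": f"u{i}", "location_id": "B"} for i in range(100)]
    + [{"user_id": f"u{i}", "location_id": "C"} for i in range(3)]
)
filtered = filter_frequent_venues(toy)
remaining_venues = {c["location_id"] for c in filtered}
```

Only venue A survives in the toy data, because B sits exactly on the threshold and C falls well below it.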

3.3.2. Temporal Methods

We performed a descriptive statistical analysis using IBM SPSS 25 on the dataset to reveal various patterns in the check-ins of users based on check-in frequencies at different hours of the day, different days of the week, and for all individual days throughout the study period of six months. Various check-in venue categories were examined to investigate from where people used LBSNs more frequently. All of the descriptive results include gender, in order to show the frequency patterns of both males and females.
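The frequency tables behind such an analysis can be sketched in a few lines. The study used IBM SPSS 25; the code below is only a standard-library illustration, and the record layout (`checkin_time` as a `YYYY-MM-DD HH:MM:SS` string, `gender` as `"f"` or `"m"`) is assumed.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def temporal_counts(checkins):
    """Count check-ins per (hour, gender) and per (weekday, gender)."""
    by_hour = Counter()
    by_weekday = Counter()
    for c in checkins:
        ts = datetime.strptime(c["checkin_time"], "%Y-%m-%d %H:%M:%S")
        by_hour[(ts.hour, c["gender"])] += 1
        by_weekday[(WEEKDAYS[ts.weekday()], c["gender"])] += 1
    return by_hour, by_weekday


# Made-up records; 1 January 2017 was a Sunday.
sample_checkins = [
    {"checkin_time": "2017-01-01 22:40:00", "gender": "f"},
    {"checkin_time": "2017-01-01 22:05:00", "gender": "m"},
    {"checkin_time": "2017-01-02 09:30:00", "gender": "f"},
]
hourly, weekly = temporal_counts(sample_checkins)
```

Counting by calendar date instead (for the 180-day view) would only require a third counter keyed on `ts.date()`.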
The venue categorization was completed by comparing latitude/longitude and location names from the dataset with real locations all over the city. This study includes famous and frequently visited locations, and therefore highlights venue categories with the maximum number of check-ins. Each check-in was assigned the category that best suits its venue. The overall flow of our research methodology is shown in Figure 2.
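One way to automate the coordinate comparison is a nearest-reference lookup. This is a sketch under stated assumptions, not the procedure used in the study: the reference venues, their coordinates, and the 0.2 km matching radius are all hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def categorize(lat, lon, reference, max_km=0.2):
    """Return the category of the nearest reference venue, or None if too far away."""
    best = min(reference, key=lambda v: haversine_km(lat, lon, v["lat"], v["lon"]))
    if haversine_km(lat, lon, best["lat"], best["lon"]) <= max_km:
        return best["category"]
    return None


# Hypothetical reference venues (illustrative coordinates, not from the dataset).
REFERENCE_VENUES = [
    {"name": "Example University", "lat": 31.32, "lon": 121.39, "category": "Educational"},
    {"name": "Example Restaurant", "lat": 31.23, "lon": 121.47, "category": "Food"},
]
near_campus = categorize(31.3201, 121.3901, REFERENCE_VENUES)
far_away = categorize(31.0, 121.0, REFERENCE_VENUES)
```

A check-in a few meters from the example campus is labeled ‘Educational’, while one far from every reference venue stays unlabeled and would need manual review.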

3.3.3. Spatial Methods

To observe the geo-data on a map, we collected the map attributes from OpenStreetMap and used Shape files in ArcMap with a built-in Python programming platform to show the actual locations of the venues and density of check-ins within the study area of Shanghai. OpenStreetMap is a geo-information platform providing real-time and user-generated content related to the global map, including various attributes of maps such as roads, canals, streets, and districts, and is available free of cost. It is widely used by researchers to analyze and visualize geo-spatial data [61].
In order to obtain a more accurate and smooth density, we used KDE. KDE is a multivariate method that uses a random sample of data to estimate the density. We can calculate the density as shown in Equation (1):
$$ f_{KD}(l \mid D) = \frac{1}{n} \sum_{j=1}^{n} K_{h_j}(l, l_j) \qquad (1) $$
where $l_j$ is a two-dimensional location containing $x$ and $y$, and $D$ is the set of data. Using the bandwidth $h$ for both spatial dimensions and the Gaussian kernel function $K(\cdot)$ provides an efficient way of estimating the density [62].
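A minimal sketch of such an estimator follows. It makes three assumptions: a single fixed bandwidth `h` shared by both dimensions, coordinates in a planar (projected) system, and an isotropic Gaussian kernel of the form exp(-d²/(2h²)) / (2πh²). It is not the ArcMap implementation used in the study.

```python
import math


def gaussian_kernel(l, lj, h):
    """Isotropic 2D Gaussian kernel with bandwidth h, evaluated at l for sample lj."""
    dx, dy = l[0] - lj[0], l[1] - lj[1]
    return math.exp(-(dx * dx + dy * dy) / (2 * h * h)) / (2 * math.pi * h * h)


def kde(l, data, h):
    """Average of the kernel contributions of all n sample locations in `data`."""
    return sum(gaussian_kernel(l, lj, h) for lj in data) / len(data)


# Toy planar coordinates: a tight cluster near the origin plus one distant point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
density_at_cluster = kde((0.0, 0.0), points, h=1.0)
density_far_away = kde((10.0, 10.0), points, h=1.0)
single_point_peak = kde((0.0, 0.0), [(0.0, 0.0)], h=1.0)
```

With one sample and `h = 1`, the density at the sample itself equals 1/(2π), which is a quick sanity check on the normalization. A smaller `h` gives sharper hotspots, and a larger `h` gives a smoother surface.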

4. Results

With the advancements in online services, wireless communication, mobile devices, and location-sharing technologies, LBSNs such as Facebook, Foursquare, Twitter, and Weibo are attracting researchers’ attention due to the huge amount of data generated by these LBSNs. The data can be used to extract very useful information for urban planning, crisis and disaster management, and for other fields of study involving big data with a high spatio-temporal resolution. The current study had three different aspects of analysis: temporal, check-in venue classification, and spatial analysis of the Weibo data for Shanghai. This section includes the results and discussion of these three aspects.

4.1. Temporal Patterns

The temporal check-in analysis further consists of three parts: daily patterns, weekly patterns, and check-in patterns for 180 days, from 1st January to 30th June 2017 (the study period of our research). All of these results also highlight the frequency of both male and female users.

4.1.1. Daily Patterns (Hours)

To investigate the check-in frequency pattern of Weibo users, we observed the distribution of check-ins for 24 h of the day, as shown in Figure 3. It can be observed that routine activities have a profound impact on the number and time of check-ins. For instance, the number of check-ins starts rising in the early morning, becomes considerable after 10 a.m., and increases further after 12 p.m., while check-ins start declining after midnight. The peak of check-ins occurred between 10 p.m. and 12 a.m., a typical time frame for the social activities of many people.
It can be seen in Figure 3 that on the time scale from midnight to midnight (00 to 24 h), the check-in frequency is concentrated toward the right-hand side of the axis, showing more check-ins in the afternoon, evening, and before midnight. The kurtosis is close to 3, the value of a normal distribution, which suggests an approximately normal shape. There are fewer check-ins after midnight and in the early morning because of the sleeping routine of Shanghai residents. In Shanghai, one of the most developed cities of China, the check-in frequency pattern of males and females is almost the same, but the number of check-ins differs because of the different numbers of male and female users in our dataset. The frequency is moderate until the afternoon because people are mostly at work, and it increases as they finish work and meet their friends and families or visit places, before eventually decreasing for the night period.
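The shape statistics mentioned here can be computed directly from an hourly histogram. The sketch below uses the standard population moment definitions (skewness = m3 / m2^1.5, kurtosis = m4 / m2^2, where a normal distribution has kurtosis 3); the toy counts are made up and are not the values from Figure 3.

```python
def shape_from_counts(counts):
    """Mean, skewness, and (non-excess) kurtosis of a histogram.

    counts[h] is the number of check-ins observed at position h (e.g., hour h).
    """
    n = sum(counts)
    mean = sum(h * c for h, c in enumerate(counts)) / n
    m2 = sum(c * (h - mean) ** 2 for h, c in enumerate(counts)) / n
    m3 = sum(c * (h - mean) ** 3 for h, c in enumerate(counts)) / n
    m4 = sum(c * (h - mean) ** 4 for h, c in enumerate(counts)) / n
    return mean, m3 / m2 ** 1.5, m4 / m2 ** 2


# Symmetric toy histogram: skewness 0.
sym_mean, sym_skew, sym_kurt = shape_from_counts([1, 2, 1])

# Toy histogram with most mass at the latest position.
late_mean, late_skew, late_kurt = shape_from_counts([1, 1, 1, 5])
```

Under this convention, a histogram whose mass sits at the late end of the axis has negative skewness (its long tail points toward the early hours), which is worth keeping in mind when comparing against software output.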

4.1.2. Weekly Patterns (Days)

This section analyzes the weekly rhythm of check-ins. Weekly patterns suggest that users’ check-ins are more frequent on weekends than on weekdays. The full view of the total number of check-ins for each day of the week can be seen in Figure 4.
It can be observed from Figure 4 that most of the check-ins took place on Saturday and Sunday, suggesting that people use LBSNs more on their days off. Users tend to increase their social activities after work on Friday and throughout Saturday and Sunday, and this increase sometimes lasts through Sunday night; therefore, more activities occur from Friday night until Monday morning. The figure illustrates that the frequency of check-ins is highest on Saturdays and Sundays, followed by Friday and Monday. Tuesday, Wednesday, and Thursday show the minimum number of social activities throughout the week.

4.1.3. Patterns by Date (180 Days)

This section presents the daily trends in the total number of check-ins by Weibo users for 180 days (1st January 2017 to 30th June 2017) in Shanghai. Figure 5 shows the variations of check-in frequencies for both males and females during the study period.
Figure 5 demonstrates that the maximum number of check-ins occurred in the first two weeks of April 2017. The main reasons include Shanghai Fashion Week, the Shanghai Formula 1 Grand Prix, the Shanghai Ballet Company’s ‘Swan Lake’, and Easter, all of which fall under the ‘Entertainment’ category. This category had the highest impact, partly because of the number of venues belonging to it in the dataset. The fewest check-ins occurred in the last two weeks of January 2017 and the first two weeks of February 2017 because of the periodic mass migration of people around the Chinese New Year, or Chinese Spring Festival [63], when a massive number of Shanghai residents return to their hometowns on vacation; migrants accounted for 39% of the total population of Shanghai in 2010. The results also reveal that people tend to share their activities using LBSNs more when visiting places and meeting friends than when they are at home or at work.

4.2. Check-In Venue Categories

One of the primary advantages of using LBSN data is the ability to identify the location of the check-in activity, along with its purpose. Each check-in provides the latitude and longitude of the original location by the LBSN (e.g., Weibo) [64]. When searched for in the LBSN data, the latitude and longitude give a specific location on a geo-referenced map. This location can be used to obtain information about the visited venue. We classify these venues based on their type and the activities performed at them.
In this study, we use only the most general types of the hierarchy, containing 10 different venue types: ‘Educational’, ‘Entertainment’, ‘Food’, ‘General Location’, ‘Hotel’, ‘Professional’, ‘Residential’, ‘Shopping & Services’, ‘Sports’, and ‘Travel’, based on the most frequent checked-in latitude/longitude and real-world locations in Shanghai. The categories and examples of the check-in locations are given in Table 2 below.
According to the above criteria, the distribution of locations can be put into categories, as shown in Table 3. The check-in venue class ‘Entertainment’ contains 136 distinct locations, and 88, 27, 55, 30, 51, 172, 90, 26, and 47 locations are found in ‘Educational’, ‘Food’, ‘General Location’, ‘Hotel’, ‘Professional’, ‘Residential’, ‘Shopping & Services’, ‘Sports’, and ‘Travel’, respectively. We investigated more interesting patterns by applying the same category distribution to the whole dataset through different characteristics of prescribed categories. The most common characteristics of the 10 venue categories are given below.
First, we looked at the number of locations, users, and check-ins of each category in our dataset. The categories with the highest number of users and check-ins are ‘Entertainment’ and ‘Shopping & Services’. ‘Residential’ and ‘Educational’ also have a significant number of users and check-ins compared to the other categories. This reflects the regular day-to-day behavior of people; as expected, people in entertainment and shopping places tend to use LBSNs more than people working in their offices. Another insight is that students, and people spending their free time at home, use LBSN services more frequently. Hence, the results are similar to expectations, with one additional trend: the number of check-ins in the ‘Residential’ category is lower than that in ‘Entertainment’ and ‘Shopping & Services’, despite ‘Residential’ having the greatest number of locations in the dataset.
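As a quick consistency check, the per-category location counts reported above can be summed and compared with the 722 venues of the final dataset. The short sketch below does this and also derives each category's share of locations; only the counts themselves come from the text.

```python
# Number of distinct locations per venue category, as reported in the text.
LOCATIONS_PER_CATEGORY = {
    "Educational": 88,
    "Entertainment": 136,
    "Food": 27,
    "General Location": 55,
    "Hotel": 30,
    "Professional": 51,
    "Residential": 172,
    "Shopping & Services": 90,
    "Sports": 26,
    "Travel": 47,
}

total_locations = sum(LOCATIONS_PER_CATEGORY.values())
largest_category = max(LOCATIONS_PER_CATEGORY, key=LOCATIONS_PER_CATEGORY.get)

# Share of all locations held by each category, in percent.
shares = {
    name: round(100 * count / total_locations, 1)
    for name, count in LOCATIONS_PER_CATEGORY.items()
}
```

The counts add up to 722, and ‘Residential’ holds the largest share of locations, even though it does not lead in check-ins.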

4.3. Spatial Patterns

In this section, we investigate the spatial analysis by visualizing the location of check-in venue categories and the density of total check-ins by using the geo-location data from Weibo on a map of Shanghai. For this purpose, we used a map including features from OpenStreetMap, because it contains the most recent updates of the map features [65]. We can observe features such as city boundaries, districts and district boundaries, Shanghai Metro lines, and the road structure. With the help of these features, it is easy to evaluate and recognize the different locations on the map. For spatial analysis, we first plotted the locations of all the famous venues in Shanghai, as shown in Figure 6.
It can be observed from the above figure that, as per the planning of one of the major cities of China and ease of access, most of these locations are located either in the city center or near the Shanghai Metro. The seven districts, namely Changning, Huangpu, Putuo, Hongkou, Xuhui, Jingan, and Yangpu, situated in Puxi (Huangpu West) are collectively called the downtown area or the city center of Shanghai [66]. The downtown has a higher concentration of famous places, as would be expected in any major city; however, the Educational and Residential venues are relatively dispersed within the city.
Although plotting the venues gives us an abstract idea about the distribution of check-in locations, we need further investigation in order to analyze the spatial patterns in our dataset. Therefore, we used KDE to find the density of check-ins using ArcMap. We calculated the density based on the check-ins by all users, providing us with more accurate results for further analysis, as shown in Figure 7.
Figure 7 plots the density distribution of check-ins in different regions of Shanghai. Red represents the highest density and white represents the average, which eventually dissolves into the base color of the map according to the type of data [67]. Check-ins that did not satisfy the minimum criteria in our dataset were not considered; thus, such data do not appear on the map. As expected, the figure clearly indicates that check-ins are denser in the city center than in the regions away from it. Hongkou, Huangpu, and Jingan are the densest districts.
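To illustrate the idea behind a check-in density surface, the sketch below evaluates a two-dimensional kernel density estimate on a regular grid with NumPy. It is only an illustration under stated assumptions: the study itself used the KDE tooling in ArcMap, whose kernel and bandwidth settings are not reported here, so the Gaussian kernel, the bandwidth value, the function name `gaussian_kde_grid`, and the synthetic coordinates are all assumptions. Degrees are also treated as planar units, which a real analysis would replace with a projected coordinate system.

```python
import numpy as np


def gaussian_kde_grid(lon, lat, bandwidth, grid_size=50):
    """Evaluate a 2D Gaussian kernel density estimate on a regular grid.

    lon, lat  : 1D sequences of check-in coordinates (one entry per check-in).
    bandwidth : kernel standard deviation, in the same units as the coordinates.
    Returns (grid_lon, grid_lat, density); density has shape (grid_size, grid_size),
    indexed as density[lat_index, lon_index].
    """
    lon = np.asarray(lon, dtype=float)
    lat = np.asarray(lat, dtype=float)
    # Pad the grid by three bandwidths so edge kernels are not truncated much.
    grid_lon = np.linspace(lon.min() - 3 * bandwidth, lon.max() + 3 * bandwidth, grid_size)
    grid_lat = np.linspace(lat.min() - 3 * bandwidth, lat.max() + 3 * bandwidth, grid_size)
    gx, gy = np.meshgrid(grid_lon, grid_lat)

    # Squared distance from every grid cell to every check-in: shape (G, G, N).
    d2 = (gx[..., None] - lon) ** 2 + (gy[..., None] - lat) ** 2
    kernel = np.exp(-d2 / (2 * bandwidth**2)) / (2 * np.pi * bandwidth**2)
    # Average the kernels so the surface integrates to roughly one.
    density = kernel.sum(axis=-1) / lon.size
    return grid_lon, grid_lat, density


rng = np.random.default_rng(0)
# Synthetic stand-in for check-ins (not the study data): a dense downtown
# cluster plus a sparser suburban cluster.
downtown = rng.normal(loc=(121.47, 31.23), scale=0.01, size=(300, 2))
suburb = rng.normal(loc=(121.25, 31.05), scale=0.03, size=(50, 2))
points = np.vstack([downtown, suburb])

grid_lon, grid_lat, density = gaussian_kde_grid(
    points[:, 0], points[:, 1], bandwidth=0.02, grid_size=100
)
# Grid cell with the highest estimated density.
peak_row, peak_col = np.unravel_index(density.argmax(), density.shape)
peak = (grid_lon[peak_col], grid_lat[peak_row])
```

With the synthetic input, the peak falls inside the dense cluster, mirroring how the city center dominates the density surface in Figure 7. A larger bandwidth would smooth the surface and merge nearby clusters; a smaller one would emphasize individual venues.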
The spatio-temporal analysis was conducted by comparing weekly density for the first two weeks of April (having the maximum number of check-ins) with the last week of January and the first week of February (containing the minimum number of check-ins) as shown in Figure 8.
Figure 8 represents the density of check-ins in four different weeks. Two of them (Figure 8a,b) have the maximum number of check-ins (17,344 in the first week and 14,920 in the second week of April), and two (Figure 8c,d) have the minimum number of check-ins (4699 in the last week of January and 5952 in the first week of February), as shown in Figure 5. Although the density varies in different areas all over the city, the downtown area remains the densest area even with the smaller number of check-ins in the weeks of January and February; however, the overall check-in distribution covers a larger area during different periods of time. The downtown area is considered to be the commercial center of Shanghai; therefore, it has more facilities of almost every kind, including transportation, food, shopping malls, government offices, and nightspots. However, as Shanghai is a considerably developed and modern city with many parks and diverse Educational and Residential venues, check-in clusters can be observed in different places throughout the city.
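Selecting the busiest and quietest weeks requires grouping check-ins by week first. The sketch below shows one way this could be done with the Python standard library, assuming dates in the `M/D/YYYY` form seen in Table 1. How the study delimited its weeks is not stated, so the use of ISO calendar weeks, the function name `weekly_counts`, and the sample dates are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime


def weekly_counts(dates):
    """Count check-ins per ISO week from 'M/D/YYYY' date strings."""
    counts = Counter()
    for text in dates:
        day = datetime.strptime(text, "%m/%d/%Y").date()
        iso_year, iso_week, _ = day.isocalendar()
        counts[(iso_year, iso_week)] += 1
    return counts


# Illustrative dates only, not the study data.
sample_dates = ["6/30/2017", "6/29/2017", "1/2/2017"]
counts = weekly_counts(sample_dates)
busiest = max(counts, key=counts.get)   # week with the most check-ins
quietest = min(counts, key=counts.get)  # week with the fewest check-ins
```

The check-ins belonging to the selected weeks could then be passed separately to a density estimate to produce one panel per week, as in Figure 8.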

5. Discussion

The analysis shows that data from Weibo are an efficient resource for analyzing the distribution of user activities and preferences in terms of spatio-temporal aspects. One of the benefits of using LBSN data for spatio-temporal analysis is that we can extract and visualize large-scale information for a megacity such as Shanghai in more detail. Some areas in the downtown of Shanghai are crowded, while suburban areas have fewer visitors. This study intended to observe the behavioral traits of users by providing evidence that the dynamics of a megacity can be influenced by the facilities available and by the nature of different venues. We explored the spatio-temporal patterns in the check-ins to show the distribution of users in Shanghai. In this study, we performed an empirical analysis of check-ins using graphs, tables, and density maps based on LBSN data. The spatio-temporal patterns were studied from various perspectives, including hours, days, and venue categories. From the chronological perspective, the results verified that the frequency of check-ins rises from the middle of the day until late at night, and that activity increases noticeably on weekends compared to weekdays. From the spatial point of view, the spatial intensity of users was highest in the downtown area, as this is the center of activity for most of the venue classes.
This research used geo-tagged check-in data from an LBSN as a representation to approximate the general population of Shanghai, as it is more efficient than time- and labor-intensive questionnaires and surveys, and can therefore offer exceptional spatial and temporal coverage. Weibo provides an open geo-database and excludes all of the information related to the privacy of the users. However, this approach has its own limitations. For example, we have no way to measure the exact sample ratio of LBSN users to the population of Shanghai. In the evaluation and planning of a megacity, we can therefore only determine a correlation between check-in data and actual people, as the connection between Weibo check-ins and actual residents may vary across different areas. Although the LBSN provides many attributes compared to traditional census data, it generally does not directly include demographic data such as age, marital status, and ethnicity; there are, however, other ways to extract these kinds of data indirectly, as discussed by Longley and Adnan [68].
The comprehensive spatio-temporal coverage of this research provides results that could be beneficial in analyzing user activities in a city and, therefore, in the planning and development of large cities. It also provides a basis for using Weibo data to analyze individual categories, such as travel, food, and educational venues. The study of the influence of different venue types on the preferences of inhabitants in various urban areas has significant potential for urban planning and for understanding activity preferences among urban residents.

6. Conclusions

We used check-in data from Weibo to uncover various temporal and spatial patterns throughout the most famous places in Shanghai. The study examined three aspects: a temporal analysis to reveal the patterns based on time, a check-in venue classification to provide insight into LBSN users in each category, and a spatial analysis, resulting in a clear observation of venues and check-ins through mapping. The findings demonstrated that people tend to use LBSNs more in the evening than in the morning and during the working day. We also observed that LBSNs are used more widely while visiting entertainment and shopping locations. Check-ins from educational institutions are substantial, suggesting that students are frequent users of LBSNs. Though many of the results are similar to what we expected, we obtained some interesting facts about the use of LBSNs. For example, despite the residential category having more locations in our dataset, the number of check-ins for the entertainment and shopping categories exceeded the residential check-ins. Another interesting pattern that we uncovered was that the density extends to suburban areas, mainly because of educational institutions and residential areas, an important fact that has not yet been discussed.
Data from LBSNs can play a strategic role in both the development and improvement of various aspects of mega cities’ “smartness”. The possibility to analyze the activities of urban agents has completely modified the representation of the relationship between activities and spaces. This can assist in urban planning by providing the tools to attain objectives of sustainability and make mega cities livable and more efficient. For example, the factors that increasingly influence cities (venue categories), if not planned well, can affect both the objectives of sustainability and development. The study of user activities essentially requires the availability of big data and information; therefore, the use of LBSNs to collect data from people residing and moving inside a mega city could be beneficial to planning the distribution of different types of venues throughout the city. In this framework, information about the various activities of the city’s users and residents can describe the events occurring in the physical space. The current study attempts to address this using data from Weibo to better explore the activities of urban populations in Shanghai. Further study is required to explore the behavior of the population described here, with a more specific definition to strengthen the relation between the baselines of this study and the effect of various urban functions, such as restaurants, transport, and educational institutions. The results could provide insights into the linkage between urban entropy and urban magnets (types of venues attracting more people) in the city, thereby identifying the areas and aspects that need special attention and a well-planned distribution in the management of the city. These results are based on a dataset containing a minimum of 100 Weibo check-ins for a single venue, which is why the results concentrate on specific areas within the city.
The analysis could be improved by using more micro-level data from different LBSNs. Similarly, the categories could be extended by covering more venues and specialized distributions. Another dimension of a future study might be the use of diverse datasets and extending the category classes to obtain more specific and accurate patterns. In this regard, we are working on analyzing user behavior in the “Food” and “Educational” categories, which were classified in this study.

Author Contributions

Naimat Ullah Khan and Wanggen Wan conceived the research; Naimat Ullah Khan designed the research, performed the simulations, and wrote the article; Wanggen Wan proofread the article; Wanggen Wan and Shui Yu supervised the research. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (61711530245) and a project of the Shanghai Science and Technology Commission (18510760300).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Singh, R.; Zhang, Y.; Wang, H. Exploring Human Mobility Patterns in Melbourne Using Social Media Data. In Databases Theory and Applications, Proceedings of the Australasian Database Conference, Gold Coast, Australia, 24–27 May 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 328–335.
  2. Alrumayyan, N.; Bawazeer, S.; AlJurayyad, R.; Al-Razgan, M. Analyzing User Behaviors: A Study of Tips in Foursquare. In Proceedings of the 5th International Symposium on Data Mining Applications; Springer: Berlin/Heidelberg, Germany, 2018; pp. 153–168.
  3. Todd, A.W.; Campbell, A.L.; Meyer, G.G.; Horner, R.H. The Effects of a Targeted Intervention to Reduce Problem Behaviors: Elementary School Implementation of Check in—Check out. J. Posit. Behav. Interv. 2008, 10, 46–55.
  4. Zhen, F.; Cao, Y.; Qin, X.; Wang, B. Delineation of an Urban Agglomeration Boundary Based on Sina Weibo Microblog ‘Check-in’ Data: A Case Study of the Yangtze River Delta. Cities 2017, 60, 180–191.
  5. Facebook. Available online: https://www.facebook.com/ (accessed on 1 April 2019).
  6. Twitter. Available online: https://twitter.com/ (accessed on 1 June 2019).
  7. Foursquare. Available online: https://foursquare.com/ (accessed on 1 May 2019).
  8. Weibo. Available online: http://www.weibo.com (accessed on 11 April 2019).
  9. Hollenstein, L.; Purves, R. Exploring Place Through User-Generated Content: Using Flickr Tags to Describe City Cores. J. Spat. Inf. Sci. 2010, 2010, 21–48.
  10. Wang, B.; Zhen, F.; Wei, Z.; Guo, S.; Chen, T. A Theoretical Framework and Methodology for Urban Activity Spatial Structure in E-society: Empirical Evidence for Nanjing City, China. Chin. Geogr. Sci. 2015, 25, 672–683.
  11. Bo, W.; Feng, Z.; Zongcai, W. The Research on Characteristics of Urban Activity Space in Nanjing: An Empirical Analysis Based on Big Data. Hum. Geogr. 2014, 29, 14–21.
  12. Loo, B.P.; Yao, S.; Wu, J. Spatial Point Analysis of Road Crashes in Shanghai: A GIS-Based Network Kernel Density Method. In Proceedings of the 19th International Conference on Geoinformatics, Shanghai, China, 24–26 June 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 1–6.
  13. Long, Y.; Han, H.; Tu, Y.; Shu, X. Evaluating the Effectiveness of Urban Growth Boundaries Using Human Mobility and Activity Records. Cities 2015, 46, 76–84.
  14. Gu, Z.; Zhang, Y.; Chen, Y.; Chang, X. Analysis of Attraction Features of Tourism Destinations in a Mega-City Based on Check-in Data Mining—A Case Study of ShenZhen, China. ISPRS Int. J. Geo-Inf. 2016, 5, 210.
  15. Lei, C.; Zhang, A.; Qi, Q.; Su, H.; Wang, J. Spatial-Temporal Analysis of Human Dynamics on Urban Land Use Patterns Using Social Media Data by Gender. ISPRS Int. J. Geo-Inf. 2018, 7, 358.
  16. Preoţiuc-Pietro, D.; Cohn, T. Mining User Behaviours: A Study of Check-in Patterns in Location Based Social Networks. In Proceedings of the 5th Annual ACM Web Science Conference, Paris, France, 2–4 May 2013; ACM: New York, NY, USA, 2013; pp. 306–315.
  17. Khan, S. Spatial and Temporal Analysis of Water Quality of Ganges System Upto Varanasi by Using GIS and SPSS Tool. Ph.D. Thesis, GB Pant University of Agriculture and Technology, Uttarakhand, India, 2018.
  18. Geenens, G.; Wang, C. Local-Likelihood Transformation Kernel Density Estimation for Positive Random Variables. J. Comput. Graph. Stat. 2018, 27, 822–835.
  19. OpenStreetMap. Available online: https://www.openstreetmap.org (accessed on 18 April 2019).
  20. Lohr, S. Big Data is Opening Doors, But Maybe Too Many. New York Times, 23 March 2013.
  21. Chatterjee, P. Big Data: The Greater Good or Invasion of Privacy. The Guardian, 12 March 2013; 12.
  22. Ward, J.S.; Barker, A. Undefined by Data: A Survey of Big Data Definitions. arXiv 2013, arXiv:1309.5821.
  23. Dumbill, E. What is Big Data? An Introduction to the Big Data Landscape. Available online: http://radar.oreilly.com//01/what-is-big-data.html2012 (accessed on 11 May 2019).
  24. Mayer-Schönberger, V.; Cukier, K. Big Data: A Revolution That Will Transform How We Live, Work, and Think; Houghton Mifflin Harcourt: Boston, MA, USA, 2013.
  25. Miller, H.J.; Goodchild, M.F. Data-Driven Geography. GeoJournal 2015, 80, 449–461.
  26. Ovadia, S. The Role of Big Data in the Social Sciences. Behav. Soc. Sci. Libr. 2013, 32, 130–134.
  27. Yanwei, C.; Yue, S.; Zuopeng, X.; Yan, Z.; Ying, Z.; Na, T. Review for Space-time Behavior Research: Theory Frontiers and Application in the Future. Prog. Geogr. 2012, 31, 667–675.
  28. Kwan, M.-P.; Lee, J. Geovisualization of Human Activity Patterns Using 3D GIS: A Time-Geographic Approach. Spat. Integr. Soc. Sci. 2004, 27, 721–744.
  29. Polak, J.; Jones, P. The Acquisition of Pre-trip Information: A Stated Preference Approach. Transp. Res. Part C Emerg. Technol. 1993, 20, 179–198.
  30. Che, Q.; Duan, X.; Guo, Y.; Wang, L.; Cao, Y. Urban Spatial Expansion Process, Pattern and Mechanism in Yangtze River Delta. Acta Geogr. Sin. 2011, 66, 446–456.
  31. Bo, W.; Feng, Z.; Hao, Z. The Dynamic Changes of Urban Space-time Activity and Activity Zoning Based on Check-in Data in Sina web. Sci. Geogr. Sin. 2015, 35, 151–160.
  32. Hu, Y.; Gao, S.; Janowicz, K.; Yu, B.; Li, W.; Prasad, S. Extracting and Understanding Urban Areas of Interest Using Geotagged Photos. Comput. Environ. Urban Syst. 2015, 54, 240–254.
  33. Graham, M.; Shelton, T. Geography and The Future of Big Data, Big Data and the Future of Geography. Dialogues Hum. Geogr. 2013, 3, 255–261.
  34. Kubíček, P.; Konečný, M.; Stachoň, Z.; Shen, J.; Herman, L.; Řezník, T.; Staněk, K.; Štampach, R.; Leitgeb, Š. Population distribution modelling at fine spatio-temporal scale based on mobile phone data. Int. J. Digit. Earth 2019, 12, 1319–1340.
  35. Gonzalez, M.C.; Hidalgo, C.A.; Barabasi, A.-L. Understanding Individual Human Mobility Patterns. Nature 2008, 453, 779–782.
  36. Song, C.; Qu, Z.; Blumm, N.; Barabási, A.-L. Limits of Predictability in Human Mobility. Science 2010, 327, 1018–1021.
  37. Zhu, X. GIS and Urban Mining. Resources 2014, 3, 235–247.
  38. Yuan, J.; Zheng, Y.; Xie, X. Discovering Regions of Different Functions in a City using Human Mobility and POIs. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Beijing, China, 22–27 August 2012; ACM: New York, NY, USA, 2012; pp. 186–194.
  39. Wesolowski, A.; Qureshi, T.; Boni, M.F.; Sundsøy, P.R.; Johansson, M.A.; Rasheed, S.B.; Engø-Monsen, K.; Buckee, C.O. Impact of Human Mobility on the Emergence of Dengue Epidemics in Pakistan. Proc. Natl. Acad. Sci. 2015, 112, 11887–11892.
  40. Pappalardo, L.; Simini, F.; Rinzivillo, S.; Pedreschi, D.; Giannotti, F.; Barabási, A.-L. Returners and Explorers Dichotomy in Human Mobility. Nat. Commun. 2015, 6, 8166.
  41. Fan, C.; Liu, Y.; Huang, J.; Rong, Z.; Zhou, T. Correlation Between Social Proximity and Mobility Similarity. Sci. Rep. 2017, 7, 11975.
  42. Cheng, C.; Jain, R.; van den Berg, E. Location Prediction Algorithms for Mobile Wireless Systems. In Wireless Internet Handbook; CRC Press, Inc.: Boca Raton, FL, USA, 2003; pp. 245–263.
  43. Cho, E.; Myers, S.A.; Leskovec, J. Friendship and Mobility: User Movement in Location-Based Social Networks. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, 22–27 August 2011; ACM: New York, NY, USA, 2011; pp. 1082–1090.
  44. Gao, H.; Tang, J.; Liu, H. Exploring Social-Historical Ties on Location-Based Social Networks. In Proceedings of the Sixth International AAAI Conference on Weblogs and Social Media, Dublin, Ireland, 4–8 June 2012.
  45. Lindqvist, J.; Cranshaw, J.; Wiese, J.; Hong, J.; Zimmerman, J. I’m the Mayor of My House: Examining Why People Use Foursquare-A Social-Driven Location Sharing Application. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Vancouver, BC, Canada, 7–12 May 2011; ACM: New York, NY, USA, 2011; pp. 2409–2418.
  46. Noulas, A.; Scellato, S.; Mascolo, C.; Pontil, M. An Empirical Study of Geographic User Activity Patterns in Foursquare. In Proceedings of the Fifth international AAAI Conference on Weblogs and Social Media, Barcelona, Spain, 17–21 July 2011.
  47. Scellato, S.; Noulas, A.; Lambiotte, R.; Mascolo, C. Socio-spatial Properties of Online Location-based Social Networks. In Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, Barcelona, Spain, 17–21 July 2011.
  48. Khan, N.U.; Wan, W.; Yu, S. Spatiotemporal Analysis of Tourists and Residents in Shanghai Based on Location-Based Social Network’s Data from Weibo. ISPRS Int. J. Geo Inf. 2020, 9, 70.
  49. Zhang, J.-D.; Chow, C.-Y. iGSLR: Personalized Geo-Social Location Recommendation: A Kernel Density Estimation Approach. In Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Orlando, FL, USA, 5–8 November 2013; ACM: New York, NY, USA, 2013; pp. 334–343.
  50. Colombo, G.B.; Chorley, M.J.; Williams, M.J.; Allen, S.M.; Whitaker, R.M. You Are Where You Eat: Foursquare Checkins as Indicators of Human Mobility and Behaviour. In Proceedings of the IEEE International Conference on Pervasive Computing and Communications Workshops, Lugano, Switzerland, 19–23 March 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 217–222.
  51. Li, Y.; Steiner, M.; Wang, L.; Zhang, Z.-L.; Bao, J. Exploring Venue Popularity in Foursquare. In Proceedings of the IEEE INFOCOM, Turin, Italy, 14–19 April 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 3357–3362.
  52. Graham, M.W.; Avery, E.J.; Park, S. The Role of Social Media in Local Government Crisis Communications. Public Relat. Rev. 2015, 41, 386–394.
  53. Sutton, J.; Palen, L.; Shklovski, I. Backchannels on the Front Lines: Emergent Uses of Social Media in the 2007 California Wildfires. In Proceedings of the Conference on Information Systems for Crisis Response and Management (ISCRAM), Washington, DC, USA, 4–7 May 2008.
  54. Preston, J.; Stelter, B. How Government Officials are Using Twitter for Hurricane Sandy. The New York Times, 2 November 2012.
  55. Lin, S.; Xie, R.; Xie, Q.; Zhao, H.; Chen, Y. Understanding User Activity Patterns of The Swarm APP: A Data-Driven Study. In Proceedings of the ACM International Joint Conference on Pervasive and Ubiquitous Computing and ACM International Symposium on Wearable Computers, Maui, Hawaii, 12–16 September 2017; ACM: New York, NY, USA, 2017; pp. 125–128.
  56. Shi, B.; Zhao, J.; Chen, P.-J. Exploring Urban Tourism Crowding in Shanghai via Crowdsourcing Geospatial Data. Curr. Issues Tour. 2017, 20, 1186–1209.
  57. Rizwan, M.; Wan, W.; Cervantes, O.; Gwiazdzinski, L. Using Location-based Social Media Data to Observe Check-in Behavior and Gender Difference: Bringing Weibo Data into Play. ISPRS Int. J. Geo Inf. 2018, 7, 196.
  58. Press Releases. Available online: http://ir.sina.com/ (accessed on 11 May 2019).
  59. Liu, C.Y.; Chen, J.; Li, H. Linking Migrant Enclave Residence to Employment in Urban China: The Case of Shanghai. J. Urban Aff. 2019, 41, 189–205.
  60. Xiao, Y.; Wang, D.; Fang, J. Exploring the Disparities in Park Access Through Mobile Phone Data: Evidence from Shanghai, China. Landsc. Urban Plan. 2019, 181, 80–91.
  61. Zhang, Y.; Li, X.; Wang, A.; Bao, T.; Tian, S. Density and Diversity of OpenStreetMap Road Networks in China. J. Urban Manag. 2015, 4, 135–146.
  62. Lichman, M.; Smyth, P. Modeling Human Location Data with Mixtures of Kernel Densities. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; ACM: New York, NY, USA, 2014; pp. 35–44.
  63. Zhang, J.; Wu, L. The Influence of Population Movements on the Urban Relative Humidity of Beijing During the Chinese Spring Festival Holiday. J. Clean. Prod. 2018, 170, 1508–1513.
  64. Hasan, S.; Zhan, X.; Ukkusuri, S.V. Understanding Urban Human Activity and Mobility Patterns Using Large-Scale Location-Based Data from Online Social Media. In Proceedings of the 2nd ACM SIGKDD International Workshop on Urban Computing, Chicago, IL, USA, 22–27 August 2013; ACM: New York, NY, USA, 2013; p. 6.
  65. Huang, H.; Gartner, G. Current Trends and Challenges in Location-Based Services. ISPRS Int. J. Geo-Inf. 2018, 7, 199.
  66. Rizwan, M.; Wan, W. Big Data Analysis to Observe Check-in Behavior Using Location-Based Social Media Data. Information 2018, 9, 257.
  67. Netek, R.; Pour, T.; Slezakova, R.J.O.G. Implementation of Heat Maps in Geographical Information System–Exploratory Study on Traffic Accident Data. Open Geosci. 2018, 10, 367–384.
  68. Longley, P.A.; Adnan, M. Geo-temporal Twitter Demographics. Int. J. Geogr. Inf. Sci. 2016, 30, 369–389.
Figure 1. Study area.
Figure 2. Research methodology.
Figure 3. Check-in frequencies for 24 h.
Figure 4. Check-in frequencies for days of the week.
Figure 5. Check-in frequencies for 180 days.
Figure 6. Location of venues in different categories in Shanghai.
Figure 7. Density of the total check-in data in Shanghai.
Figure 8. Comparison of weekly density. (a) First week of April. (b) Second week of April. (c) Last week of January. (d) First week of February.
Table 1. Dataset sample.
| User ID | Date | Time | Gender | Location ID | Longitude | Latitude | Location |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 18…73 | 6/30/2017 | 22:06:16 | m | B2…93 | 121.323346 | 31.258411 | HSBC_Court |
| 58…96 | 6/30/2017 | 11:11:58 | f | B2…93 | 121.395081 | 31.31339561 | Shanghai_University |
| 31…23 | 6/30/2017 | 18:00:59 | m | B2…9B | 121.58817 | 31.310072 | Golf_Training_Center |
| 51…16 | 6/30/2017 | 19:13:45 | f | B2…9B | 121.345023 | 31.283799 | Baili_Life_Plaza |
Table 2. Check-in venue categories.
| Category | Check-In Venue Examples |
| --- | --- |
| Entertainment | Concert_Hall_Daning_Theatre; Cinema_Shanghai_Paragon_Studios; Jumbo_KTV_(Guoding_Road); Attractions_Disney_Town_(Disneyland) |
| Educational | Shanghai_Information_Technology_School; Shanghai_University_Baoshan_Campus; Tongji_University_South_Campus; Cao_Yang_Second_Middle_School |
| Food | Tang_Lian_Hot_Spring_Restaurant; Starbucks_(Gubei_Store); Fast_Food_McDonald’s_(Hongqiao_Railway_Station); Cafe_COSTA_COFFEE_(Shanghai_Hongqiao_Hub_2) |
| General Location | General_Location_Mark_Bund; Shanghai_Nanjing_East_Road; Wanda_square; Changle_Road |
| Hotel | Asia_Pacific_Marriott_Hotel; Three_Star_Yunfeng_Hotel; Five_Star_Shanghai_Fujian_Hotel |
| Professional | HSBC_Court_Buildings; Ping_An_Bank_Headquarters; Hospital_Jinyang_Lot_Hospital; Building_ZTE_Corporation_Building |
| Residential | Residential_area_Caojiang_Apartment; Residential_area_Fuyou_Jiayuan; Residential_area_Yonghe_Sancun; Xinli_Greenland_Apartments |
| Shopping & Services | Shopping_Mall_Baili_Life_Plaza; Mall_Australia_Square; Quyang_Life_Mall; Mall_Lotus_International_Plaza |
| Sports | Comprehensive_Gymnasium_Jiading_Sports_Center; Sports_venue_Discikat_Karting; Playground_Happy_Valley; Lushan_Golf_Club |
| Travel | Railway_Station_Nanxiang_North_Station; Hongqiao_Airport; Yuhong_Road_Subway_Station; Line_2_Xujing_East_Bus_Station |
Table 3. Characteristics of Categories.
| Category | No. of Locations | % | Average No. of Check-Ins | Total No. of Check-Ins | Average No. of Users | Total No. of Users |
| --- | --- | --- | --- | --- | --- | --- |
| Educational | 88 | 12% | 319.76 | 28,139 | 233.35 | 20,535 |
| Entertainment | 136 | 19% | 409.90 | 55,747 | 311.71 | 42,393 |
| Food | 27 | 4% | 296.85 | 8015 | 182.59 | 4930 |
| General Location | 55 | 8% | 247.60 | 13,618 | 199.51 | 10,973 |
| Hotel | 30 | 4% | 334.27 | 10,028 | 231.17 | 6935 |
| Professional | 51 | 7% | 225.82 | 11,517 | 162.16 | 8270 |
| Residential | 172 | 24% | 168.19 | 28,928 | 136.67 | 23,508 |
| Shopping & Services | 90 | 12% | 376.92 | 33,923 | 292.00 | 26,280 |
| Sports | 26 | 4% | 568.12 | 14,771 | 432.92 | 11,256 |
| Travel | 47 | 7% | 379.55 | 17,839 | 251.45 | 11,818 |

Share and Cite

MDPI and ACS Style

Khan, N.U.; Wan, W.; Yu, S. Location-Based Social Network’s Data Analysis and Spatio-Temporal Modeling for the Mega City of Shanghai, China. ISPRS Int. J. Geo-Inf. 2020, 9, 76. https://doi.org/10.3390/ijgi9020076
