Categorization of Green Spaces for a Sustainable Environment and Smart City Architecture by Utilizing Big Data

School of Communication & Information Engineering, Shanghai University, Shanghai 200444, China
School of Information Engineering, Huangshan University, Huangshan 245041, China
Institute of Smart City, Shanghai University, Shanghai 200444, China
Raptor Interactive (Pty) Ltd., Eco Boulevard, Witch Hazel Ave, Centurion 0157, South Africa
Author to whom correspondence should be addressed.
Electronics 2020, 9(6), 1028;
Received: 16 May 2020 / Revised: 17 June 2020 / Accepted: 18 June 2020 / Published: 22 June 2020
(This article belongs to the Special Issue Transforming Future Cities: Smart City)


Urban green spaces promote outdoor activities and social interaction, which make a significant contribution to the health and well-being of residents. This study presents an approach that focuses on the real spatial and temporal behavior of park visitors in different categories of green parks. We used the large dataset available from the Chinese micro-blog Sina Weibo (often simply referred to as “Weibo”) to analyze data samples, in order to describe the behavioral patterns of millions of people with access to green spaces. We select Shanghai as a case study because urban residential segregation has already taken place, which was expected to be followed by concerns of environmental sustainability. In this research, we utilized social media check-in data to measure and compare the number of visitations to different kinds of green parks. Furthermore, we divided the green spaces into different categories according to their characteristics, and our main findings were: (1) the most popular category based upon the check-in data; (2) changes in the number of visitors according to the time of day; (3) seasonal impacts on behavior in public in relation to the different categories of parks; and (4) gender-based differences. To the best of our knowledge, this is the first study carried out in Shanghai utilizing Weibo data to focus upon the categorization of green space. It is also the first to offer recommendations for planners regarding the type of facilities they should provide to residents in green spaces, and regarding the sustainability of urban environments and smart city architecture.
Keywords: spatiotemporal analysis; smart city; environment; KDE; green parks; quality of life

1. Introduction

Urban green spaces, parks in particular, offer numerous advantages for people’s well-being, such as enhancing their psychological and social well-being, as well as their physical health [1,2,3]. Visits to green parks provide the opportunity to directly experience the extra benefits of “natural” environments, especially for people with limited contact with nature. Consequently, quantifying the visits of residents to urban green spaces, and recognizing the factors that influence their visits, are vital for urban park management and planning. It is important to measure visitation to parks and urban green spaces to recognize people’s recreational interests and to define the factors that influence them. Visitor polls, on-site counters and direct observation are the traditional methods for estimating the number of visitors [4,5]. These methods of organized observation usually pick a representative sample of urban green spaces and collect information regarding the use of green parks, for example, visitation numbers and features, as well as the activities and behavior of green park visitors. However, these approaches are typically time-consuming and site-specific, and therefore have restricted spatial coverage [6]. The advent of free social media data offers new and different methods to analyze urban green park visits. Data collected from social media, and other “big data” as reliable sources, grow in size every year, and can be utilized to investigate how the public interact with actual environments, and to accurately assess individual preferences through time and space [7].
Green areas or parks are obviously the best way to encourage people to take part in physical exercise. In this research, we analyzed visitors’ spatiotemporal behavior in green parks, taking into account 122 green parks with nearly 250,000 visitor check-ins, using the location-based social network (LBSN) service named Sina Weibo, also known as Weibo [8]. Weibo was created in August 2009 and is one of China’s largest social media sites, as well as a popular microblogging platform that allows people to post and share their daily activities with their circle of friends. The term “check-in” refers to a user willingly sharing their location on the LBSN, by publicizing it on a location-sensing smartphone or tablet while participating in certain activities [9].
In this research, we examined the effect of visits to green spaces on users’ check-in behaviors. We utilized a sample of data to research green park-related check-in behaviors, including likes and dislikes, as a summary of the behavior or preferences of the general public in terms of various activities while using park resources. We also implemented a linear regression model that includes variables with a mild correlation with the dependent variable. The aim of the study was to respond directly to the following questions: (1) What type of parks do people choose to visit, and what are the characteristics of each category of park? (2) How does the time of day influence the behavior of the general public in relation to green parks? (3) How does the season influence the behavior of the general public in terms of green park visits? (4) Does gender have an impact on visits to green parks? To properly grasp the link between green parks and visitor behavior, we analyzed users’ spatiotemporal trends in reference to urban green spaces by using a large-scale dataset that covered the period from July 2014 to June 2017. The research was conducted within Shanghai city in south-east China. Our main results indicate the distribution of users’ check-ins in different green parks, and notable behavioral changes based on the time of day (i.e., a 24-h period) and the day of the week. We also investigated the seasonal impact on the behavior of the public in relation to green spaces to show real-life patterns. By examining these factors, we were able to analyze the use of urban green parks and its variance across a range of temporal scales. This method of capturing visitor activity in urban green spaces is an effort to address the drawbacks of previous studies, and it paves the way for socio-ecological research using crowd-sourced and social network data, instead of relying on the results of subjective coverage and observational data.

2. Literature Review

Urban areas are generally marked by downgraded ecosystems, rising environmental damage, higher temperature and decreasing urban green parks. Urban green spaces (UGS) are important for improving the quality of life in metropolitan areas, for balancing thermal comfort and providing an appropriate thermal budget [10], and for helping the public heal from the physical and emotional pressures of their daily lives [11]. Green spaces in urban environments are vital for improving living conditions in urban areas by improving the quality of the air and the area’s esthetics, which eventually leads to higher real estate values and a reduction in energy use for cooling. UGS can also be applied as a cumulative resource for the development of sustainable urban areas [12]. UGS also provides children with much-needed space to play, which is significant for their social, cognitive and mental functioning [13].
Urbanization has become one of the worldwide agendas for development. Based upon the United Nations’ Sustainable Development Goal 11 (SDG 11)—which seeks to expand human settlements and urban communities and to ensure that they are safeguarded, strong and feasible—the United Nations expanded the sustainable development goal (SDG) Agenda by implementing the New Urban Agenda in 2016 [14]. Urban sustainable development and social justice rely on different requirements, including equitable spatial distribution, planning, environmental facilities, urban strategic planning, quality of green space, and socio-economic facilities. The social benefits issued by UGS to urban residents are crucial for maintaining and increasing the personal choices of urban residents [15]. Public urban green spaces (PUGS) are important for the mitigation of high summer surface temperatures [16], and are also essential for the elimination of pollution and for decreasing noise levels [17]. The temperature difference among urban and green areas is high in summer and low in winter. Moreover, in summer, the difference is greater during the day than during the night, while in winter, the opposite is true.
In recent years, several analytical efforts have been made to use social media data for the fields of application and urban planning, the purposes of which are dynamic, and range from basic tasks to quite complex analysis, for example, urban form and feature verification. In reality, Facebook, Weibo, Twitter, and the other big data networks are utilized to assess the mobility and behavior of people, with calculations ranging from the inter-urban to the global scales [18]. However, it would be extremely difficult to determine the finer spatial and temporal stages of these two phenomena by utilizing traditional methods, such as questionnaires or on-site observations. Furthermore, LBSN data can be utilized for socio-spatial analysis [19] using techniques such as the absorption of user’s tweet components, or by exploring their feelings and disparities over space and time [20], also taking into consideration health factors such as physical exercise or diet. Campagna [21] implemented the idea of “Social Media Geographic Information” as a means to investigate people’s insights, opinions and interests in space and time in order to promote spatial planning and geo-design by using space–time analysis. By using various sources of data (except for survey data), such as cellular data, it is feasible to better define the spatiotemporal elements of urban environments [22]. Meanwhile, other authors have worked on smart city transportation, technologies and applications [23,24,25].
The kernel density estimation (KDE) method has been used to spatially model geolocation data, and it offers a particularly convenient framework for the spatial estimation of density. KDE has also been used for the assessment of environmental attributes, for example, healthcare properties [26], the diet environment [27] and green space access. It improves on rate- or distance-based measures of access by transforming point data into a continuous surface, thus allowing the density of the characteristic to be measured at any point on the map’s surface [28]. The researchers in [29] investigated the relationships among location, walking speed and adequate levels of physical activity by utilizing KDE. KDE was also utilized to research sites such as food stores in connection with factors such as overweight, body mass index (BMI) and dietary intake [30]. In [31], the authors used the KDE approach to analyze the relative importance of external factors correlated with the temporal and spatial distribution of users in urban green spaces. The identification and analysis of accidents or catastrophes, and their effect on daily urban development procedures, offers another interesting potential use of social media network data [32]. The authors of [33] compared KDE and point density in their research.

3. Materials

3.1. Study Area

Shanghai lies on the alluvial plain of the Yangtze River Delta, with an average elevation of around 4 m. From east to west, there is a slight terrain gradient. The land is flat, except for some south-western foothills. Shanghai belongs to the northern subtropical humid monsoon climate zone. There are four seasons, with abundant sunshine and rainfall. In such a metropolitan area, extreme events and urbanization have had a substantial influence on public health promotion services and the country’s economy. Shanghai is the world’s 10th, and most prominent, agglomeration region, and its process of urbanization is relatively quick. In the United Nations’ (UN) future forecasts of urbanization, it was stated that Shanghai’s urban population ranks as the world’s second largest, and China’s first [34].
In 2016, Shanghai was divided into 16 regions: 15 districts (i.e., Baoshan, Changning, Fengxian, Hongkou, Huangpu, Jiading, Jing’an, Jinshan, Minhang, Pudong New District, Putuo, Qingpu, Songjiang, Xuhui and Yangpu) and one county (i.e., Chongming) [35]. Seven regions (i.e., Changning, Hongkou, Huangpu, Jing’an, Putuo, Xuhui and Yangpu) are located in Puxi (literally, Huangpu West). These seven areas are known as Shanghai’s downtown or city center [36]. The study area in this research includes 10 districts (i.e., Baoshan, Changning, Hongkou, Huangpu, Jing’an, Minhang, Pudong New District, Putuo, Xuhui and Yangpu), as shown in Figure 1. The locations of the green parks can also be seen within the study area.

3.2. Dataset

The dataset that we used to examine the number of visits to green parks came from the popular Chinese micro-blog Weibo. Weibo is often regarded as China’s counterpart to Twitter, and it is the major microblogging platform in China. Weibo has a massive number of users, representing the biggest dataset of obtainable geotags. According to the latest Weibo annual report, more than 500 million active users were registered on the platform in 2018, and it reached 462 million monthly active users in December 2018 [37]. The demographics of Weibo users do not fully match those of the total population. The public interface of the Weibo Location Based Service (LBS) was launched on 28 May 2012, and Weibo users have since been able to share their location on the internet in real time. As a scalable and accessible large-scale crowdsourced dataset, Weibo check-in data were the most appropriate that we could obtain to estimate actual park visitors. Moreover, an earlier study of 87 city parks situated in Shanghai, China showed that there is an important link between data from the social media platform Jiepang and official visitor numbers [38].
In addition, earlier research has already revealed that Weibo data offer a perfect representation of the interests and behaviors of the individuals in urban areas [39]. Using the check-in data of Weibo [40], a new model was proposed, which integrates the parameters of urban environmental function networks, thereby enriching the definition of the structure of urban networks. However, while using Weibo check-ins as a proxy for visits is still uncommon, earlier research utilizing data from similar social media sites or platforms (for example, Facebook, Flickr and Instagram) found important beneficial ties amongst official visit statistics and the number of visitors reported on these sites [6,41]. Figure 2 represents the criteria that we deployed for collecting the check-in data.
The Weibo application programming interface (API) was used for data collection. The dataset was obtained from Weibo for a time frame of 3 years, from July 2014 to June 2017. After retrieving the location information, we found that some of the retrieved locations were not green parks (e.g., sidewalks, former homes of famous people, and sculptures). We checked these locations one by one and removed those that were not in the green park category, but we still counted the parking lots or park areas linked to green parks. Some larger parks have multiple location IDs (for example, separate IDs for the garden area, kids’ play areas and barbecue areas); these were all combined into one location ID [42]. After pre-processing, filtration and cleaning, a total of 250,632 geo-tagged visits to green parks in 10 districts of Shanghai were included. The data were collected with the help of the Python programming language (version 2.7.12), and were filtered to exclude invalid records and fake users, according to the following criteria:
  • The geographical location of data should exist only in Shanghai;
  • The lowest number of check-ins per green park should be 100 within the time period of the study;
  • Every record should have a geo-location (latitude and longitude), user id, time, gender, day, month and year;
  • Parks that are separated into several geo-locations inside the green spaces were combined into one geo-location.
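The four criteria above can be sketched as a small filtering routine. This is only an illustration under stated assumptions, not the authors' script: the record field names, the `merge_map` argument and the bounding box values are hypothetical (a bounding box is a rough stand-in for the real administrative boundary of Shanghai).

```python
from collections import Counter

# Hypothetical, approximate bounding box around Shanghai (illustrative values only;
# a real pipeline would test against the administrative boundary polygon).
SHANGHAI_BBOX = {"lon_min": 120.85, "lon_max": 122.20, "lat_min": 30.67, "lat_max": 31.88}

# Assumed field names; "time" is taken to carry day, month and year as well.
REQUIRED_FIELDS = ("location_id", "user_id", "lon", "lat", "time", "gender")
MIN_CHECKINS_PER_PARK = 100  # threshold stated in the criteria above


def is_complete(record):
    """True when every required field is present and non-empty."""
    return all(record.get(field) not in (None, "") for field in REQUIRED_FIELDS)


def in_shanghai(record, bbox=SHANGHAI_BBOX):
    """True when the check-in coordinates fall inside the bounding box."""
    return (bbox["lon_min"] <= record["lon"] <= bbox["lon_max"]
            and bbox["lat_min"] <= record["lat"] <= bbox["lat_max"])


def filter_checkins(records, merge_map=None, min_per_park=MIN_CHECKINS_PER_PARK):
    """Apply the criteria: completeness, location, merged park IDs, minimum count."""
    merge_map = merge_map or {}
    kept = []
    for record in records:
        if not is_complete(record) or not in_shanghai(record):
            continue
        record = dict(record)  # copy so the caller's data is not modified
        # Sub-locations of one park (e.g., a play area) map to a single park ID.
        record["location_id"] = merge_map.get(record["location_id"], record["location_id"])
        kept.append(record)
    counts = Counter(record["location_id"] for record in kept)
    return [record for record in kept if counts[record["location_id"]] >= min_per_park]
```

Merging sub-location IDs before counting matters here: a park whose check-ins are split over several IDs could otherwise fall below the threshold even though the park as a whole exceeds it.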

3.3. Park Type Classification

The green parks in Shanghai were divided into six commonly used categories: (1) community parks (n = 30), (2) cultural relic parks (n = 8), (3) large urban parks (n = 12), (4) natural parks (n = 9), (5) neighborhood parks (n = 46), and (6) recreational parks (n = 17) (Table 1). These parks were classified on the basis of their dominant functions and different kinds of administration [43,44]. Figure 1 reveals the distribution of the different types of urban green parks throughout the study area. Amongst the 122 green parks in our study area, the greatest proportion belonged to neighborhood parks, followed by community parks.

4. Methodology

4.1. Data Preparation

The data that we collected cover all check-ins from July 2014 to June 2017 made within Shanghai’s boundary. The data downloaded were included in many JSON files. Figure 3 represents the data preparation process.
In this study, the Weibo dataset included information such as a unique user ID and check-in date and time. In addition, information about the geographic location (latitude and longitude) and gender was collected through the Weibo API. We therefore assume that daily trends in the LBSN dataset are evidence of users’ daily activities, behaviors on social media, and spatiotemporal patterns [45]. A typical Weibo “check-in” is represented as: check-in (B2094554D064ABF44293) = {1758115961, ####, B2094554D064ABF44293, Mon July 25 14:47:41 +0800 2016, m, 121.484566, 31.270601}, where B2094554D064ABF44293 denotes “location_id,” 1758115961 denotes “user_id,” #### denotes the “user_name,” Mon July 25 14:47:41 +0800 2016 denotes “day, month, date, time, time zone and year,” m denotes “gender,” and 121.484566, 31.270601 denotes the geo-location (longitude and latitude). JSON (JavaScript Object Notation) is a lightweight, text-based data interchange format; it is language-independent, and open-source reader and writer modules are available for most programming languages. Using the selected software [46], the data were converted into a CSV (comma-separated values) file format, so that all user data, which include geo-locations, could be identified and stored in the database regardless of the publication date. Table 2 shows an example of a “check-in” in CSV format.
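The conversion from a JSON check-in to a CSV row can be sketched as follows. This is a minimal illustration and not the software cited in [46]: the JSON key names (`created_at`, `lon`, `lat`, and so on) and the output column names are assumptions, and the user name is deliberately left out of the output.

```python
import csv
import io
import json
from datetime import datetime

# Assumed output columns (hypothetical names, not taken from Table 2).
CSV_COLUMNS = ["location_id", "user_id", "gender", "lon", "lat",
               "year", "month", "day", "weekday", "hour"]
WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
# Accept both full ("July") and abbreviated ("Jul") month names.
TIME_FORMATS = ("%a %B %d %H:%M:%S %z %Y", "%a %b %d %H:%M:%S %z %Y")


def parse_time(text):
    """Parse a time stamp such as 'Mon July 25 14:47:41 +0800 2016'."""
    for fmt in TIME_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized time stamp: {text!r}")


def checkin_to_row(raw_json):
    """Turn one JSON check-in (assumed keys) into a flat dict for CSV output."""
    rec = json.loads(raw_json)
    ts = parse_time(rec["created_at"])
    return {
        "location_id": rec["location_id"],
        "user_id": rec["user_id"],
        "gender": rec["gender"],
        "lon": float(rec["lon"]),
        "lat": float(rec["lat"]),
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "weekday": WEEKDAYS[ts.weekday()],
        "hour": ts.hour,
    }


def rows_to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CSV_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Splitting the time stamp into year, month, day, weekday and hour at this stage keeps the later daily, weekly and seasonal groupings simple.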
In view of the heterogeneity problem, only green parks that include more than 100 check-ins were chosen to establish the user sample, in order to confirm a fairly high level of representativeness.

4.2. Social Media Data Analytics

In this research, we analyzed Weibo-based geo-location datasets in 10 districts of Shanghai, China (July 2014 to June 2017). Figure 4 shows a check-in behavior analysis framework in which the LBSN data analysis method includes the framework, the data pre-processing and cleaning, the temporal and spatial analysis of the LBSN data, and the statistics that indicate the worth of the LBSN data.
Across the three-year period, the number of visitors to the urban green spaces was obtained as statistical data. A spatiotemporal assessment, utilizing statistical graphs and tables, was performed based on the results, and the distribution of check-ins was examined using the KDE method.

4.3. Temporal Analysis

To track variations in user behavior, we divided the check-in time stamps into different time classifications: daily, weekly and seasonal. The daily pattern shows the hourly distribution of check-ins during the day, and the weekly pattern shows the distribution of check-ins across the days of the week. Seasonal trends and climate factors were also taken into account, as they may influence the use of green parks. Winter trends could provide significant data regarding park usage throughout the colder months. We did not define seasons by exact dates in our research; instead, we used simple categories based upon months: March–May is spring; June–August is summer; September–November is autumn; and December–February is winter.
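The three temporal classifications can be sketched in a few lines. The month-to-season mapping follows the definition given above; the function names and the use of Python `datetime` objects as input are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def season_of(month):
    """Month-based season: Mar-May spring, Jun-Aug summer, Sep-Nov autumn, Dec-Feb winter."""
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    if month in (9, 10, 11):
        return "autumn"
    if month in (12, 1, 2):
        return "winter"
    raise ValueError(f"invalid month: {month}")


def temporal_profile(timestamps):
    """Count check-ins per hour of day, per day of week and per season."""
    hourly, weekly, seasonal = Counter(), Counter(), Counter()
    for ts in timestamps:
        hourly[ts.hour] += 1
        weekly[WEEKDAYS[ts.weekday()]] += 1
        seasonal[season_of(ts.month)] += 1
    return hourly, weekly, seasonal
```

Because winter spans December to February, it combines months from two calendar years; grouping by month rather than by date range avoids having to handle that boundary explicitly.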
SPSS v25 was used for statistical analysis. SPSS is a frequently used program in social science for performing statistical analysis to solve various study problems, and it is used by industry and health researchers, survey agencies, government, education specialists, data mining companies, marketing agencies and more. It provides various methods, including hypothesis testing and reporting, ad-hoc analysis and data management, to facilitate analysis. We used Tableau 2019.2 for its visualization techniques to explore and analyze relational databases and data cubes.

4.4. Statistical Analysis

To assess the significance of the explanatory variables, it was necessary to statistically explore the predictors (i.e., explanatory variables) and their effect on the response variable (i.e., the number of check-ins). For this model, we used the following regression equation:
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \cdots + \beta_k x_k + \epsilon$$
Table 3 displays the parameters and the explanatory variables used in our regression model.
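$$
\begin{aligned}
Y ={} & \beta_0 + \beta_1\,\text{Baoshan} + \beta_2\,\text{Changning} + \beta_3\,\text{Huangpu} + \beta_4\,\text{Jingan} + \beta_5\,\text{Yangpu} \\
& + \beta_6\,\text{Mon} + \beta_7\,\text{Tue} + \beta_8\,\text{Wed} + \beta_9\,\text{Fri} + \beta_{10}\,\text{Sat} \\
& + \beta_{11}\,\text{July} + \beta_{12}\,\text{Aug} + \beta_{13}\,\text{Feb} + \beta_{14}\,\text{Mar} + \beta_{15}\,\text{Apr} + \beta_{16}\,\text{May} + \beta_{17}\,\text{Jun} + \epsilon
\end{aligned}
$$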
Afterward, by applying the linear regression model, our fitted value equation converted to:
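$$
\begin{aligned}
\hat{y} ={} & b_0 + b_1\,\text{Baoshan} + b_2\,\text{Changning} + b_3\,\text{Huangpu} + b_4\,\text{Jingan} + b_5\,\text{Yangpu} \\
& + b_6\,\text{Mon} + b_7\,\text{Tue} + b_8\,\text{Wed} + b_9\,\text{Fri} + b_{10}\,\text{Sat} \\
& + b_{11}\,\text{July} + b_{12}\,\text{Aug} + b_{13}\,\text{Feb} + b_{14}\,\text{Mar} + b_{15}\,\text{Apr} + b_{16}\,\text{May} + b_{17}\,\text{Jun}
\end{aligned}
$$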
We implemented the linear regression model by including variables that have a mild correlation with the dependent variable; in our scenario, the dependent variable was the number of check-ins, and the correlation values ranged from 0.10 to 0.50. As for the model’s inference, the p-value of the model’s F-statistic showed that the model itself is significant. It must be remembered that not all predictors have a significant p-value, since the model was built to achieve the highest adjusted R² (Table 4). Table 3 lists the model coefficients, one of which indicates that the number of check-ins increased by approximately 2.73% on average for each unit increase, with a very small p-value. Likewise, for each unit increase in Huangpu, Jingan and Yangpu, the average number of check-ins increased, with very small p-values, by approximately 1.84%, 1.38% and 1.66%, respectively. The ANOVA results are presented in Table 5.
The significance of the independent variables as predictors was assessed from their p-values [47], as reported in Table 4. For the statistical analysis, we used the statistical programming language R [48] and the program RStudio [49] to perform basic descriptive and regression analysis.
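A regression on indicator (dummy) variables of this kind can be sketched with ordinary least squares. The study reports using R; the Python below is only an illustrative stand-in that solves the normal equations directly. The helper names and the tiny synthetic data set in the usage note are assumptions, not the study's data or results.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [b0, b1, ..., bk] minimizing squared error; an intercept is added."""
    design = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(design[0])
    xtx = [[sum(x[i] * x[j] for x in design) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Fitted value: b0 + b1*x1 + ... + bk*xk (no error term)."""
    return coefs[0] + sum(b * v for b, v in zip(coefs[1:], row))


def encode(day, month):
    """Hypothetical two-dummy encoding: is the day a Saturday, is the month July."""
    return [1 if day == "Sat" else 0, 1 if month == "Jul" else 0]
```

As a usage example on synthetic numbers, fitting `rows = [encode(d, m) for d, m in [("Mon", "Jan"), ("Sat", "Jan"), ("Mon", "Jul"), ("Sat", "Jul")]]` against `y = [10, 15, 13, 18]` recovers an intercept of 10 and dummy coefficients of 5 and 3. Each dummy coefficient is the shift in the expected response relative to the omitted baseline level, which is why one level of each categorical variable must be left out of the model.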

4.5. Spatial Analysis

We utilized the KDE method to create a smooth density surface for check-in hotspots in the geographic area. The KDE method is a non-parametric estimation technique for determining the density of a random sample of data [50]. KDE smooths each data point into a small density bump, and then all of these small bumps are summed to make a final estimation of the density. KDE is widely accepted for analyzing spatial distributions [51,52,53]; it describes the spatial density distribution combined with the distance–decay effect, and reveals hotspots by transforming scattered point data into a continuous density surface [54]. KDE is an evolving spatiotemporal technique that has been used previously [55,56] to inspect several features of social media (including, but not restricted to, LBSN) data analytics, such as users’ online activity and movement trends [57], check-in behavior [58], city boundary descriptions [59,60] and point-of-interest recommendations [61]. It also explores the distribution of destinations in communities, enabling researchers to see where destinations are densely clustered and where they are more sparsely scattered. Ultimately, this method seeks to create a smooth density surface from spatial point events within the geographical space [46]. Other authors have utilized the KDE method for the analysis of spatiotemporal patterns in green parks [62,63].
KDE can effectively calculate the spatial structure of visitor density within an area of study. KDE is a statistical method for estimating a smooth and continuous distribution from a finite set of observations [64]. The data taken into account in our analysis were in the form of geo-tagged check-ins. Let E be the collection of historical check-in data that we used, i.e., E = {e1, …, en}, where ei = <x, y> is the geo-location of check-in i (1 ≤ i ≤ n), made by an individual at some time t. The sum of the kernel functions was scaled so that the resulting smooth surface integrates to one (i.e., has unit volume). This resulted in a bivariate KDE of the following form:
$$f_{KD}(e \mid E, h) = \frac{1}{n} \sum_{i=1}^{n} K_h(e, e_i)$$
$$K_h(e, e_i) = \frac{1}{2\pi h} \exp\left(-\frac{1}{2}\,(e - e_i)^{t}\, h^{-1}\, (e - e_i)\right)$$
where e denotes the location at which the density is evaluated, E is the check-in dataset, and h is the bandwidth. The estimated density fKD depends on the choice of h, which controls how smooth the density surface is around each data point ei.
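The estimator can be sketched directly in code. The version below reads the kernel as an isotropic bivariate Gaussian with scalar bandwidth h, i.e., K_h(e, e_i) = exp(−‖e − e_i‖² / (2h)) / (2πh); that reading, the function names, and the treatment of longitude and latitude as planar coordinates are assumptions made for illustration. The study itself used ArcGIS for this step.

```python
import math


def gaussian_kernel(e, e_i, h):
    """Isotropic bivariate Gaussian kernel with scalar bandwidth h (assumed form)."""
    dx = e[0] - e_i[0]
    dy = e[1] - e_i[1]
    return math.exp(-0.5 * (dx * dx + dy * dy) / h) / (2.0 * math.pi * h)


def kde(e, checkins, h):
    """Average of the kernels centred on every check-in, evaluated at location e."""
    if not checkins:
        raise ValueError("at least one check-in is required")
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    return sum(gaussian_kernel(e, e_i, h) for e_i in checkins) / len(checkins)


def density_grid(checkins, h, x_range, y_range, steps):
    """Evaluate the density on a regular steps x steps grid (steps >= 2)."""
    x0, x1 = x_range
    y0, y1 = y_range
    xs = [x0 + (x1 - x0) * i / (steps - 1) for i in range(steps)]
    ys = [y0 + (y1 - y0) * j / (steps - 1) for j in range(steps)]
    return [[kde((x, y), checkins, h) for x in xs] for y in ys]
```

Evaluating the density on a grid is what turns the scattered check-in points into a continuous surface that can be mapped as hotspots. A smaller h produces sharper, more localized peaks, while a larger h merges nearby clusters into broader ones.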
ArcGIS 10.0 (Environmental Systems Research Institute, Inc., Redlands, CA, USA) was used to evaluate the spatial distribution of the check-ins, with a map of Shanghai developed in 2016 using the WGS 1984 geodetic coordinate system. The base map also included the major transportation lines (i.e., the line layer) and the administrative districts (i.e., the polygon layer and the district layers), together with the new OpenStreetMap subway lines and entries (i.e., the point layer).

5. Results

Shanghai city is among the fastest growing metropolises in the world, with a population of 22,125,000 in an area of 4015 km² [65]. The total number of green parks within the city of Shanghai is 366 [66], which encourage the city’s inhabitants to participate in various healthy activities. For this analysis, 122 of these green parks were chosen after processing the data collected from Weibo. The distribution of the different categories of green parks included in this research is shown in Figure 5, where the different colors indicate the different categories of parks.
We utilized KDE to examine the spatial distribution of the check-in data, and ArcGIS for visualization, to investigate the Weibo geo-location check-in data. Figure 6 indicates the overall check-in density in Shanghai between July 2014 and June 2017, where the areas colored in red display a higher human density, a higher level of activity, and a higher percentage of social media usage. It is therefore no surprise that the downtown green parks have large clusters of activity.
Figure 7 presents the overall check-in density in all of the different categories of parks. The parks were divided into six categories and, in answer to the first research question, our results reveal that the most check-ins were found in neighborhood parks and recreational parks, so the density is higher in these categories. One reason for this is that neighborhood parks are more numerous than the other types of parks, and they are located near residential areas.
Figure 8a shows the temporal differences in the number of visitors during a 24-h period. Although visitors accessed the parks during all periods of the day, check-ins peaked between 4:00 p.m. and 6:00 p.m. and again at 10:00 p.m. for all of the parks considered in the research, and this elevated level of activity continues until midnight. These results indicate that Shanghai’s green parks are fast becoming popular recreational destinations for the people of Shanghai. Figure 8b shows the hourly percentage of the total number of check-ins, and it also shows that most people like to visit neighborhood parks, with visits continuing until midnight.
Although a nearly equal number of check-ins were made every weekday, more check-ins were made on Saturdays and Sundays, and this weekly pattern is consistent across all categories of parks. Figure 9a describes the general pattern of weekly check-ins, while Figure 9b reveals the weekly trend in all categories of green parks, further highlighting that the most check-ins were made on weekends across all park types.
We investigated the data based on gender (i.e., male and female) in 10 districts of Shanghai, to examine the check-in rates and behavior. Figure 10a displays the overall check-in pattern among the districts, and Figure 10b reveals the number of check-ins across the different categories of parks. The important thing to note here is that female visitors were more active users of Weibo than male visitors.
A comparison of genders in terms of their check-ins (regarding both frequency and behavior), across week days and districts, was used to investigate the differences among male and female visitors in Shanghai. Table 6 and Table 7 represent the outcomes of this comparison across the days of the week, seasons, districts, and categories of parks.
The seasonal differences in check-ins at green parks were investigated for autumn, winter, spring and summer. In accordance with similar studies, significant seasonal differences in user check-ins were identified [62,67]. In answer to the third research question, the seasonal pattern shows a higher percentage of check-ins in green parks throughout the summer and spring. It is worth noting that check-ins throughout the winter were slightly lower than in autumn. Figure 11b represents the number of check-ins across the different categories of green parks during the different seasons, and shows that neighborhood parks and recreational parks dominate the number of check-ins in all seasons.
Finally, the overall statistics regarding the number of check-ins are shown in Table 8, separated by seasons, districts and daily trends.

6. Conclusions and Recommendations

In this research, we applied big data methodology to revisit the issue of environmental justice linked to the spatial provision of urban green parks in Shanghai. We utilized geo-tagged Weibo check-in data as a park-visit indicator within 10 districts of Shanghai, and investigated the number of check-in visits accordingly. We focused upon the different categories of parks in Shanghai because, to the best of our knowledge, we are the first to address this problem in such a highly crowded metropolitan area. We carried out an in-depth experimental analysis of check-in behavior, using intensity maps and patterns from LBSN data. The results reveal the distribution of users in parks, and show that neighborhood green parks are much more crowded than other green spaces. Analysis of the seasons’ impacts upon people’s behavior toward green spaces in different categories of parks shows that the number of check-ins is much higher in summer and spring, compared to autumn and winter. The number of Weibo check-ins by hour of the day shows that the peak time to visit green parks is from midday to midnight. Lastly, gender-based variations were measured in relation to green park visits, and the findings reveal that female visitors are more involved in their use of social media services when visiting green parks.
Kernel density estimation (KDE) is a technique for estimating a probability density function, and it allows the underlying distribution to be examined more effectively than a conventional histogram. Unlike the histogram, the kernel estimator yields a smooth curve and uses the locations of all the sample points, so the information contained in the sample is better represented and, importantly, multimodality can be revealed. In KDE, events are weighted according to their distances, and two parameters are required. The first is the bandwidth, which controls the distance over which each event contributes; it is a free parameter with a strong effect on the resulting estimate, so bandwidth selection has a great influence on performance. The second is the kernel weighting function K, most often a normal (Gaussian) function. With an appropriately chosen bandwidth, KDE smooths out local noise to a certain degree and provides a non-parametric estimate of the probability distribution.
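The role of the two KDE parameters can be made concrete with a small one-dimensional sketch in Python. It is an illustration under stated assumptions rather than the procedure used in this study: the sample values (hours of the day) are hypothetical, the kernel is Gaussian, and the bandwidth helper uses the normal reference rule of thumb h = 1.06 σ n^(-1/5), which is only one of several possible selection rules.

```python
import math
import statistics


def gaussian_kernel(u):
    """Standard normal density, used as the kernel weighting function K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def rule_of_thumb_bandwidth(samples):
    """Normal reference rule: h = 1.06 * sigma * n^(-1/5)."""
    n = len(samples)
    return 1.06 * statistics.stdev(samples) * n ** (-1 / 5)


def kde(x, samples, bandwidth):
    """Kernel density estimate at x: (1 / (n h)) * sum K((x - x_i) / h)."""
    n = len(samples)
    total = sum(gaussian_kernel((x - xi) / bandwidth) for xi in samples)
    return total / (n * bandwidth)


# Hypothetical check-in hours with a morning and an evening cluster.
hours = [9.0, 9.5, 10.0, 10.5, 18.0, 18.5, 19.0, 19.5, 20.0]

h_auto = rule_of_thumb_bandwidth(hours)

# Ratio of the density in the gap (14:00) to the density at the evening peak
# (19:00): a narrow bandwidth keeps the two modes apart, a wide one blurs them.
ratio_narrow = kde(14.0, hours, 1.0) / kde(19.0, hours, 1.0)
ratio_wide = kde(14.0, hours, 5.0) / kde(19.0, hours, 5.0)
```

Comparing `ratio_narrow` with `ratio_wide` shows why bandwidth selection matters: with the narrow bandwidth the gap between the two clusters stays nearly empty, whereas the wide bandwidth merges them into a single broad hump.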
The wide spatial coverage of this research offers valuable information that can improve urban green space development and planning in other major cities. The results indicate that planners should pay more attention to the importance of small neighborhood green parks in urban green space arrangements. The high check-in intensity recorded for these parks underlines their recreational value. The presence of accessible, well-maintained, small green spaces in the urban park system is also crucial for meeting the recreational needs of local residents. Support services, together with the area and location of a green park, can attract more visitors: this research reveals that urban green parks in the city center or downtown area attract more visitors than parks in other parts of the city. These findings can also inform the planning of green spaces in smart cities by taking visitors’ preferences into account. In addition, similar research should be carried out in other megacities to assess the common issues affecting the utilization of urban green spaces.

7. Limitations and Future Work

The outcomes of the present study show that LBSN data can offer a new perspective, in addition to providing observations of gender differences and check-in intensity. LBSN check-in data have clear benefits, such as high spatial accuracy and minimal cost. However, some constraints are associated with this type of data, such as low sample size and frequency, gender bias and location category bias. LBSN data are therefore more likely to supplement than to replace traditional sources of data. In future work, applying similar techniques to the social media content shared or posted by green park visitors could help to understand visitors’ feelings or sentiments, as well as to evaluate the numerous benefits offered by urban green parks.

Author Contributions

Q.L., H.U. and W.W. conceived the research; H.U. and S.A.H. designed the research; H.U., S.A.H., A.A.M.M., S.S.R., T.Q. and L.H. performed the simulations; H.U., S.A.H. and Q.L. wrote the article; W.W. and Z.P. proofread the article for language editing. All authors read and approved the final manuscript.


Funding

This work was supported by a project of the Shanghai Science and Technology Commission (No: 18510760300), the Anhui Natural Science Foundation (No: 1908085MF178) and the Anhui Excellent Young Talents Support Program Project (No: gxyqZD2019069).

Conflicts of Interest

The authors declare no conflict of interest.


References

  1. Hand, K.L.; Freeman, C.; Seddon, P.J.; Recio, M.R.; Stein, A.; van Heezik, Y. The importance of urban gardens in supporting children’s biophilia. Proc. Natl. Acad. Sci. USA 2017, 114, 274–279.
  2. Richardson, E.A.; Pearce, J.; Mitchell, R.; Kingham, S. Role of physical activity in the relationship between urban green space and health. Public Health 2013, 127, 318–324.
  3. Zhou, W.; Wang, J.; Cadenasso, M.L. Effects of the spatial configuration of trees on urban heat mitigation: A comparative study. Remote Sens. Environ. 2017, 195, 1–12.
  4. Cohen, D.A.; Marsh, T.; Williamson, S.; Derose, K.P.; Martinez, H.; Setodji, C.; McKenzie, T.L. Parks and physical activity: Why are some parks used more than others? Prev. Med. 2010, 50, S9–S12.
  5. Wendel, H.E.W.; Zarger, R.K.; Mihelcic, J.R. Accessibility and usability: Green space preferences, perceptions, and barriers in a rapidly urbanizing city in Latin America. Landsc. Urban Plan. 2012, 107, 272–282.
  6. Tenkanen, H.; Di Minin, E.; Heikinheimo, V.; Hausmann, A.; Herbst, M.; Kajala, L.; Toivonen, T. Instagram, flickr, or twitter: Assessing the usability of social media data for visitor monitoring in protected areas. Sci. Rep. 2017, 7, 17615.
  7. Wood, S.A.; Guerry, A.D.; Silver, J.M.; Lacayo, M. Using social media to quantify nature-based tourism and recreation. Sci. Rep. 2013, 3, 2976.
  8. Weibo. Available online: (accessed on 25 June 2019).
  9. Zhen, F.; Cao, Y.; Qin, X.; Wang, B. Delineation of an urban agglomeration boundary based on Sina Weibo microblog ‘check-in’ data: A case study of the Yangtze River Delta. Cities 2017, 60, 180–191.
  10. Cetin, M. Determining the bioclimatic comfort in Kastamonu City. Environ. Monit. Assess. 2015, 187, 640.
  11. Lee, A.C.; Maheswaran, R. The health benefits of urban green spaces: A review of the evidence. J. Public Health 2011, 33, 212–222.
  12. Haq, S.M.A. Urban green spaces and an integrative approach to sustainable environment. J. Environ. Prot. 2011, 2, 601.
  13. Amoly, E.; Dadvand, P.; Forns, J.; López-Vicente, M.; Basagaña, X.; Julvez, J.; Alvarez-Pedrerol, M.; Nieuwenhuijsen, M.J.; Sunyer, J. Green and blue spaces and behavioral development in Barcelona schoolchildren: The BREATHE project. Environ. Health Perspect. 2014, 122, 1351–1358.
  14. Klaufus, C.; Van Lindert, P.; Van Noorloos, F.; Steel, G. All-inclusiveness versus exclusion: Urban project development in Latin America and Africa. Sustainability 2017, 9, 2038.
  15. Kothencz, G.; Kolcsár, R.; Cabrera-Barona, P.; Szilassi, P. Urban green space perception and its contribution to well-being. Int. J. Environ. Res. Public Health 2017, 14, 766.
  16. Bowler, D.E.; Buyung-Ali, L.M.; Knight, T.M.; Pullin, A.S. A systematic review of evidence for the added benefits to health of exposure to natural environments. BMC Public Health 2010, 10, 456.
  17. Escobedo, F.J.; Kroeger, T.; Wagner, J.E. Urban forests and pollution mitigation: Analyzing ecosystem services and disservices. Environ. Pollut. 2011, 159, 2078–2087.
  18. Hasan, S.; Ukkusuri, S.V. Urban activity pattern classification using topic models from online geo-location data. Transp. Res. Part C Emerg. Technol. 2014, 44, 363–381.
  19. Longley, P.A.; Adnan, M. Geo-temporal twitter demographics. Int. J. Geogr. Inf. Sci. 2016, 30, 369–389.
  20. Frank, M.R.; Mitchell, L.; Dodds, P.S.; Danforth, C.M. Happiness and the patterns of life: A study of geolocated tweets. Sci. Rep. 2013, 3, 2625.
  21. Campagna, M. The geographic turn in social media: Opportunities for spatial planning and geodesign. In Proceedings of the International Conference on Computational Science and Its Applications, Guimarães, Portugal, 30 June–3 July 2014; pp. 598–610.
  22. Sagl, G.; Resch, B.; Hawelka, B.; Beinat, E. From social sensor data to collective human behaviour patterns: Analysing and visualising spatio-temporal dynamics in urban environments. In Proceedings of the GI-Forum, Berlin, Germany, 3–6 July 2012; pp. 54–63.
  23. Ghani, A.; Zubair, M.; Saeed, M.I.; Singh, D. SOS: Socially Omitting Selfishness in IoT for smart and connected communities. arXiv 2020, arXiv:08948.
  24. Nouh, R.M.; Singh, D. Introducing blockchain for smart city technologies and applications. In Blockchain Technology for Smart Cities; Springer: Seoul, Korea, 2020; pp. 1–17.
  25. Agrahari, A.; Singh, D. Smart city transportation technologies: Automatic no-helmet penalizing system. In Blockchain Technology for Smart Cities; Springer: Seoul, Korea, 2020; pp. 115–132.
  26. Smiley, M.J.; Roux, A.V.D.; Brines, S.J.; Brown, D.G.; Evenson, K.R.; Rodriguez, D.A. A spatial analysis of health-related resources in three diverse metropolitan areas. Health Place 2010, 16, 885–892.
  27. Thornton, L.E.; Pearce, J.R.; Macdonald, L.; Lamb, K.E.; Ellaway, A. Does the choice of neighbourhood supermarket access measure influence associations with individual-level fruit and vegetable consumption? A case study from Glasgow. Int. J. Health Geogr. 2012, 11, 29.
  28. Thornton, L.E.; Pearce, J.R.; Kavanagh, A.M. Using Geographic Information Systems (GIS) to assess the role of the built environment in influencing obesity: A glossary. Int. J. Behav. Nutr. Phys. Act. 2011, 8, 71.
  29. King, T.L.; Thornton, L.E.; Bentley, R.J.; Kavanagh, A.M. The use of kernel density estimation to examine associations between neighborhood destination intensity and walking and physical activity. PLoS ONE 2015, 10, e0137402.
  30. Buck, C.; Börnhorst, C.; Pohlabeln, H.; Huybrechts, I.; Pala, V.; Reisch, L.; Pigeot, I. Clustering of unhealthy food around German schools and its influence on dietary behavior in school children: A pilot study. Int. J. Behav. Nutr. Phys. Act. 2013, 10, 65.
  31. Li, F.; Zhang, F.; Li, X.; Wang, P.; Liang, J.; Mei, Y.; Cheng, W.; Qian, Y. Spatiotemporal patterns of the use of urban green spaces and external factors contributing to their use in central Beijing. Int. J. Environ. Res. Public Health 2017, 14, 237.
  32. Kovacs-Gyori, A.; Ristea, A.; Havas, C.; Resch, B.; Cabrera-Barona, P. London2012: Towards citizen-contributed urban planning through sentiment analysis of twitter data. Urban Plan. 2018, 3, 75–99.
  33. Haidery, S.A.; Ullah, H.; Khan, N.U.; Fatima, K.; Rizvi, S.S.; Kwon, S.J. Role of big data in the development of smart city by analyzing the density of residents in shanghai. Electronics 2020, 9, 837.
  34. Department of Economic and Social Affairs. World Urbanization Prospects the 2014 Revision; United Nations: New York, NY, USA, 2014.
  35. Xiong, X.; Jin, C.; Chen, H.; Luo, L. Using the fusion proximal area method and gravity method to identify areas with physician shortages. PLoS ONE 2016, 11, e0163504.
  36. Shen, J.; Kee, G. Development and Planning in Seven Major Coastal Cities in Southern and Eastern China; Springer: Shanghai, China, 2017.
  37. Weibo Statistics. Available online: (accessed on 10 June 2019).
  38. Shen, Y.; Sun, F.; Che, Y. Public green spaces and human wellbeing: Mapping the spatial inequity and mismatching status of public green space in the central city of shanghai. Urban For. Urban Green. 2017, 27, 59–68.
  39. Ebrahimpour, Z.W.; Wanggen, W.; Cervantes, O.; Luo, T.; Ullah, T. Comparison of Main Approaches for Extracting Behavior Features from Crowd Flow Analysis. ISPRS Int. J. Geo-Inf. 2019, 8, 440.
  40. Shen, Y.; Karimi, K. Urban function connectivity: Characterisation of functional urban streets with social media check-in data. Cities 2016, 55, 9–21.
  41. Keeler, B.L.; Wood, S.A.; Polasky, S.; Kling, C.; Filstrup, C.T.; Downing, J.A. Recreational demand for clean water: Evidence from geotagged photographs by visitors to lakes. Front. Ecol. Environ. 2015, 13, 76–81.
  42. Ullah, H.; Wan, W.; Haidery, S.A.; Khan, N.U.; Ebrahimpour, Z.; Muzahid, A.A.M. Spatiotemporal patterns of visitors in urban green parks by mining social media big data based upon WHO reports. IEEE Access 2020, 8, 39197–39211.
  43. Xiaoli, T.; Mingxing, C.; Wenzhong, Z.; Yongping, B. Classification and its relationship with the functional analysis of urban parks: Taking Beijing as an example. Geogr. Res. 2013, 32, 1964–1976.
  44. Zhang, S.; Zhou, W. Recreational visits to urban parks and factors affecting park visits: Evidence from geotagged social media data. Landsc. Urban Plan. 2018, 180, 27–35.
  45. Wang, Y.; Wang, T.; Ye, X.; Zhu, J.; Lee, J. Using social media for emergency response and urban sustainability: A case study of the 2012 Beijing rainstorm. Sustainability 2016, 8, 25.
  46. Xie, Z.; Yan, J. Kernel density estimation of traffic accidents in a network space. Comput. Environ. Urban Syst. 2008, 32, 396–406.
  47. Abidi, S.; Hussain, M.; Xu, Y.; Zhang, W. Prediction of confusion attempting algebra homework in an intelligent tutoring system through machine learning techniques for educational sustainable development. Sustainability 2019, 11, 105.
  48. Language, R. Available online: (accessed on 12 May 2019).
  49. Studio, R. Available online: (accessed on 12 May 2019).
  50. Silverman, B.W. Density Estimation for Statistics and Data Analysis; Routledge: London, UK, 2018.
  51. Maroko, A.R.; Maantay, J.A.; Sohler, N.L.; Grady, K.L.; Arno, P.S. The complexities of measuring access to parks and physical activity sites in New York City: A quantitative and qualitative approach. Int. J. Health Geogr. 2009, 8, 34.
  52. Yu, W.; Ai, T.; Shao, S. The analysis and delimitation of Central Business District using network kernel density estimation. J. Transp. Geogr. 2015, 45, 32–47.
  53. Yu, W.; Ai, T.; He, Y.; Shao, S. Spatial co-location pattern mining of facility points-of-interest improved by network neighborhood and distance decay effects. Int. J. Geogr. Inf. Sci. 2017, 31, 280–296.
  54. Ying, L.; Shen, Z.; Chen, J.; Fang, R.; Chen, X.; Jiang, R. Spatiotemporal patterns of road network and road development priority in three parallel rivers region in Yunnan, China: An evaluation based on modified kernel distance estimate. Chin. Geogr. Sci. 2014, 24, 39–49.
  55. Wu, C.; Ye, X.; Ren, F.; Wan, Y.; Ning, P.; Du, Q. Spatial and social media data analytics of housing prices in Shenzhen, China. PLoS ONE 2016, 11, e0164553.
  56. King, T.L.; Bentley, R.J.; Thornton, L.E.; Kavanagh, A.M. Using kernel density estimation to understand the influence of neighbourhood destinations on BMI. BMJ Open 2016, 6, e008878.
  57. Hasan, S.; Zhan, X.; Ukkusuri, S.V. Understanding urban human activity and mobility patterns using large-scale location-based data from online social media. In Proceedings of the 2nd ACM SIGKDD International Workshop on Urban Computing, Chicago, IL, USA, 11–14 August 2013; p. 6.
  58. Li, L.; Goodchild, M.F.; Xu, B. Spatial, temporal, and socioeconomic patterns in the use of Twitter and Flickr. Cartogr. Geogr. Inf. Sci. 2013, 40, 61–77.
  59. Sun, Y.; Fan, H.; Li, M.; Zipf, A. Identifying the city center using human travel flows generated from location-based social networking data. Environ. Plan. B Plan. Des. 2016, 43, 480–498.
  60. Yuan, N.J.; Zheng, Y.; Xie, X.; Wang, Y.; Zheng, K.; Xiong, H. Discovering urban functional zones using latent activity trajectories. IEEE Trans. Knowl. Data Eng. 2014, 27, 712–725.
  61. Li, H.; Ge, Y.; Hong, R.; Zhu, H. Point-of-interest recommendations: Learning potential check-ins from friends. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 975–984.
  62. Ullah, H.; Wan, W.; Haidery, S.A.; Khan, N.U.; Ebrahimpour, Z.; Luo, T. Analyzing the spatiotemporal patterns in green spaces for urban studies using location-based social media data. ISPRS Int. J. Geo-Inf. 2019, 8, 506.
  63. Liu, Q.; Ullah, H.; Wan, W.; Peng, Z.; Hou, L.; Qu, T.; Haidery, S.A. Analysis of green spaces by utilizing big data to support smart cities and environment: A case study about the city center of shanghai. ISPRS Int. J. Geo-Inf. 2020, 9, 360.
  64. Maia, M.; Almeida, J.; Almeida, V. Identifying user behavior in online social networks. In Proceedings of the 1st Workshop on Social Network Systems, Glasgow, UK, 1–4 April 2008; pp. 1–6.
  65. Demographia World Urban Area. 2019. Available online: (accessed on 14 June 2019).
  66. Xiao, Y.; Wang, Z.; Li, Z.; Tang, Z. An assessment of urban park access in Shanghai–implications for the social equity in urban China. Landsc. Urban Plan. 2017, 157, 383–393.
  67. Roberts, H.; Sadler, J.; Chapman, L. Using twitter to investigate seasonal variation in physical activity in urban green space. Geo Geogr. Environ. 2017, 4, e00041.
Figure 1. Location of the study area and its green parks.
Figure 2. Criteria for the collection of check-in data.
Figure 3. The data preparation process.
Figure 4. The methodology.
Figure 5. The different categories of green parks.
Figure 6. The distribution of the check-ins.
Figure 7. The density of the different categories of parks.
Figure 8. Daily check-ins: (a) Number of check-ins; (b) percentage of the total number of check-ins.
Figure 9. Weekly check-ins: (a) Number of check-ins during the week; (b) number of weekly check-ins in different categories.
Figure 10. Gender differences: (a) In districts; (b) in categories.
Figure 11. Seasonal check-ins: (a) Number of check-ins during the seasons; (b) number of check-ins in the different categories of parks among the seasons.
Table 1. Description of the green parks.
| Park Type (N = 122) | Description |
| --- | --- |
| Recreational park (n = 17) | Recreational green parks consist of botanical gardens, children’s parks, zoos and sports fields, categorized by variable locations and sizes. |
| Cultural relic park (n = 8) | Cultural relic green parks are home to ancient valuable relics and are places that are situated in urbanized areas, which are of value in terms of education and tourism. |
| Large urban park (n = 12) | Large urban green parks are community spaces that serve a wide range of inhabitants, categorized by adequate services, various activities and large sizes. |
| Natural park (n = 9) | Natural green parks are categorized by a natural background and have environmental value, situated in suburban areas. |
| Community park (n = 30) | Community green parks are spaces that are situated in residential areas for local inhabitants’ recreational purposes. |
| Neighborhood park (n = 46) | Neighborhood green parks are tiny or small residential urban green spaces that are situated in a certain residential district, serving a much smaller population in comparison to community parks. |
Table 2. Example of a Weibo check-in in CSV (comma-separated values) format.
Table 3. Final multiple linear regression model interpretation.
| Coefficients | Estimate | Std. Error | t-Value | Pr (>\|t\|) | Significance |
| --- | --- | --- | --- | --- | --- |
| July 2014–June 2015 | 0.061276 | 0.077947 | 0.786 | 0.431952 | - |
| July 2015–June 2016 | 0.320808 | 0.082899 | 3.87 | 0.000115 | *** |

Significance codes: ‘***’ 0.001; ‘**’ 0.01; ‘*’ 0.05; ‘.’ 0.1; and ‘-’ 1.
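The t-values in Table 3 can be cross-checked from the other two columns, because the t-statistic of a regression coefficient (for the null hypothesis that the coefficient is zero) is its estimate divided by its standard error. The following minimal Python sketch uses only the values printed in Table 3; the dictionary keys and function name are illustrative labels.

```python
# Values copied from Table 3: (estimate, standard error, reported t-value).
rows = {
    "2014-2015": (0.061276, 0.077947, 0.786),
    "2015-2016": (0.320808, 0.082899, 3.87),
}


def t_statistic(estimate, std_error):
    """t = estimate / standard error (null hypothesis: coefficient is zero)."""
    return estimate / std_error


# Recompute the t-value for each period and keep the gap to the reported one.
recomputed = {}
difference = {}
for period, (estimate, std_error, reported_t) in rows.items():
    recomputed[period] = t_statistic(estimate, std_error)
    difference[period] = abs(recomputed[period] - reported_t)
```

Both recomputed values agree with the reported t-values to within rounding, which supports the column assignment used when reading the table.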
Table 4. Regression statistics summary.
| Residual Standard Error | Degrees of Freedom | Multiple R-Squared | Adjusted R-Squared | F-Statistic | p-Value |
| --- | --- | --- | --- | --- | --- |
Table 5. Analysis of variance (ANOVA). Response: Number of Check-ins.
| Term | Df | Sum Sq | Mean Sq | F-Value | Pr (>F) | Significance |
| --- | --- | --- | --- | --- | --- | --- |
| July 2014–June 2015 | 1 | 138 | 138.2 | 3.258 | 0.071333 | . |
| July 2015–June 2016 | 1 | 1106 | 1105.9 | 26.08 | 3.81E−07 | *** |

Significance codes: ‘***’ 0.001; ‘**’ 0.01; ‘*’ 0.05; ‘.’ 0.1; and ‘-’ 1.
Table 6. Gender differences among districts.
Table 7. Gender differences among categories.
| Gender | Day | Community Parks | Cultural Relic Parks | Large Urban Parks | Natural Parks | Neighborhood Parks | Recreational Parks |
| --- | --- | --- | --- | --- | --- | --- | --- |
Table 8. Daily, seasonal, and district check-in trends.