1. The Importance of Urban Green Areas and Ways to Analyze Their Role or Characteristics in the Urban System
While every city is unique in its characteristics, the universal aspect of all cities is their complexity [1]. One element of this complexity is the constant movement of hundreds of thousands or even millions of people, who also spend time in public places such as parks. Parks are essential public places and play a central role in a city’s livability, primarily because they offer social contact, exercise and restorative recreation. Furthermore, urban green areas have various effects on humans [2], partially as ecosystem services [3]. Access to green spaces has been shown to be directly related to well-being through the influence of these areas on physical and mental health [4]. This influence is discernible mostly in changes in air quality [6], land surface temperature [6], physical activity [6], social cohesion [7], community identity [15], and stress reduction [7]. Therefore, analyzing the various effects of parks and how they are perceived is gaining increasing interest among researchers from different fields [18]. For instance, a growing body of literature deals with the analysis of factors determining urban green space use among residents. The most relevant factors for parks are functionality and facilities [7], safety and access [11], or even size or perceived greenness [25]; on the park visitors’ side, many personal characteristics, ranging from age or ethnicity to health conditions, are determinant when selecting a park to visit [12].
Good access to urban green spaces is of increasing relevance in the design of livable, healthy and sustainable cities [7]. Having a park within a 10–15 min walking distance of residents’ homes is also often considered a factor of livable cities. However, the literature contains contradictory observations regarding the distance to a park from home and its relevance in people’s decisions about which park to visit. Some studies completely neglect spatial aspects (mostly when using Twitter data and performing sentiment analysis) [7], while others consider only the spatial aspects of park visits but not functionality or other attracting factors [31]. In some other studies, either only the closest green area is considered, or the results show that having a park within less than a kilometer is more important than other factors [20], also for improving health [35]. However, there are also results showing that people visited parks that were further away even if they had green areas nearer to their home, partially due to differences between perceived and real distances [29]. Also, if the purpose of the park visit is physical activity, distance might be less likely to be a predictor of choice [38]. Only a limited number of studies have addressed accessibility in a holistic way [39], some even analyzing its direct effect on physical activity or health [41].
Most decision makers and urban planners intend to make public places livable [43]. However, livability strongly depends on people’s values and, therefore, their expectations, which means that planners should try to explore these expectations on an individual scale [46]. Asking people directly about the characteristics of their trips or, for example, their expectations when visiting a park (the traditional questionnaire method, which may even be combined with in situ observations) can be time- and resource-consuming while providing site-specific results that are less transferable. Also, the information produced by such investigations still represents only a subset of temporal and spatial characteristics. Twitter data analysis, by contrast, is limited mostly by data accessibility: once the required data is available, the analysis can be performed on scales ranging from intra-urban to global and for any period ranging from a few hours to several years. Recent developments in Geographic Information Systems (GIS)-based social media analysis offer the possibility to explore spatial, temporal and even affective aspects of users’ behavior, including in public spaces and during park visits [28]. However, some of these analyses still have limitations due to the manual interpretation of only a relatively low number of social media posts.
Several analysis efforts have used social media data for urban planning purposes in recent years, and the field of application is diverse and growing, ranging from more straightforward tasks to rather complex analyses, e.g., the detection of urban form and function [49]. In general, Twitter and other social media platforms are often used to analyze human activity and mobility on scales ranging from intra-urban to global [50], because these two phenomena are almost impossible to trace on finer spatial and temporal scales using traditional methods such as questionnaires or quantitative observations (e.g., population counts). Furthermore, social media data can be used for socio-spatial analysis [59], for instance, by extracting the content of the tweets [60] or by investigating emotions and how they vary over space and time [64], also considering health factors such as diet or physical activity [65]. Campagna [70] proposed the concept of “Social Media Geographic Information” (SMGI) as a way of investigating “people[’s] perceptions and interest in space and time” and thereby supporting spatial planning and geodesign, also by means of Spatial-Temporal Textual Analysis (STTx). Combined with other sources of data, such as mobile phone data, social media data can describe the spatiotemporal characteristics of the urban environment even more accurately [71]. Because of its fine spatial and temporal scale, social media data also has great potential for the detection [72] and analysis of events [73] or disasters [78], and of their effect on daily urban planning routines [80].
The goal of our analysis—similarly to SMGI—was to illustrate the possibilities of using social media (Twitter) data to extract spatial and temporal patterns of park visits for urban planning purposes, along with the sentiment of the tweets to represent how positive or negative a given post was, focusing on frequent Twitter users. Thereby, we intended to answer the following research questions:
Spatial aspects: What are the spatial characteristics of the selected users’ tweeting behavior and how do these characteristics relate to their park visits? In terms of parks, how far do the visitors travel on average to visit a given park from their center of activity?
Content aspects: Are tweets in parks more positive than in other urban areas? What feelings do the visitors have when spending time in a park? How does this vary between parks?
Temporal aspects: How do the spatial and sentiment characteristics vary over time? Are there any significant differences during the day, week or year?
Profiles: What types of parks and park visitors can we classify based on the identified spatial, temporal, and sentiment characteristics? What do we learn about them?
Indubitably, every park and park visitor can be unique, and, in a large city, it is hard to answer these questions for every individual. Compared to traditional questionnaires, where most of the focus is on only one or a few locations, big data or social media data allows every park and thousands of visitors within the city to be considered, not only as individual entities in isolation but also as a set of comparable characteristics. To overcome some of these limitations, a combined approach has emerged in planning, which can use the advantages of both quantitative and qualitative data analysis to some degree. Geo-questionnaires and public participatory GIS (PPGIS) have been developed over the past decade and have advanced our understanding of public preferences and even helped to legitimize decisions [81]. At the same time, we must recognize that, depending on the purpose of the study, social media analysis may not reach accuracies comparable to individual on-site studies [25], but it can still produce valuable input or added value as an overview of the general patterns. In that sense, social media analysis should be considered a complement to, not a replacement for, on-site field studies [85].
In this paper, we analyzed the spatiotemporal park visiting behavior of more than 4000 Twitter users for almost 1700 parks along with users’ feelings extracted from over 78,000 tweets posted in London, UK. The novelty of our research is the combination of spatial and temporal aspects of Twitter data analysis for park visits in a transferable way while applying sentiment and emotion extraction to also explore the content of the tweets, to overcome the limitation of traditional methods. In summary, the findings are aggregated to identify different types of parks and their visitors, serving as an input for further investigations.
As this analysis demonstrates, Twitter data is one promising resource to assess the characteristics of urban parks, analyzing spatial, temporal and content-specific aspects. The advantage of using this type of social media for urban green space analysis is that we can derive qualitative, fine-scale information for the entire city as input for more specific, in-depth investigations.
Despite the potential of big data and social media data in this kind of analysis, the representativeness of the gathered information is limited by uncertainties in the demographics of the users. However, we can at least infer that extremes in age (both very young and very old populations) and social situation (mostly poorer populations) translate into lower rates of social media use, and therefore under-representation in the data. In urban planning and in the analysis of urban green areas, such demographic attributes, along with others such as sex and ethnicity, are important factors. Unlike traditional census data, social media data usually contains no direct information about these factors, but there are methods to extract them indirectly [59]. While this was not part of the current study, it could add new insights to the analysis results.
The main limitation in our case, besides the general one of representativeness, was the low number of tweets per user, especially in the case of park tweets. This resulted in a less than ideal number of data points per user. Similarly, when we extracted sentiments and emotions, due to the limitations of the algorithm, most of the tweets were classified as neutral or had no identifiable emotion. This again strongly affected the number of tweets used for the analysis. Also, the sentiment analysis itself has uncertainty as it considers words individually and not in context, while some “strong” words can bias the overall sentiment score. Finally, our selection criteria did not specify whether the content of a given tweet from a park is about the park or not. Therefore, a higher number of negative tweets does not necessarily indicate bad park quality, but further investigation can specify the connection and reason for the observation.
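The word-level limitation described above can be illustrated with a minimal sketch. It assumes a generic lexicon-based scorer, not the specific algorithm used in this study; the lexicon `TOY_LEXICON`, its scores, and the function names are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicon (illustration only, not the lexicon used in the study).
TOY_LEXICON = {"good": 1.0, "nice": 1.0, "lovely": 2.0, "bad": -1.0, "awful": -3.0}


def word_level_score(text, lexicon=TOY_LEXICON):
    """Sum per-word scores; every word is looked up on its own, without context."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0.0) for token in tokens)


def label(score):
    """Map a numeric score to a coarse sentiment class."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Negation is ignored: "not bad" is scored as if it were "bad".
negated = word_level_score("The park is not bad")

# One "strong" word outweighs two mildly positive ones.
mixed = word_level_score("Nice day, good walk, awful traffic on the way")

# No lexicon word present, so the tweet ends up neutral.
plain = word_level_score("Walking the dog")
```

In this sketch the first two sentences are labeled negative and the third neutral, which mirrors the three effects noted in the text: context-free scoring, bias from strong words, and a large share of neutral results.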
Another issue emerged from the categorization of users as residents. The method has uncertainties, as we are not able to validate it with official data sources. Although the selected users might not be residents, based on the number of tweets and their temporal distribution, we can at least conclude that their behavior is appropriate for investigation, as it has the potential to represent the activity patterns of average users, including patterns over the course of the year. Through this averaging process, we can derive more information, which is a common practice to improve the credibility of the results, e.g., [50].
Regarding the tweeting behavior in general, another consideration is that while users will not necessarily tweet every time they are in a park, the tweet frequency is still able to show relative differences between parks. On the other hand, even if users tweet, they might not share their location. As with demographic information, location data can also be extracted indirectly through various methods, which might be relevant for future investigations.
Finally, the transferability of the methods is important and was considered in our analysis. As mentioned above, this is the reason we used OpenStreetMap, because, depending on the availability of Twitter data, urban green areas in any city can be analyzed following this methodology, making even global comparisons possible. However, access to tweets, in general, is usually not free of charge, especially for longer periods of time, and this constraint can negatively influence the transferability of the methods.
This study has tested an exploratory methodology to investigate spatial, temporal, and affective patterns of park visits for urban planning purposes using Twitter data of frequent users, and thereby to define profiles of parks and their visitors. The performed analyses yielded new insights about the visitors and use patterns of urban parks in London. In particular, we found that most users tend to tweet from parks that are located 3–4 km away from their COM and the average distance between a park and its visitors’ COM increases towards the outer areas of the city. Even though social media data is not appropriate to investigate motivation and determining factors of park visits directly, these higher average distance values suggest that absolute distance to a park has a lower priority in deciding which park to visit. Nevertheless, the larger absolute distance can still imply good accessibility.
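As a rough illustration of the distance measure discussed above, the following sketch computes a user's COM as the arithmetic mean of their tweet coordinates and the straight-line (great-circle) distance from that point to a park. Both choices are assumptions made for illustration; the function names and the sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def center_of_mass(points):
    """Arithmetic mean of (lat, lon) pairs in degrees.

    Adequate as an approximation at city scale, away from the poles
    and the antimeridian.
    """
    lats, lons = zip(*points)
    return sum(lats) / len(lats), sum(lons) / len(lons)


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


# Hypothetical tweet locations of one user and a hypothetical park location.
user_tweets = [(51.50, -0.10), (51.52, -0.14)]
park = (51.53, -0.15)

com = center_of_mass(user_tweets)
distance_to_park_km = haversine_km(com, park)
```

Averaging `distance_to_park_km` over all visitors of a park would give a per-park value of the kind compared across the city in the text. A network-based (walking or transit) distance would be needed to say anything about accessibility.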
In terms of sentiments, statistical analysis confirmed a significantly higher number of positive tweets in parks than in other urban areas when considering tweets on an individual level. However, if we do not distinguish individual users in the analysis, this difference is less obvious. Regarding emotions, joy and anticipation are significantly more frequent in parks than outside of them, whereas all other emotions are more common in non-park areas, and these proportions can also vary from park to park. The temporal distribution of the tweets mostly corresponded to general expectations, with more tweets in the afternoon, on weekends, and in summer, although surprisingly there were more tweets from parks during the winter than in the fall in our analysis period. Interestingly, on the park level, there was hardly any observable temporal trend in the number of visitors or in the sentiment and emotion of the tweets. In summary, we identified four groups of parks based on their visitors’ characteristics, emotions, and how visits to these parks and the emotions in the tweets posted there vary over time.
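One common way to compare the share of positive tweets inside and outside parks is a pooled two-proportion z-test. The sketch below is an assumption about how such a comparison could be set up, not the test used in this study; the function name is hypothetical and the counts are invented purely to exercise the code.

```python
import math


def two_proportion_z(pos_a, n_a, pos_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = pos_a / n_a
    p_b = pos_b / n_b
    pooled = (pos_a + pos_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("test undefined when all or no tweets are positive")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented counts: 60 of 100 park tweets vs. 40 of 100 non-park tweets positive.
z, p_value = two_proportion_z(60, 100, 40, 100)
```

This pooled form treats every tweet as independent. A per-user comparison, as described in the text, would instead pair each user's park and non-park tweets, which is one reason the two levels of analysis can give different results.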
While the methodologies and technologies of spatiotemporal social media analyses are developing fast, it seems that GIScience needs to work towards the exploration of causal relationships and the realization of a GIS of place. This study may contribute to the incorporation of the traditional quantitative spatial analytical tools of GIS with “non-traditional” data towards the realization of GIS as a hypothesis-generator. What is needed is not so much the development of many more analysis models, but rather, new ways of integrating mixed-methods approaches that incorporate a sense of place.
Although social media data analysis has its limitations, we have shown that an exhaustive spatial, temporal and content analysis can provide valuable information by capturing general trends, serving as input for more in-depth analysis and field research with more specific purposes for urban planners and decision makers.