Evaluating Cultural Impact in Discursive Space through Digital Footprints

Abstract: The research presented in this paper describes an evaluation of the impact of spatial interventions in public spaces, measured through social media data. This contribution aims at observing the way a spatial intervention in an urban location can affect what people talk about on social media. The test site for our research is Domplatz in the center of Hamburg, Germany. In recent years, several actions have taken place there, intending to attract social activity and spotlight the square as a landmark of cultural discourse in the city of Hamburg. To evaluate the impact of this strategy, textual data from the social networks Twitter and Instagram (i.e., tweets and image captions) were collected and analyzed using natural language processing (NLP). These analyses identify and track the cultural topic, or "people talking about culture", in the city of Hamburg. We observe the evolution of the cultural topic and whether its levels of activity correspond with certain intervention actions carried out in Domplatz. Two analytic methods of topic clustering and tracking were tested. The results show successful topic identification and tracking with both methods, the second one being more accurate. This means that it is possible to isolate and observe the evolution of the city's cultural discourse using NLP. However, it is shown that the effects of spatial interventions in our small test square have a limited local scale, rather than a city-wide relevance.


Digital Traces of Discursive Space
A notion of something beyond streets and buildings in urban space matures over modern and postmodern understandings of cities as twofold constructions assembled by tangible space and the people inhabiting it. This notion emerges with the acknowledgment of certain aspects of space, continues with the study of its role in identity formation from a psychosocial perspective [1], and expands into sophisticated socio-spatial theories. Looking at the definition of space from multiple perspectives, its human dimension is mentioned in most descriptions. Rapoport's [2] human-environment formulation described a reciprocity in which people shape the environment and places influence people. According to this theory, people and places cannot be conceptualized without each other; individuals construct places as a consequence of everyday social practice, and spaces have an impact on people's cognition, behavior, identity, and the whole construction of the self [3][4][5].
This dualism between the phenomenological component of places (social, symbolic, discursive, intangible) and the formal reality of space (tangible physical elements) is addressed by several research fields that share a common standpoint: the intersection between the two. The study of cities as a set of social practices from an ethnographic perspective is based on Di Masso and Dixon's [6] definition of place-assemblage, which recognized the intertwining and interdependence of physical space as a continuum and the individual human practices, discourses, and interactions that take place in it. In this context, it is possible to extrapolate these approaches to the contemporary era, where many of these processes take place in the digital realm: a digital space constructed from a socio-discursive fabric. In more detail, Saker and Evans [7] explored the way online data are not only a consequence of the use of space; in many cases they also affect users' experience of urban space by generating new interactions, creating a sense of familiarity, or modifying the way they walk through the city [8][9][10].
Several scholars acknowledge the complexity of the current approach to space at multiple levels of tangibility. The term "hybrid space" defines a combination of physical and digital spaces in such a way that activity in one entails a consequence in the other [7,11,12], making each a mediator of the other [7,13-15]. Thanks to the data mining of geotagged data, virtual information can be accessed to understand the way spaces function from this broader perspective. In this context, Kozinets [16] uses the concept of "netnography" to refer to a comprehensive area of research based on the digital traces of social practices. Ginzarly et al. [17] propose a framework of using digital tools based on crowd-sourced data to study urban cultural heritage. In doing so, they assume that the human-centered nature of social media data promotes the constitution of a shared heritage contained in collective memory, fostering a sense of identity and continuity. By posting images on social media as a form of cultural expression, people establish a conceptual link to "individual memory, narratives, and identity" [17] (p. 7). This approach validates research on immaterial and symbolic realms through crowd-sourced data.

Domplatz Square and SmartSquare Project
The case study on which this paper focuses is a small square and park in the core of Hamburg, Germany. The Cathedral Square (Domplatz) was the original site where the city's first settlement "Hammaburg" built its fortress, and it was subsequently the center of intellectual and cultural discourse for centuries. Used as a parking lot until the last decade of the 20th century, the square has recently been transformed into an accessible park with garden areas to give the historic site its place in the city. Due to its location between the Mönckebergstraße shopping street and the business area of HafenCity, Domplatz represents a transit space. Initial observations indicate that people use the square for walking through rather than for stationary activities; it is used more as a street than a plaza. With the goal of revitalizing the square and inviting people in, the German Federal Ministry of Education and Research (BMBF) funded a project to investigate the potentials of this place. Several actions and interventions have taken place in Domplatz and its surroundings, such as the implementation of "Hammabot", an autonomous WhatsApp conversation bot; the physical placement of information screens in buildings facing the square; or the scheduling of physical events as part of the citywide initiative "Long Night of the Museums" (LNDM), offering tours and talks in several museums and cultural institutions. The ultimate intention of these interventions is to increase the visibility of Domplatz as a landmark in the city's cultural discourse.

Objectives of This Work
After the aforementioned events that took place in 2017, 2018, and 2019, the question of whether they were perceived not only by the physical visitors but also by a wider, potentially asynchronous public remained unresolved. For this reason, we raised the need to develop a methodology capable of assessing the impact of cultural interventions, physical or digital, in the discursive space of a city. Due to the unprecedented characteristics of our case, the main hypothesis raised would be the validity of the approach and the suitability of the methodology, answering the questions "Does culture have a measurable impact in the discursive space of (smart) cities?" and "Can we measure it?". Answering these questions would not only help to increase knowledge about our particular case study space, but the methodology developed would also create a replicable set of procedures for addressing similar cases in different spatial and temporal contexts, "intended to enable other public places to be revitalized later on" [18] (pp. 1-2). Therefore, the primary goals of this work are to detect and track changes in discursive space related to digital and physical interventions, and to establish a standardized methodology for social media topic detection and topic tracking. To do so, we tested two analytic methods on Twitter and Instagram: (1) top expert-term set selection, and (2) artificial topic construction and tracking.

Materials and Methods
The methodology followed in this work consisted of two main sections, the preparation of data (Section 2.1) and the construction and tracking of topics (Sections 2.2 and 2.3). The first section included collecting the data, filtering and refining them, and preprocessing them according to natural language processing (NLP). Once these processes were complete, the second section describes the way topics were modeled and tracked, addressing "what people talk about". Two procedures were tested in a comparative methodology: (1) top expert-term set selection, and (2) artificial topic construction and tracking. The first method models topics independently over several weeks and defines a set of relevant topics based on predefined "expert terms". The second method creates an artificial topic that contains "expert terms" and models subsequent topics using the parameters of the artificial topic.

To test our experimental methodology, the processes described in this paper were applied to a sample of "relevant weeks" in which events, interventions, or implementations of solutions within the boundaries of the case-study area occurred. In addition, we added a set of "control weeks" in which no specific event took place (Table 1). Each week was treated separately as an independent data set.

After a closer look at the content of the posts, it was found that a remarkable number of them were uploaded by a small number of users and that the content was an automatically generated message repeated several times. Several researchers have confirmed the increasing amount of automated posts or "bots" on social media, which have generated more traffic than human posts on Twitter since 2016 [19]. Since our focus was placed on the human discursive space, posts written by bots were filtered out of the topic detection process. To do so, we set a threshold for the maximum frequency of posts per user for a user to be included in the data set.
While it is true that highly engaged human users risked being excluded, the larger goal of our research justified this bias, as the method excluded a large portion of bot-generated posts and spam, such as automated weather updates, job listings, and train station schedules. An alternative, more detailed method is described in the "Further Development" section below, following the work of Efthimion, Payne, and Proferes [19] and considering additional variables.
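As an illustration, the frequency-threshold filter could be sketched as follows; the `max_posts_per_user` value and the post structure are assumptions for this sketch, not the threshold used in the study.

```python
from collections import Counter

def filter_probable_bots(posts, max_posts_per_user=50):
    """Drop all posts from users whose posting frequency exceeds a threshold.

    `posts` is a list of dicts with a 'user_id' key; the default threshold
    is an illustrative assumption, not the value used in the paper.
    """
    counts = Counter(p["user_id"] for p in posts)
    # keep only posts by users at or below the frequency threshold
    return [p for p in posts if counts[p["user_id"]] <= max_posts_per_user]
```

With a threshold of 2, a user who posted three times (e.g., a weather bot) is removed entirely, while an occasional poster is kept.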

Preprocessing Pipeline
Once the posts were retrieved from their sources and structured into independent data sets for each of the sample weeks, the data sets were preprocessed using an NLP pipeline. The preprocessing phase was a semi-supervised set of algorithms in which the textual content of each post was interpreted and pre-classified. For a correct interpretation of the content, the following sub-processes were defined within the NLP preprocessing pipeline:

1. Filtering domain-specific stopwords: stopwords were filtered out in order to eliminate the most common terms in a language (e.g., "the", "a", "and"), which carry no semantic information and thus have no influence on topic identification and modeling.

2. Detection of phrases: the NLP pipeline was designed to work with words as independent entities. However, some case-specific phrases composed of more than one word were considered. To do so, a separate training process took a manually generated set of the most common phrases expected to appear, in order to identify them as such.

3. Identification and classification of emojis and emoticons as (1) emotion or sentiment, (2) action or activity, or (3) other. The first two categories were integrated within separate processes of mood extraction and activity detection.

4. Detection and grammatical categorization of "clusterable" words: following part-of-speech tagging (POS-tagging) procedures, each term was attached to a grammatical category determined by its own definition and its context within the sentence.
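The four preprocessing steps above might be sketched, in a highly simplified and dependency-free form, as follows; the tiny stopword, phrase, emoji, and POS tables are illustrative placeholders for the real resources (a production pipeline would rely on a proper tokenizer and POS tagger, e.g., spaCy).

```python
# Illustrative stand-ins for the pipeline's real resources (assumptions).
STOPWORDS = {"the", "a", "an", "and", "in", "of"}          # step 1 (excerpt)
PHRASES = {("long", "night"): "long_night"}                # step 2, manually set
EMOJI_CLASSES = {"🙂": "emotion", "🏃": "action"}           # step 3 (excerpt)
POS_LOOKUP = {"museum": "noun", "visit": "verb", "old": "adjective"}  # step 4 stand-in

def preprocess(text):
    # step 1: drop stopwords after naive tokenization
    tokens = [t for t in text.lower().split() if t not in STOPWORDS]
    # step 2: merge known multi-word phrases into single tokens
    merged, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in PHRASES:
            merged.append(PHRASES[pair]); i += 2
        else:
            merged.append(tokens[i]); i += 1
    # step 3: classify emojis; step 4: tag clusterable words
    emojis = [(t, EMOJI_CLASSES[t]) for t in merged if t in EMOJI_CLASSES]
    clusterable = [(t, POS_LOOKUP[t]) for t in merged if t in POS_LOOKUP]
    return {"tokens": merged, "emojis": emojis, "clusterable": clusterable}
```

Running `preprocess("Visit the old museum in Long Night 🙂")` merges "long night" into one token and tags "visit", "old", and "museum" with their grammatical categories.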
The computational processes of the NLP pipeline output the following information from each social media post:

• "Clusterable words": a list of terms carrying semantic load and able to be clustered, together with their grammatical category, i.e., adjective, verb, or noun.

• Language in which the text was written, filtering only English and German posts (case-specific process).

• URL links included in the text.

In addition, the following metadata were extracted from each post and merged with the NLP output:

• Date and time at which the post was generated by the user.

• Numeric ID, generated as a random integer per username to ensure anonymization.

• The social network in which the post was generated, Twitter or Instagram.

Topic Modeling
After all the posts were preprocessed by the NLP pipeline, topics were modeled using the "clusterable words" extracted from each post. As in the well-known "k-means" cluster generation method, where the number of output clusters "k" must be specified in advance, the topic modeling process took the number of topics to be generated as input. In order to establish a consistent and replicable methodology that can be implemented on data sets of different sizes and sources, the number "k" must be determined by a standardized procedure. To this end, a coherence analysis was performed on each data set prior to topic modeling. This analysis correlated different numbers of topics with the degree of coherence measured in each of them and set an average value as output. Topic coherence was measured by combining segmentation, probability estimation, and a confirmatory measure, and setting a final aggregation value. The evaluated coherence tended to describe a curve with a distinct peak in the average aggregation value for all topics, the "coherence score" (Figure 1). The number of topics corresponding to the first peak in the coherence score was selected as the optimal number of topics "k" to extract from each data set. Topics were extracted using the Latent Dirichlet Allocation (LDA) method.
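The selection of "k" at the first peak of the coherence curve could be sketched as follows; the coherence scores themselves are assumed to come from an external coherence model (e.g., gensim's CoherenceModel) and are passed in precomputed.

```python
def first_peak_k(coherence_by_k):
    """Return the number of topics k at the first local peak of the
    coherence curve.

    `coherence_by_k` is a list of (k, coherence_score) pairs ordered by
    increasing k. Falls back to the global maximum if the curve has no
    interior peak (e.g., it is monotonically increasing).
    """
    scores = [s for _, s in coherence_by_k]
    for i in range(1, len(scores) - 1):
        # a first local peak: higher than the previous point,
        # not lower than the next one
        if scores[i] > scores[i - 1] and scores[i] >= scores[i + 1]:
            return coherence_by_k[i][0]
    return max(coherence_by_k, key=lambda p: p[1])[0]
```

For a curve that rises to a peak at k = 3 and then declines, the function returns 3 even if a later k scores almost as high.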

Expert Terms
Expert terms were domain-specific words from a given semantic field, listed manually. Expert terms did not play a role in modeling the topics; rather, they were used to evaluate the relevance of the topics modeled. We evaluated posts that might be related to cultural activities by observing the presence of terms from the same semantic field, such as "culture", "museum", or "exhibition".
The expert terms were set by selecting the most frequent words from the book Mythos Hammaburg, a publication covering the most important cultural topics related to the spatial location of our case [21], together with a manually set list of terms related to the semantic field of culture. Both lists of terms were extended with an algorithm that used Word2Vec embeddings pre-trained on news articles and Wikipedia. Using this model, it was possible to find at least 10 synonyms for each of the terms from the original lists and output a joint list of terms composed of (1) case-specific keywords from the book Mythos Hammaburg, (2) generic terms from the semantic field of culture, and (3) 10 synonyms for each of the words in (1) and (2). After the expansion of the terms was completed, a semi-supervised refinement process was performed on the output "expert term" list to eliminate duplicate terms.
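The assembly of the joint expert-term list might look as follows; `synonym_fn` is a hypothetical stand-in for a Word2Vec similarity query (e.g., gensim's `most_similar`), and the semi-supervised refinement is reduced here to duplicate removal.

```python
def build_expert_terms(book_keywords, generic_terms, synonym_fn, n=10):
    """Assemble the joint expert-term list: case-specific keywords,
    generic culture terms, and up to `n` synonyms of each, deduplicated.

    `synonym_fn` stands in for a Word2Vec similarity query returning a
    list of synonyms for a term; it is an assumption of this sketch.
    """
    base = list(book_keywords) + list(generic_terms)
    expanded = []
    for term in base:
        expanded.append(term)
        expanded.extend(synonym_fn(term)[:n])
    # refinement reduced to duplicate removal (case-insensitive)
    return sorted(set(t.lower() for t in expanded))
```

With a toy synonym table mapping "culture" to {"heritage", "arts"} and "museum" to {"exhibition"}, the joint list contains each term exactly once.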

Top Expert-Term Set Selection
This methodology was characterized by a phase in which the presence of expert terms was evaluated in each week. To this end, after preprocessing each week (Section 2.1.1, Preprocessing Pipeline) and generating the different topics posted about (Section 2.1.2, Topic Modeling), statistics on the frequency of expert terms mentioned in each week were extracted and normalized by the total number of posts per week, to obtain an average ratio of expert terms per post for each week. The week with the highest ratio of expert terms per post was considered the "key week" (week 39 of 2018, with a ratio of 0.018 expert terms per post, measured in 45,297 posts), in which the discursive space may contain a higher number of references mentioning general cultural topics or case-specific events and spatial interventions related to Domplatz.
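The "key week" selection could be sketched as follows, assuming each week's posts are already tokenized by the NLP pipeline; the names and data shapes are illustrative.

```python
def key_week(posts_by_week, expert_terms):
    """Pick the week with the highest ratio of expert terms per post.

    `posts_by_week` maps a week label to a list of token lists (one per
    post); returns (week, ratio). Data shapes are assumptions of this sketch.
    """
    expert = set(expert_terms)
    best_week, best_ratio = None, -1.0
    for week, posts in posts_by_week.items():
        # count every occurrence of an expert term across the week's posts
        hits = sum(1 for tokens in posts for t in tokens if t in expert)
        ratio = hits / len(posts) if posts else 0.0
        if ratio > best_ratio:
            best_week, best_ratio = week, ratio
    return best_week, best_ratio
```

A week whose posts mention "museum" and "culture" once per post would beat a week of posts about trains and weather.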
After a "key week" was identified, the topics generated were observed in detail. Topics with a higher ratio of expert terms per post were considered "relevant topics." To determine the number of relevant topics to consider, a clustering procedure was performed by looking at the ratio of expert terms per post in each week separately and grouping the topics from the highest to the lowest value. The result of this calculation was three clusters, with cluster 1 being the highest ratio of expert terms per post and cluster 3 being the lowest, as shown in Table 2. These "relevant topics" were formed by specific posts during the "key week". Consequently, the topics in all other weeks needed to be generated again, following the definition of the "relevant topics" in the "key week". The non-key weeks went through the topic modeling process (Section 2.1.2, Topic Modeling) using a testing algorithm in which the topics were not generated from scratch, but the properties of the topics in the "key week" were replicated to model new topics in the non-key weeks.
As a result of this second topic modeling, all weeks were standardized to the same topics, so that each topic could be identified in all weeks. The evaluation of the presence of the "relevant topics" in all weeks allowed the detection of changes in the discursive space.

Artificial Topic Construction and Tracking
This methodology consisted of creating an artificial data set of posts that was intended to contain strong topic segmentation. Artificial posts were created to intentionally contain a large number of expert terms. To do so, phrases from the publication Mythos Hammaburg were used to create an artificial data set, which combined these phrases with a sample of tweets from "control weeks" (in which no cultural events took place in our case-study area) in equal proportions. The resulting data set was preprocessed using the NLP pipeline, and topics were modeled. This methodology generated two topics: (1) a "relevant topic" containing a majority of artificial posts generated from Mythos Hammaburg and related to the case-specific cultural topic, and (2) a "non-relevant" topic generated by a majority of real posts and unrelated to our case-specific cultural topic. The relevant topic was expected to gather more artificially generated posts, since they were created from a text with a strong cultural discourse. The same process was repeated for all control weeks (Table 3, which summarizes the messages by week and the topic in which they were modeled). The control week with the highest number of artificially generated posts and the lowest number of real posts from social media was considered the "key week" (bolded in Table 3). After identifying the "relevant topic" in the "key week", all other weeks were standardized with the same topics by using a testing algorithm in which the characteristics of the topics in the "key week" (artificial data set) were replicated to model the topics in all other weeks. As a result, all weeks were standardized with the same two topics, one relevant and one irrelevant, so that the relevant topic could be identified in all weeks. The evaluation of the presence of the relevant topic over time was also associated with the detection and evaluation of changes in the discursive space, which allowed us to observe the activity level of the cultural topic.
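The construction of the artificial data set might be sketched as follows; the `source` label on each post is an addition of this sketch so that topic membership can be checked afterwards, and the shuffling seed is only for reproducibility.

```python
import random

def build_artificial_dataset(book_phrases, control_tweets, seed=0):
    """Mix book phrases and control-week tweets in equal proportions.

    Equal proportions are enforced by truncating both inputs to the same
    length; each post is labeled with its source (an assumption of this
    sketch, used later to check which topic it lands in).
    """
    n = min(len(book_phrases), len(control_tweets))
    posts = ([{"text": p, "source": "artificial"} for p in book_phrases[:n]]
             + [{"text": t, "source": "real"} for t in control_tweets[:n]])
    random.Random(seed).shuffle(posts)  # interleave the two sources
    return posts
```

Given three book phrases and two control tweets, the result contains two posts from each source, shuffled.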

Activity in Social Media
Visualizing tweets and Instagram posts by frequency over time allowed for the detection of general activity trends without specific linkage to a particular topic: the aggregate activity of the discursive space as a consequence of physical events in the city (e.g., a marathon) or of immaterial actions (e.g., a TV contest). This visualization captured when users were active online over the course of the day, and therefore statistics about the average activity per hour could be extracted.
Comparing Instagram against Twitter in terms of their daily activity patterns, one noticed some differences, such as (1) Twitter users started to be active earlier than Instagram users, with a sharper increase in tweets in the early morning, and (2) Twitter's activity peak was closer to the morning hours between 9 and 11 A.M. and decreased at 5 P.M. (tweets during working hours), while Instagram users' peak was closer to the late evening between 5 and 7 P.M. (Instagram pictures after work) (Figure 2).
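The hourly activity statistics behind such curves could be computed along these lines; timestamps are assumed to be ISO-8601 strings, and normalization by the number of days in the sample is left to the caller.

```python
from collections import Counter
from datetime import datetime

def hourly_activity(timestamps):
    """Count posts per hour of day across the whole sample.

    `timestamps` are ISO-8601 strings; returns a dict with an entry for
    each of the 24 hours. Divide by the number of days covered to obtain
    an average hourly activity curve.
    """
    counts = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    return {h: counts.get(h, 0) for h in range(24)}
```

Two morning posts and one evening post yield counts of 2 at hour 9 and 1 at hour 18, mirroring the Twitter-versus-Instagram contrast described above.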
The consistency of these two trends across all weeks, and their similarity to activity studies from other disciplines, such as energy demand curves [22] or traffic congestion curves [23], indicated the validity of the data collection and cleaning methods presented in this paper from a conceptual perspective. Furthermore, social media use was associated with strong routine activity patterns, in line with Manovich's [24] approach to social media as portraying "the everyday" rather than "the extraordinary", which situates online social content within the narrative of mass cultural production.

Topic Detection and Validation
To validate the topic identification process, we evaluated the presence of "expert terms", as it was independent of the topic modeling process. Following the methods described in Sections 2.2 and 2.3, we modeled topics in a "key week" and exported the properties of those topics to the rest of the weeks, so that all weeks were standardized by containing the same topics. Therefore, all weeks contained a "relevant topic" with the highest ratio of "expert terms" per post. As a validation, we calculated the ratio of "expert terms" in all topics of all weeks and checked whether the highest ratio was indeed defined by the "relevant topic" (Table 4).

Table 4. Sample of results displaying the ratio of expert terms per post in relation to the relevance of the topics extracted following the method "Artificial topic construction".
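The validation criterion, that the topic flagged as relevant indeed shows the highest expert-term ratio, can be expressed as a simple check; the data shape is an assumption of this sketch.

```python
def topic_validated(expert_ratio_by_topic, relevant_topic):
    """Check that the topic flagged as relevant has the highest ratio
    of expert terms per post among all topics of a week.

    `expert_ratio_by_topic` maps topic labels to their expert-term
    ratios (an assumed data shape for this sketch).
    """
    best = max(expert_ratio_by_topic, key=expert_ratio_by_topic.get)
    return best == relevant_topic
```

A week where the "relevant" topic scores 0.018 against 0.004 for the rest passes; a week where a non-relevant topic scores higher fails, as for the two misidentified weeks reported below.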

Forty-eight topics were modeled using the "Artificial topic construction" method. As explained above, these topics were marked as relevant or non-relevant. Looking at the ratio of expert terms per post, 14 of the 16 relevant topics were successfully identified (87.5%). Only two topics identified as relevant (week 23 of 2018 and week 41 of 2017) had a lower ratio of expert terms than the non-relevant topic.
The same validation was performed with topics generated using the "Top expert-term set selection" method, with a total of 176 topics. Here, we performed a cluster calculation in the original "key week" to determine the "relevant topics". Validation using the ratio of expert terms per post revealed that 13 of the 16 relevant topics (81.25%) were successfully identified.
In both methods, the topic generated under the label "no topic" contained the lowest ratio of expert terms per post in all cases (100%).

Long-Term Topic Tracking
The success rate of topic identification following the "Artificial topic construction" method was over 87%. This methodology was therefore replicated and upscaled to a single data sample of 787,550 tweets collected continuously from April 2017 to June 2019. By implementing the bot filtering method described in Section 2.1, we reduced the number of tweets to 271,305.
The continuous data set of 271,305 tweets was preprocessed using the NLP pipeline described in Section 2.1. Then, an artificial data sample was created with 50% of the phrases from the book and 50% random tweets, and two topics were modeled from them. These two topics were later replicated within the continuous data set, as described in Section 2.3. As a result, the relevant topic was detected within the entire time span and could be tracked as a continuous procedural unit.
The timeline graph in Figure 3 represents the continuous data sample, where the number of tweets for each topic was measured, as well as the total number of tweets found in the database. It was possible to observe certain peaks standing out from the overall trend, corresponding to events such as demonstrations (e.g., Fridays for Future), cultural events (e.g., the Eurovision contest, the UNESCO heritage report), sports (e.g., Formula 1 preparation, Bundesliga events), etc. The gray-shaded vertical sections are located in the periods when interventions were carried out by the research team, namely the kick-off of the WhatsApp chatbot "Hammabot", the events and tours in the context of the Long Night of the Museums (LNDM18), the installation of an information display screen "Backhus", and the start of an audio tour. The absence of peaks in the "relevant topic" line during the events in the gray-shaded vertical sections indicates that the effects of the interventions in Domplatz had no noticeable urban-scale impact on the discursive space.

Despite not being able to distinguish peaks for the expected interventions in Domplatz, large-scale events can easily be spotted in Figure 3. Based on the presence of peaks in the "relevant topic" line and/or the general line representing "all tweets", it was possible to detect three types of events: (1) Non-cultural events, such as the airport being closed, showed an impact on the general discourse but not on the cultural topic; they had citywide relevance but were not necessarily related to culture. (2) Purely cultural events, such as the UNESCO heritage report, showed a high impact on the cultural discourse ("relevant topic") but not on the overall discursive space of the city ("number of tweets"), which indicated the strong segregation of the discussion within the cultural cluster. (3) Events with both a cultural and a non-cultural impact, such as the Anti-Merkel demonstration, showed an impact on both the cultural discourse and the general discourse, indicating that discussions on the topic were not limited to the cultural cluster.

Discussion
Open and crowd-sourced data services are a popular source for describing the multiple dynamics hosted by urban spaces. A set of tools and methods in the sphere of digital urbanism is being used to theorize about the nature of cities and gain a broader understanding of places beyond their tangible topology. The present work explored two different data sources, Twitter and Instagram, aiming at gaining insights into a small-scale public open urban space and its social impact on an intangible discursive level. We argue that spaces with an exceptional symbolic load define landmarks and have a potentially major effect on the processes of identity generation and the development of a feeling of belonging to places, from individuals' perspectives [25]. The symbolic dimension of urban spaces, and more specifically the narratives developed around places, defines a powerful field of research that involves locations in the city and the discourses related to them. Domplatz being a square of high historical value, it is reasonable to assume that its symbolic load is equivalent: it is a landmark. Interestingly, the results described in Section 3.3 imply the opposite, opening up the discussion on the spatial symbolism of places.

Theoretical Implications
The research located in the intersection between tangible space and intangible dynamics of cities has been expanded in recent years, and this contribution adds content to the urban knowledge base. As explained in the first section of this contribution, several researchers considered cities as "hybrid" systems where the immaterial overlays the tangible. The relation between tangible and discursive spaces in the city was addressed by Zukin's [26] "symbolic economy", which speaks of representational agents with a symbolic load beyond the physical spatial reality. According to Lefebvre [27], social-discursive and physical spaces are merged and shape the reality of the cities. Under the perspective of Soja [28], cities are an aggregation of behavioral events accumulated in time with a major human dimension. Space is socially constructed from its use and the experiences of individuals [4,27], and from individual narratives linked to specific locations [29].
In this context, Stewart [30] studied the way the "discursive space" is a key element in the production of space. She claimed that discourses shape urban spaces through disruption and discussion: through collective bargaining involving designers and the general public in active discussions, decisions are made and space is shaped accordingly. Drawing on the example of fin-de-siècle dynamics in the European context, coffeehouse discussions among intellectuals generated a social-discursive space, which eventually had a direct effect on physical space. In practice, urban spaces are built or modified according to what is said in public discussions, through public participation or public consultation. In parallel, urban interventions are also assessed and criticized by public discourse. In consequence, the physical and discursive spaces of the city are intertwined, as each always has an impact on the other.
In this piece of research, social media data are viewed through Stewart's prism of discursive-tangible interplay. Besides being information linked to a place in the city, the social networks utilized in this work are defined as a collection of individual perceptions, interactions, and behaviors [31] and the result of the socio-cultural practices of content-sharing that generate a certain discursive space. Underlying this theoretical approach is the logic of research on observable variables, which enables moving from theorization to actual spatial analysis: places contain social activity, and defining and observing the characteristics of this activity can provide information about socio-spatial dynamics and about the places themselves. In this particular research case, we focused on social activity in the discursive layers and its linkage to physical actions. In a broader context, the methodology of topic identification and tracking presented in this contribution can also be used beyond the cultural discourse of heritage places and be replicated for different topics and scales of impact. A potential field of implementation would be data mining for urban intervention assessment embedded in urban planning processes.

Implications for the Urban Practice
Social media data can contribute to the normative processes of traditional urban planning, such as land-use definition, by matching the temporal pulse of activity in social media [32], or by identifying the main city landmarks as the main hotspots of activity to characterize urban landscapes [33].
Several approaches using NLP and topic clustering methods have already been implemented in urban planning processes, specifically in public participation and consultation phases. The "Digital Participation System" (DIPAS) initiative utilizes advanced techniques to identify topics from large texts submitted by participants in consultations about the development of an area [34,35]. The integration of NLP techniques in public consultation could also be deployed to assess the impact of a spatial intervention by directly asking participants. However, social-media-based data inputs could be employed to mitigate the main limitations of collaborative planning and public consultation: their main biases (subjects knowing that they are being studied and intending to produce a certain outcome) are precluded by the fact that reality mining is "honest", in the sense that the sample has no knowledge of being a sample at all.
Social media data establish a link between information generated by people and the spatial context in which the information is posted. The integration of GPS traces in social media interactions has generated a new concept of platform, beyond Web 2.0, in which geosocial networking empowers communication and interaction between users, and between users and the physical environment. Socio-spatial analytics are thus performed through the study of the digital traces that constitute the intangible and discursive layers of information generated by social interactions, activities, and practices, overlapping the physical boundaries of formal space.
Taking the aforementioned approaches into consideration, it follows that a hybrid sociotechnical approach to urban design and development, in terms of public participation and consultation, is needed and should become widespread, as the combination of advanced, data-driven technology and user-centric co-creation approaches can strongly support urban development and place-making activities. The new demands arising from the combination of data-driven urban development (information richness, quick communication, and feedback) and a socio-psychological perspective (moderated debate, a sociable setting, a facilitative space) have become explicit. Successful citizen participation depends on transparency, usability, and broad outreach, and these qualities can be achieved by digital means of information processing, reality augmentation, and data analysis. These technologies and approaches, if seamlessly integrated, can become standards for urban planning and development.

Temporal Activity Patterns
The consistency of trends in the data and their proximity to routine pattern representations from other disciplines (e.g., household energy consumption or urban traffic flow) suggest the validity of the data collection and cleaning methods, from the conceptual perspective of observing digital traces in virtual space as indicators of social dynamics occurring offline. The results show that topics can be tracked: topics detected in one data set through the combination of NLP preprocessing and LDA modeling can be recovered by an LDA model trained on a second data set, allowing the topics of the first data set to be followed into the second. By applying this procedure to our 16 weeks of data, we were able to identify a cultural topic in every week and observe the presence of the cultural narrative in the discursive space of the city. Ultimately, we were able to observe its continuous evolution over time and distinguish specific events with an impact on the discursive layer of urban space.
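The tracking step described above can be sketched as follows. This is a minimal illustration using scikit-learn in place of whatever toolkit the study actually used; the two weekly corpora (`week1_texts`, `week2_texts`), the number of topics, and all parameters are hypothetical, not the paper's.

```python
# Sketch of cross-data-set topic tracking with LDA (assumption: scikit-learn
# stands in for the study's modeling toolkit; corpora are invented examples).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

week1_texts = [
    "exhibition opening at the museum tonight",
    "concert and theatre festival in the city centre",
    "traffic jam on the main road this morning",
    "roadworks cause delays near the station",
]
week2_texts = [
    "new art exhibition announced for the cathedral square",
    "heavy traffic expected due to roadworks",
]

# Fit the vocabulary and the LDA model on the first week's data.
vectorizer = CountVectorizer(stop_words="english")
X1 = vectorizer.fit_transform(week1_texts)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X1)

# Track the same topics in the second week: reuse the fitted vocabulary
# and model, and inspect the per-document topic distributions.
X2 = vectorizer.transform(week2_texts)
doc_topics = lda.transform(X2)  # shape (n_docs, n_topics); each row sums to 1
for doc, dist in zip(week2_texts, doc_topics):
    print(f"{dist.round(2)}  {doc}")
```

Reusing the fitted vectorizer and model is what makes the topics comparable across data sets; refitting on the second week would produce a different, unaligned topic space.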

Detecting and Tracking the Cultural Discourse
The fact that culture-related events were detected as peaks, together with the high topic detection success rate of 87.5%, indicates a successful methodology. Cultural and non-cultural events were indeed detected in either the relevant topic or the general discourse, and we were able to distinguish three event categories: (1) non-cultural events, (2) only-cultural events, and (3) events with both a cultural and a non-cultural impact.
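The three-way categorization can be illustrated with a simple peak-detection sketch. This is not the paper's statistic: the weekly counts are invented, and flagging values above one standard deviation over the mean is an assumed stand-in for whatever peak criterion the study applied.

```python
# Illustrative sketch (hypothetical data and threshold): flag weeks whose
# activity exceeds mean + one standard deviation, then classify events by
# which series peaked: the cultural topic, the general discourse, or both.
from statistics import mean, stdev

def find_peaks(series, k=1.0):
    """Return indices where the value exceeds mean + k * stdev."""
    m, s = mean(series), stdev(series)
    return {i for i, v in enumerate(series) if v > m + k * s}

# Invented weekly post counts for the cultural topic and the rest of
# the discourse over eight weeks.
cultural = [40, 42, 41, 95, 43, 40, 44, 41]          # peak in week 3
general = [300, 310, 305, 300, 308, 500, 302, 505]   # peaks in weeks 5 and 7

cult_peaks = find_peaks(cultural)
gen_peaks = find_peaks(general)

for week in sorted(cult_peaks | gen_peaks):
    if week in cult_peaks and week in gen_peaks:
        label = "cultural and non-cultural impact"
    elif week in cult_peaks:
        label = "only-cultural"
    else:
        label = "non-cultural"
    print(f"week {week}: {label}")
```

With these numbers, week 3 is classified as only-cultural and weeks 5 and 7 as non-cultural, mirroring the three categories named above.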

Did the Interventions in Domplatz Make an Impact on the Urban Scale?
While it is true that some events with cultural value were recognized, the main objective of this work concerned the small-scale public space of Domplatz. The recognition of location-specific events that influence the discursive space of the city at the urban scale remains unresolved. As the statistics show, no distinguishable peaks are found before or after our target interventions. This implies that the spatial interventions in Domplatz could not be detected because of their lack of impact in the discursive space of social media at the urban scale: they are small local interventions that are overshadowed by major events (e.g., the publication of the UNESCO heritage report, a citywide demonstration, or the Eurovision contest), which stand out instead.

Limitations and Future Directions
It was noticeable that although bots and automated posting services were supposed to be eliminated in a preliminary filtering step, many of them got through. Our preliminary filtering of posts generated by automated accounts set a maximum number of tweets posted by a single user in order for the user to be considered in our study, and we also eliminated duplicate messages posted by the same user. However, we found multiple posts containing the same text, or portions of the same text, posted by different users, implying that multiple automated posting accounts were connected to the same service. Most of these messages were not repeated verbatim but shared text fragments (e.g., "The weather forecast for tomorrow is sunny" versus "The weather forecast for tomorrow is rainy" or "The weather forecast for today is sunny"). For deeper filtering, more sophisticated methods can be implemented, such as the Levenshtein distance (a measure of how similar two text strings are) and/or user-specific metrics, such as the ratio of followers to followed accounts. Implementing deeper and more detailed cleaning of the data sets prior to the NLP pipeline would significantly reduce the number of bots.
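A minimal sketch of the Levenshtein-based filtering proposed above might look as follows. The posts, the user names, and the 0.2 similarity threshold are all hypothetical; the edit-distance function itself is the standard dynamic-programming formulation.

```python
# Sketch of deeper bot filtering: a normalized Levenshtein distance flags
# near-duplicate posts coming from *different* users (hypothetical data).
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def near_duplicate(a: str, b: str, threshold: float = 0.2) -> bool:
    """True if the edit distance is small relative to the longer text."""
    return levenshtein(a, b) / max(len(a), len(b), 1) <= threshold

posts = [
    ("user_a", "The weather forecast for tomorrow is sunny"),
    ("user_b", "The weather forecast for tomorrow is rainy"),
    ("user_c", "Great concert at the cathedral square tonight!"),
]

# Keep only posts that are not near-duplicates of an earlier post by a
# different user (a simple heuristic for coordinated automated posting).
kept = []
for user, text in posts:
    if not any(u != user and near_duplicate(t, text) for u, t in kept):
        kept.append((user, text))
print([u for u, _ in kept])  # user_b's post is dropped as a near-duplicate
```

Normalizing by text length is what lets the filter catch the "sunny"/"rainy" template variants in the example above while leaving genuinely different posts untouched; in a real pipeline this check would be combined with the follower-ratio metrics mentioned above.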
Further, a mixed approach integrating the methods described in this research within normative urban planning processes could help overcome some of the biases of public consultation and public participation mentioned in Section 4.2. Such an integrated approach would open a new field of action in which space-makers, space-users, and stakeholders actively produce socio-spatial information, which can then be merged with data retrieved from passive digital sources, ensuring the integrity of the social information and a deeper integration of its social sources.