Data Descriptor

An Urban Image Stimulus Set Generated from Social Media

1 Department of Advertising and Public Relations, Michigan State University, East Lansing, MI 48824, USA
2 Centre of Geographical Studies, Institute of Geography and Spatial Planning, University of Lisbon, 1600-276 Lisbon, Portugal
3 Institute of Physiology, Lisbon School of Medicine, University of Lisbon, 1649-004 Lisbon, Portugal
4 Institute of Molecular Medicine, University of Lisbon, 1649-004 Lisbon, Portugal
5 Centre of Geographical Studies, Associate Laboratory TERRA, Institute of Geography and Spatial Planning, University of Lisbon, 1600-276 Lisbon, Portugal
* Author to whom correspondence should be addressed.
Data 2023, 8(12), 184; https://doi.org/10.3390/data8120184
Submission received: 3 October 2023 / Revised: 20 November 2023 / Accepted: 24 November 2023 / Published: 1 December 2023

Abstract

Social media data, such as photos and status posts, can be tagged with location information (geotagging). This geotagged information can be used for urban spatial analysis to explore neighborhood characteristics or mobility patterns. With increasing rural-to-urban migration, there is a need for comprehensive data capturing the complexity of urban settings and their influence on human experiences. Here, we share an urban image stimulus set from the city of Lisbon that researchers can use in their experiments. The stimulus set consists of 160 geotagged urban space photographs extracted from the Flickr social media platform. We divided the city into 100 × 100 m cells to calculate the cell image density (number of images in each cell) and the cell green index (Normalized Difference Vegetation Index of each cell) and assigned these values to each geotagged image. In addition, we computed the popularity of each image (normalized views on the social network). Finally, we categorized these images into two putative groups by photographer status (residents and tourists), with 80 images belonging to each group. With the rise in data-driven decisions in urban planning, this stimulus set helps explore human–urban environment interaction patterns, especially if complemented with survey/neuroimaging measures or machine-learning analyses.
Dataset: The urban image stimulus set is available for download on the Open Science Framework (OSF) (https://osf.io/c79w5/, accessed on 23 November 2023).
Dataset License: The urban image stimulus set is being made available under the CC0 1.0 Universal license.

1. Summary

Social media have become an integral part of people’s everyday lives. More than four billion individuals use social media globally, and it is anticipated that this number will grow to almost six billion by 2027 [1]. The widespread use and influence of social media have drawn the interest of researchers from diverse areas, leading to an increase in scientific studies [2,3,4]. Social media platforms like Facebook, Instagram, X/Twitter, and TikTok provide rich and valuable data sources in the form of text, photos, videos, likes, comments, shares, and other interactions [5]. Some visual media-sharing platforms, like Flickr, also offer the possibility for users to upload, store, and share photo and video data with other users [6,7]. Moreover, certain platforms (Facebook, X/Twitter, etc.) also allow users to add geographical location tags to their posted content to indicate the user’s location and/or where the content was captured. Since our usage and dependency on mobile devices continue to rise, people have become a crucial source for collecting, retrieving, sharing, and disseminating various types of information.
Geotagged social media data also form a significant aspect of volunteered geographic information (VGI), which is a concept that depends on the voluntary contributions of individuals who are willing to share location-based information through digital platforms [8,9,10]. Within VGI, there is a subset that falls in the category of citizen science, involving non-professional scientists or citizens in data collection and, to some extent, analysis [11]. Social media users also serve as passive contributors to citizen science initiatives (a subset of VGI) when they share geotagged content that has the potential to be utilized in scientific research [12]. Crowdsourcing geotagged social media data can be a valuable technical and methodological tool for researchers and analysts in various fields, including environmental studies, psychology, transport and urban planning, tourism, behavioral economics, and other social sciences [7,11,13,14].
Geotagged social media content plotted on maps allows for visualizing the distribution and concentration of these data across different locations, giving insights into preferences (most sought-after locations), popular activities, interests, and emerging trends [13,15,16]. Furthermore, the number of views and “likes” that these geotagged media receive often indicates their popularity on social media, which can help researchers understand both online and offline social behavior [17]. For instance, Tenkanen et al. [18] compared the monthly official visitor statistics in 56 recreational protected areas (national parks) in 2014 to the visitation metric derived from Flickr, Instagram, and Twitter (currently known as X). The official visitor statistics comprised monthly data from installed electronic counters and sold entrance tickets, and the number of social media users posting content from these national parks aggregated for each month was the measure of social media-derived visitation statistics. The results showed a consistent relationship between monthly official visitor counts and social media-derived visitation statistics across all three platforms combined. Thus, geotagged social media data can serve as a valuable source of information for monitoring visitor numbers and gaining insights into a place’s popularity and visiting patterns over time.
It is important to highlight that global urbanization has been rapidly increasing, with 57% of the world’s population currently living in urban areas [19], and, according to the United Nations, this percentage will reach 68% by 2050 [20]. Consequently, numerous studies have emerged that link the well-being of individuals to the urban environment that they live in [21,22,23]. A recent study spanning 60 developed countries and including 230 million people established a positive link between urban green space and a nation’s happiness level [24]. This study’s urban green space amount was measured using the Normalized Difference Vegetation Index (NDVI) computed from high-resolution satellite images for different countries. Geotagged data can also be employed at a regional and community level by analyzing neighborhood characteristics such as demographics and environmental factors to improve the quality of life for individuals. Another study by Stier and colleagues [25] used geotagged Twitter (currently known as X) datasets to identify words related to depressive symptoms in users’ tweets. They found that larger urban areas in the US with denser socioeconomic network connections had lower rates of depression. Geotagged data can thus provide vital insights into the impact of the urban environment on mental health and well-being [26]. In sum, geotagged data can play an important role in urban planning and development initiatives, enabling planners to make data-driven decisions to improve urban livability and sustainability and help policymakers to derive policy recommendations tailored to local needs.
To facilitate research in the domains mentioned above, we created a rich stimulus set of 160 urban space images of Lisbon. We sourced these images from the Flickr social media platform, and all images were geotagged, with a linked owner identification tag and upload date. We divided the city into 100 × 100 m cells to calculate the cell image density (number of images in each cell) and cell green index (Normalized Difference Vegetation Index for each cell). Then, we assigned these values to each image based on its geotagged information. In addition, we provide the popularity of each image (normalized views on the social network). Finally, we categorized these images into two putative groups by photographer status (residents and tourists), with 80 images belonging to each group.
In addition to our stimulus set, other researchers have created valuable datasets that include urban space images, such as Placepulse [27], Cityscapes [28], and ADE20K [29]. Of note, Placepulse also contains geotagged images from Lisbon, similar to our stimulus set. However, this dataset does not explicitly provide several relevant metrics we computed, such as popularity or the amount of greenery associated with each image. Instead, it offers pairwise comparisons and rankings for each image based on attributes like safety, liveliness, and aesthetics. ADE20K and Cityscapes, on the other hand, do not have geotagged images, nor do they provide other specific metrics (for greenery or popularity of the image). We believe that our stimulus set (used alone or possibly in conjunction with other available urban space datasets) aligns with the collaborative nature of VGI and could provide a strong foundation for future studies, especially for exploring urban environment–human interaction patterns. This stimulus set can be a valuable tool for urban planners to better understand how people move across space through time and what sort of activities they perform through behavioral and neurocognitive measures, as well as for social media researchers to assess various aspects of online social behavior.

2. Data Description

We have provided this urban image stimulus set for download on the Open Science Framework (OSF) (https://osf.io/c79w5/, accessed on 23 November 2023). The stimulus set images are stored in the folder labeled “Urban image stimulus set” and are numbered from 1–160. The data also include an Excel sheet named “Urban image stimulus set variables” with columns that provide information on the variables associated with each image, namely the image number, owner tag, photo ID (PID), secret tag, category (presumed residents/tourists), bin number (one to eight, representing different ranges of the cell image density), cell image density, cell green index, number of views, normalized popularity, brightness, and contrast.

3. Methods

We developed our image stimulus set in a four-step process. The first step consisted of extracting bulk geotagged urban environment images from the Flickr social media platform (image extraction process). The second step involved determining four variables associated with the images extracted in the first step (image variable determination). The third step consisted of selecting images that depicted urban spaces with respect to our four variables (image selection process). The fourth and final step involved assessment of the brightness and contrast of the images selected in the third step (image brightness and contrast assessment). We adjusted any images that had brightness and contrast values beyond three standard deviations and did not discard any images during this step. We describe each of these steps below, in greater detail.

3.1. Step 1: Image Extraction Process

We obtained images of the city of Lisbon (Portugal) that were posted to the social media platform Flickr. To do this, we used the Flickr API and wrote a Python query to search for geotagged images by the date they were uploaded and the location where the image was taken. Figure 1 depicts these input parameters: an upload date range from 1 January 2016 to 29 September 2021 and a bounding box covering the boundaries of the Lisbon municipality [search coordinates in WGS84: −9.23, 38.69, −9.09, 38.80; search area: 158.43 km²]. This process resulted in 75,233 images taken in the city of Lisbon. For each image, we retrieved the geotagged location, information about the photographer (owner ID), and details related to the Flickr account (highest number of views, etc.); these extracted data were the basis for determining the variables explained in the subsequent step.
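The query described above can be sketched as follows. This is a minimal illustration, not the authors' script: it only builds the request parameters for the public `flickr.photos.search` method and does not contact the API. The bounding box and date range come from the text; the `extras` selection, the UTC interpretation of the dates, the page size, and the placeholder API key are assumptions.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"

def build_search_params(api_key, page=1):
    """Build the query parameters for one page of a flickr.photos.search call."""
    # Upload date range from the text; interpreting the dates as UTC is an assumption.
    min_upload = datetime(2016, 1, 1, tzinfo=timezone.utc)
    max_upload = datetime(2021, 9, 29, 23, 59, 59, tzinfo=timezone.utc)
    return {
        "method": "flickr.photos.search",
        "api_key": api_key,
        # bbox order: min longitude, min latitude, max longitude, max latitude (WGS84)
        "bbox": "-9.23,38.69,-9.09,38.80",
        "min_upload_date": int(min_upload.timestamp()),
        "max_upload_date": int(max_upload.timestamp()),
        "has_geo": 1,
        # Assumed extras: coordinates, view count, and upload date per photo.
        "extras": "geo,views,date_upload",
        "per_page": 250,
        "page": page,
        "format": "json",
        "nojsoncallback": 1,
    }

# "YOUR_API_KEY" is a placeholder; a real key is needed to send the request.
params = build_search_params(api_key="YOUR_API_KEY")
url = FLICKR_REST_ENDPOINT + "?" + urlencode(params)
```

Retrieving all results would mean repeating the request with increasing `page` values; how the authors paginated or split the query is not described in the text.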

3.2. Step 2: Image Variable Determination

In this step, we determined the following variables for each of the extracted images (see Figure 1).

3.2.1. Geotagging

All images extracted in Step 1 were geotagged. We, therefore, fetched each image’s associated latitude and longitude coordinates.

3.2.2. Normalized Popularity

We calculated this variable by dividing the number of views for each image by the highest number of views for any image that the photographer of the image had posted. In this way, we created a normalized index score for the popularity of each image that ranges from 0 to 1. For example, if the image in question had been seen by 100 people, and the photographer’s most popular image had been seen by 1000 people, then 100/1000 = 0.1 normalized popularity score.
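The normalization above amounts to a single division. The sketch below reproduces the worked example from the text; the function name and the input checks are illustrative assumptions rather than part of the published code.

```python
def normalized_popularity(image_views, owner_max_views):
    """Views of one image divided by the photographer's most-viewed image."""
    if owner_max_views <= 0:
        raise ValueError("owner_max_views must be positive")
    if not 0 <= image_views <= owner_max_views:
        raise ValueError("image_views must lie between 0 and owner_max_views")
    return image_views / owner_max_views

# Worked example from the text: 100 views against a maximum of 1000 views.
score = normalized_popularity(100, 1000)
```

Because the score is relative to each photographer's own maximum, it compares images within an account rather than across accounts with very different audiences.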

3.2.3. Cell Image Density

We divided the entire 158.43 km² search area of Lisbon into 100 × 100 m cells using the Fishnet method in ArcGIS 10.7. Using each image’s geotag data, we counted the extracted images in each 100 × 100 m cell within the geographical boundaries of the city of Lisbon. This provided a cell image density specific to each cell, which reflects the total number of geotagged photographs shared by Flickr users in the corresponding area during the specified time period. We then associated this cell image density with each image that was taken within that cell.
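As an illustration of the counting step, the sketch below assigns points to 100 × 100 m cells and attaches each cell's count to the images inside it. It is not the ArcGIS Fishnet workflow used by the authors: it assumes the coordinates have already been projected to a metric system (easting and northing in metres), and the grid origin is a hypothetical parameter.

```python
import math
from collections import Counter

CELL_SIZE_M = 100.0

def cell_index(x_m, y_m, origin_x_m=0.0, origin_y_m=0.0, cell_size_m=CELL_SIZE_M):
    """Return the (column, row) of the grid cell containing a projected point."""
    col = math.floor((x_m - origin_x_m) / cell_size_m)
    row = math.floor((y_m - origin_y_m) / cell_size_m)
    return col, row

def cell_image_density(points_m):
    """For each point, return the number of points that share its cell."""
    counts = Counter(cell_index(x, y) for x, y in points_m)
    return [counts[cell_index(x, y)] for x, y in points_m]

# Four hypothetical image locations in metres: two share a cell, two do not.
points = [(10.0, 20.0), (95.0, 99.0), (150.0, 20.0), (10.0, 250.0)]
densities = cell_image_density(points)
```

Every image in a cell receives the same density value, which matches the per-cell assignment described above.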

3.2.4. Cell Green Index

We quantified the overall greenness of the cell by using the Normalized Difference Vegetation Index (NDVI) [30]. NDVI is the ratio of the difference between near-infrared and red reflectance to their sum. Healthy vegetation (chlorophyll) reflects more near-infrared light than other wavelengths, but it absorbs more red light. Thus, NDVI ranges from −1.0 to 1.0, with larger positive values indicating green vegetation. Non-vegetated areas, including bare soil, open water, snow/ice, and most construction materials, have much lower NDVI values [31]. The NDVI is preferred to the simple index for global vegetation monitoring because it compensates for changing illumination conditions, shadows, surface slope, and aspect, among other factors. NDVI has achieved good results in detecting green cover, monitoring land surfaces and vegetation canopies, estimating leaf area index, estimating grass cover vegetation biomass, and quantifying the percentage of grass cover [32]. We employed satellite imagery from the Sentinel-2 data source and created an NDVI map for the city of Lisbon in ArcGIS. To ensure accuracy, we created a synthesis map for each year (2016–2021) by averaging all months of that year. This provided a comprehensive overview of vegetation changes in Lisbon over a six-year period. Year-specific maps matter because land cover changes over time: for instance, if an image taken in 2016 shows an empty spot, but a building is constructed on that same spot by 2020, using the NDVI from 2020 would not give an accurate reflection of the vegetation from 2016. We overlaid each map onto a grid of 100 × 100 m cells, and each cell and image within it received the same green index value for the year it was photographed.
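The index itself is NDVI = (NIR − Red) / (NIR + Red). A minimal sketch of the per-pixel computation and a per-cell average follows. The averaging of pixel values within a cell is an assumption, since the text does not say how pixel-level NDVI was aggregated to a 100 × 100 m cell, and the reflectance values are made up for illustration.

```python
import math

def ndvi(nir, red):
    """NDVI for one pixel from near-infrared and red reflectance."""
    total = nir + red
    if total == 0:
        return float("nan")  # undefined when both bands are zero
    return (nir - red) / total

def cell_green_index(pixels):
    """Mean NDVI over (nir, red) pixel pairs in one cell (assumed aggregation)."""
    values = [ndvi(nir, red) for nir, red in pixels]
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else float("nan")

# Hypothetical reflectances: a vegetated pixel and a built-up pixel.
vegetated = ndvi(0.5, 0.1)   # strong near-infrared reflection, red absorbed
built_up = ndvi(0.2, 0.2)    # similar reflectance in both bands
mixed_cell = cell_green_index([(0.5, 0.1), (0.2, 0.2)])
```

In the workflow described above, such a value would be computed from the synthesis map of the year in which the image was photographed.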

3.3. Step 3: Image Selection Process

Past research has focused on understanding the dynamics of spatial interactions between residents and tourists and their implications on urban planning [33,34]. Thus, to expand the scope of research that can be performed utilizing these images, we divided the images into two categories: images photographed by residents and images photographed by tourists. The following paragraphs describe our approach in performing this segregation and the successive steps to select the relevant images depicting the urban spaces of Lisbon.
We segregated the 75,233 images obtained in Step 1 into two categories (presumed residents or tourists) based on data from the photographer’s Flickr account (see Figure 2). We assigned an image to the presumed resident category if the photographer’s Flickr account uploaded images within our defined geographical boundaries in the city of Lisbon for more than three consecutive months. Conversely, we assigned an image to the presumed tourist category if the photographer’s Flickr account uploaded images for less than three consecutive months. Of note, according to data published by the National Institute of Statistics of Portugal (NISP) in 2020 [35], tourists stay an average of around three nights in Portugal. Importantly, NISP provides this average stay without a standard deviation (s.d.), so we could not calculate a data-driven threshold (e.g., average + 3 s.d.) to improve our residents vs. tourists categorization. Given the missing s.d., we adopted a conservative categorization threshold of three months. We acknowledge, however, that some residents’ photographs will likely be labeled as tourists’ (and vice versa). For this reason, we use the term “presumed” residents and tourists throughout this manuscript. In addition, it is important to clarify that we only considered the number of unique days on which a Flickr user posted images, rather than the total quantity of images they posted. Please see Figure 2 for a depiction of the cell image density distribution for the 75,233 images, as well as the images we categorized as photographed by presumed residents (26,585 images) and presumed tourists (48,648 images).
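One possible reading of this rule is sketched below, purely for illustration. It takes the unique upload days of one account, finds the longest run of consecutive calendar months containing at least one upload day, and labels the account a presumed resident when that run exceeds three months. Both the calendar-month interpretation and the handling of a run of exactly three months (labeled tourist here) are assumptions, because the text does not specify them.

```python
from datetime import date

def longest_month_run(upload_days):
    """Length of the longest run of consecutive calendar months with uploads."""
    months = sorted({d.year * 12 + (d.month - 1) for d in upload_days})
    if not months:
        return 0
    best = run = 1
    for previous, current in zip(months, months[1:]):
        run = run + 1 if current == previous + 1 else 1
        best = max(best, run)
    return best

def presumed_category(upload_days, threshold_months=3):
    """Label an account from its unique upload days (assumed interpretation)."""
    if longest_month_run(upload_days) > threshold_months:
        return "resident"
    return "tourist"

# Hypothetical accounts: one active across five months, one active for a week.
long_term = [date(2018, 11, 3), date(2018, 12, 9), date(2019, 1, 15),
             date(2019, 2, 2), date(2019, 3, 20)]
short_visit = [date(2019, 7, 1), date(2019, 7, 2), date(2019, 7, 5)]
labels = (presumed_category(long_term), presumed_category(short_visit))
```

Working from a set of days rather than a list of photos mirrors the statement that only unique posting days were considered, not the number of images.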
We next took further steps to refine our stimulus set by independently handling images belonging to the presumed resident and tourist categories. All the analyses were conducted using MATLAB R2022a. Firstly, we divided the images in each category into eight bins based on cell image density (Figure 3A). Next, we aimed to select 10 relevant images depicting Lisbon’s urban spaces for each bin in each category, which would result in 160 total images (80 images for tourists and 80 images for residents). Importantly, we intended to select an image set where our key image variables (popularity, cell image density, and cell green index) were orthogonal to each other. We did this to help researchers avoid collinearity effects when conducting future investigations with these image variables. We achieved this with an iterative image selection process involving first plotting all images in a bin and then subjecting individual images within the bin to an in-house algorithm (Figure 3B). To explain in more detail, we first created a bivariate histogram with all images in a single bin plotted according to their popularity and cell green index. Second, we divided the histogram into three equal parts along the cell green index axis. We decided to divide this axis into thirds of its original divisions after analyzing the histogram’s distribution across all bins in both categories. For example, if the original divisions in the histogram along the cell green index for a particular bin were 33, they were divided into three parts, 11 divisions each, for further steps. To note, before proceeding with this three-part division, there were a small number of bins where the number of original divisions along this axis was even, so we replotted the histogram by modifying the original division number to the following odd number. Next, we randomly selected one image from each part, obtaining three images in one complete iteration. 
We then ran four iterations of this selection process to retrieve 12 distributed images from the bin we were working on. At this step, we visually inspected the 12 images to confirm that they depicted urban spaces in Lisbon. If some images were visually not appropriate (e.g., a close-up image of a face), we removed them and, if needed, ran more iterations until we obtained 10 visually relevant images for each bin. In a few bins with a high percentage of visually irrelevant images, we had to manually select some images after our algorithm reached saturation (repeatedly selecting irrelevant images). In these bins, our algorithm selected ~70% of the final images. We stopped the process when we obtained 10 images for each bin.
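The random part of this procedure can be sketched as a stratified draw. The code below is a simplified stand-in for the authors' MATLAB algorithm (available from the OSF repository), not a port of it: it splits one bin's images into three equal-width parts along the cell green index, draws one image per part in each iteration, and repeats for four iterations. The dictionary keys, the seed, and the synthetic data are assumptions; the visual inspection step is manual and not represented.

```python
import random

def split_into_thirds(images, key="green_index"):
    """Split images into three equal-width parts along one variable."""
    values = [img[key] for img in images]
    low, high = min(values), max(values)
    width = (high - low) / 3 or 1.0  # avoid division by zero if all values match
    parts = [[], [], []]
    for img in images:
        index = min(int((img[key] - low) / width), 2)
        parts[index].append(img)
    return parts

def select_candidates(images, iterations=4, seed=None):
    """Draw one not-yet-chosen image from each part, once per iteration."""
    rng = random.Random(seed)
    parts = split_into_thirds(images)
    chosen_ids = set()
    selected = []
    for _ in range(iterations):
        for part in parts:
            remaining = [img for img in part if img["id"] not in chosen_ids]
            if remaining:
                pick = rng.choice(remaining)
                chosen_ids.add(pick["id"])
                selected.append(pick)
    return selected

# Synthetic bin of 30 images with evenly spread green index values.
bin_images = [{"id": i, "green_index": i / 29} for i in range(30)]
candidates = select_candidates(bin_images, iterations=4, seed=0)
```

Drawing equally from low, medium, and high greenness within every density bin is what spreads the selected images across the variable space and helps keep the variables uncorrelated.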
To validate our selection process and confirm orthogonality between variables, we used a MATLAB-based robust correlation toolbox [36] to conduct three different types of correlations (Pearson, percentage bend, and Spearman) on the entire set of 160 images (Figure 4). We also conducted these correlations on the presumed residents and tourists category sets of 80 images each (Table 1). All correlations were weak and non-significant (p > 0.05), confirming the orthogonality between the output variables of popularity, cell image density, and cell green index for the entire stimulus set (and also for each residents and tourists category sets of images).
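For readers who want to repeat a similar check outside MATLAB, the sketch below computes Pearson and Spearman coefficients in plain Python. It is not the robust correlation toolbox cited above: the percentage bend correlation and the significance tests are omitted, and the example values are invented.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Invented values: a monotonic but non-linear relationship.
popularity = [1.0, 2.0, 3.0, 4.0, 5.0]
green_index = [1.0, 4.0, 9.0, 16.0, 25.0]
r_pearson = pearson(popularity, green_index)
r_spearman = spearman(popularity, green_index)
```

Applied to the three variables in the stimulus set, coefficients near zero for each pair would be consistent with the orthogonality reported above.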

3.4. Step 4: Selected Images’ Brightness and Contrast Assessment

As part of the selection process, we also assessed the brightness and the root mean square (RMS) contrast values for the 160 images [37,38]. For each image, we calculated the brightness by averaging the image’s RGB pixel values, and the RMS contrast as the standard deviation of those pixel values, as follows:
$$B_{avg} = \frac{1}{N} \sum_{k=1}^{N} L_k$$
$$C_{rms} = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \left( L_k - B_{avg} \right)^2}$$
where $N$ is the total number of pixels, $k$ is the pixel index, $L_k$ is the pixel value, $B_{avg}$ is the average brightness value of an image, and $C_{rms}$ is the RMS contrast of an image. We then estimated the mean and standard deviation of these values in the entire set of 160 images and in the presumed residents and tourists category sets of 80 images each to check whether they lay within three standard deviations of the mean. Barring the RMS contrast of one image belonging to the presumed residents category, all the images’ brightness and contrast values satisfied this criterion. After reducing the contrast value of this image by 12%, all values were within three standard deviations of the mean for the entire set and for the separate sets of presumed residents and tourists categories.
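A minimal sketch of these two measures and of the three-standard-deviation check is given below. It treats each pixel value as the mean of its R, G, and B channels, which is one reading of "averaging the RGB pixel values", and it uses the population standard deviation for the outlier check. Both choices are assumptions, as are the function names and the sample data.

```python
import math

def pixel_values(rgb_pixels):
    """Per-pixel value as the mean of the R, G and B channels (assumed)."""
    return [(r + g + b) / 3 for r, g, b in rgb_pixels]

def brightness_and_rms_contrast(values):
    """Return (mean pixel value, RMS contrast) for one image."""
    n = len(values)
    b_avg = sum(values) / n
    c_rms = math.sqrt(sum((v - b_avg) ** 2 for v in values) / n)
    return b_avg, c_rms

def outside_three_sd(measures):
    """Flag entries more than three standard deviations from the set mean."""
    n = len(measures)
    mean = sum(measures) / n
    sd = math.sqrt(sum((m - mean) ** 2 for m in measures) / n)
    return [abs(m - mean) > 3 * sd for m in measures]

# Hypothetical two-pixel image: one black pixel and one dark gray pixel.
image = pixel_values([(0, 0, 0), (2, 2, 2)])
b_avg, c_rms = brightness_and_rms_contrast(image)

# Hypothetical contrast values for 21 images, one of them extreme.
flags = outside_three_sd([10.0] * 20 + [100.0])
```

An image flagged this way would be a candidate for the kind of contrast adjustment described above, after which the check can be repeated on the updated values.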

4. User Notes

We used the social media platform Flickr to create an urban image stimulus set of the city of Lisbon. This image set is a unique and valuable resource that can support research in a variety of fields, such as social media use and urban planning. The dataset comprises 160 geotagged urban space images, which are categorized into 80 images by presumed residents and 80 images by presumed tourists. In addition to latitude and longitude information, we created variables/attributes associated with these images, such as cell image density, cell green index, and normalized popularity. We conducted robust correlations among these output variables to ensure their orthogonality in both the overall dataset and the category subsets. Further, the brightness and contrast values for the cumulative and within-category image sets are within three standard deviations from the mean.
The presented urban image stimulus set is helpful for studying and comprehending different facets of urban settings, such as people’s perceptions, preferences, and behaviors. For example, the stimulus set can be used to assess urban perception by collecting ratings of valence, arousal, feelings of safety, etc. The images can also be combined with personality and mental health questionnaires. By leveraging this combined methodology, researchers can better understand the complex dynamics between physical spaces, individual characteristics, and digital behavior linked to urban spaces. For instance, researchers can use the stimulus set to assess how much digital popularity (measured by normalized popularity) overlaps with real-life location popularity (based on people’s preferences and perceptions of the urban environment). This image set can also be employed in neuroimaging experiments using techniques such as functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) [39]. This approach enables researchers to uncover the underlying neural mechanisms and cognitive processes involved in perceiving urban spaces in Lisbon. For example, Chang et al. [40] used fMRI to shed light on the neural mechanisms contributing to the positive mental health outcomes associated with green spaces. In another study, Olszewska-Guizzo et al. [41] used EEG to examine the linkage between urban green spaces and mental health outcomes. Our stimulus set includes a metric for measuring greenness called the cell green index and other measurements that offer possibilities for investigating similar and varied neuroimaging-based research questions.
Finally, researchers can combine data from neuroimaging experiments conducted using the urban image stimulus set with machine learning algorithms and explore promising ways to identify patterns and predict future trends across various domains, including urban planning, social media, and interventions in clinical populations. However, it is important to acknowledge that while our set of stimuli covers a significant expanse of 158.43 km² within Lisbon, other researchers can develop a similar set of stimuli encompassing a diverse range of geographical regions for their respective studies. Further, it should be noted that we divided the Lisbon region into cells of 100 × 100 m, and all images within each cell were assigned the same cell green index and cell image density. Therefore, while a particular cell may have a high level of greenery overall, certain images within that cell may contain less or no green coverage.
It is important to note that the urban stimulus set is limited to a specific time frame and is therefore static in nature. Future research holds the potential to enhance our understanding of urban dynamics comprehensively by integrating dynamic data sources and leveraging the capabilities of both community-based geoportals and Spatial Data Infrastructures (SDIs). These combined contributions form a holistic approach that benefits both localized and broader perspectives on the urban environment [42,43]. In addition, it is important to encourage future research in developing stimulus sets specifically tailored to rural areas. This will enable valuable comparisons with urban settings and help uncover unique challenges and opportunities. By doing so, targeted interventions can be created to improve rural life and foster sustainable development. Despite the above points, our urban image stimulus set can be very useful to researchers and contribute to developing innovative solutions. Researchers can tackle complex environmental and social science challenges by combining this stimulus set with self-reported evaluations, neuroimaging methods, and advanced machine learning algorithms. Additionally, the output variables included in this set allow for analysis of the intricate relationship between human behavior, decision-making, social dynamics, and the environment.

Author Contributions

A.K. contributed to stimulus set creation, created stimulus selection methodology, wrote the original draft, and revised the manuscript. A.L.R. contributed to stimulus set creation and wrote and edited the manuscript. S.H. contributed to stimulus set creation. D.A.B.-M. contributed to stimulus set creation and edited the manuscript. B.M. conceptualized and supervised stimulus set creation, secured funding, and edited the manuscript. P.M. conceptualized and supervised stimulus set creation, secured funding, and wrote and edited the manuscript. D.M. conceptualized and supervised stimulus set creation, secured funding, and wrote and edited the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This work is a part of the eMOTIONAL Cities project, which received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 945307.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

We have provided this urban image stimulus set and the MATLAB code for the algorithm utilized for stimulus set selection for download on the Open Science Framework (OSF) (https://osf.io/c79w5/, accessed on 23 November 2023).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Statista 2023. Available online: https://www.statista.com/statistics/278414/number-of-worldwide-social-network-users/ (accessed on 29 August 2023).
  2. Meshi, D.; Tamir, D.I.; Heekeren, H.R. The Emerging Neuroscience of Social Media. Trends Cogn. Sci. 2015, 19, 771–782.
  3. Kapoor, K.K.; Tamilmani, K.; Rana, N.P.; Patil, P.; Dwivedi, Y.K.; Nerur, S. Advances in Social Media Research: Past, Present and Future. Inf. Syst. Front. 2017, 20, 531–558.
  4. Karim, F.; Oyewande, A.; Abdalla, L.F.; Ehsanullah, R.C.; Khan, S. Social Media Use and Its Connection to Mental Health: A Systematic Review. Cureus 2020, 12, e8627.
  5. Statista 2023. Available online: https://www.statista.com/statistics/272014/global-social-networks-ranked-by-number-of-users/ (accessed on 27 October 2023).
  6. Kennedy, L.; Naaman, M.; Ahern, S.; Nair, R.; Rattenbury, T. How flickr helps us make sense of the world: Context and content in community-contributed media collections. In Proceedings of the 15th ACM International Conference on Multimedia, Augsburg, Germany, 25–29 September 2007.
  7. Hollenstein, L.; Purves, R.S. Exploring place through user-generated content: Using Flickr tags to describe city cores. J. Spat. Inf. Sci. 2010, 1, 21–48.
  8. Goodchild, M.F. Citizens as sensors: The world of volunteered geography. GeoJournal 2007, 69, 211–221.
  9. Sui, D.; Goodchild, M. The convergence of GIS and social media: Challenges for GIScience. Int. J. Geogr. Inf. Sci. 2011, 25, 1737–1748.
  10. Anselin, L.; Williams, S. Digital neighborhoods. J. Urban. Int. Res. Placemaking Urban Sustain. 2015, 9, 305–328.
  11. Haklay, M. Citizen Science and Volunteered Geographic Information: Overview and Typology of Participation. In Crowdsourcing Geographic Knowledge: Volunteered Geographic Information (VGI) in Theory and Practice; Sui, D., Elwood, S., Goodchild, M., Eds.; Springer: Dordrecht, The Netherlands, 2013; pp. 105–122. ISBN 9789400745872.
  12. Edwards, T.; Jones, C.B.; Perkins, S.E.; Corcoran, P. Passive citizen science: The role of social media in wildlife observations. PLoS ONE 2021, 16, e0255416.
  13. Kisilevich, S.; Krstajic, M.; Keim, D.; Andrienko, N.; Andrienko, G. Event-Based Analysis of People’s Activities and Behavior Using Flickr and Panoramio Geotagged Photo Collections. In Proceedings of the 2010 14th International Conference Information Visualisation (IV), London, UK, 26–29 July 2010; pp. 289–296.
  14. Yin, J.; Chi, G. Characterizing People’s Daily Activity Patterns in the Urban Environment: A Mobility Network Approach with Geographic Context-Aware Twitter Data. Ann. Assoc. Am. Geogr. 2021, 111, 1967–1987.
  15. Ratti, C.; Claudel, M. The City of Tomorrow; Yale University Press: New Haven, CT, USA, 2016.
  16. Girardin, F.; Calabrese, F.; Fiore, F.D.; Ratti, C.; Blat, J. Digital Footprinting: Uncovering Tourists with User-Generated Content. IEEE Pervasive Comput. 2008, 7, 36–43.
  17. Highsnobiety. Available online: https://www.highsnobiety.com/p/social-media-impact-popularity/ (accessed on 19 May 2017).
  18. Tenkanen, H.; Di Minin, E.; Heikinheimo, V.; Hausmann, A.; Herbst, M.; Kajala, L.; Toivonen, T. Instagram, Flickr, or Twitter: Assessing the usability of social media data for visitor monitoring in protected areas. Sci. Rep. 2017, 7, 17615. [Google Scholar] [CrossRef] [PubMed]
  19. Statista 2022. Available online: https://www.statista.com/statistics/270860/urbanization-by-continent/ (accessed on 7 November 2023).
  20. United Nations, Department of Economic and Social Affairs, Population Division. World Urbanization Prospects: The 2018 Revision (ST/ESA/SER.A/420); United Nations: New York, NY, USA, 2019. [Google Scholar]
  21. Mouratidis, K. Built environment and social well-being: How does urban form affect social life and personal relationships? Cities 2018, 74, 7–20. [Google Scholar] [CrossRef]
  22. Sadeghi, A.R.; Ebadi, M.; Shams, F.; Jangjoo, S. Human-built environment interactions: The relationship between subjective well-being and perceived neighborhood environment characteristics. Sci. Rep. 2022, 12, 21844. [Google Scholar] [CrossRef]
  23. Kabisch, N.; Qureshi, S.; Haase, D. Human–environment interactions in urban green spaces—A systematic review of contemporary issues and prospects for future research. Environ. Impact Assess. Rev. 2015, 50, 25–34. [Google Scholar] [CrossRef]
  24. Kwon, O.H.; Hong, I.; Yang, J.; Wohn, D.Y.; Jung, W.S.; Cha, M. Urban green space and happiness in developed countries. EPJ Data Sci. 2021, 10, 28. [Google Scholar] [CrossRef]
  25. Stier, A.J.; Schertz, K.E.; Rim, N.W.; Cardenas-Iniguez, C.; Lahey, B.B.; Bettencourt, L.M.A.; Berman, M.G. Evidence and theory for lower rates of depression in larger US urban areas. Proc. Natl. Acad. Sci. USA 2021, 118, e2022472118. [Google Scholar] [CrossRef] [PubMed]
  26. Helbich, M. Toward dynamic urban environmental exposure assessments in mental health research. Environ. Res. 2018, 161, 129–135. [Google Scholar] [CrossRef]
  27. Dubey, A.; Naik, N.; Parikh, D.; Raskar, R.; Hidalgo, C.A. Deep Learning the City: Quantifying Urban Perception at a Global Scale. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 196–212. [Google Scholar]
  28. Cordts, M.; Omran, M.; Ramos, S.; Rehfeld, T.; Enzweiler, M.; Benenson, R.; Franke, U.; Roth, S.; Schiele, B. The Cityscapes Dataset for Semantic Urban Scene Understanding. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 3213–3223. [Google Scholar] [CrossRef]
  29. Zhou, B.; Zhao, H.; Puig, X.; Xiao, T.; Fidler, S.; Barriuso, A.; Torralba, A. Semantic Understanding of Scenes Through the ADE20K Dataset. Int. J. Comput. Vis. 2018, 127, 302–321. [Google Scholar] [CrossRef]
  30. Rhew, I.C.; Stoep, A.V.; Kearney, A.; Smith, N.L.; Dunbar, M.D. Validation of the Normalized Difference Vegetation Index as a Measure of Neighborhood Greenness. Ann. Epidemiol. 2011, 21, 946–952. [Google Scholar] [CrossRef]
  31. Lillesand, T.; Kiefer, R.W.; Chipman, J. Remote Sensing and Image Interpretation, 5th ed.; John Wiley & Sons: Hoboken, NJ, USA, 2004; ISBN 0471152277. [Google Scholar]
  32. Dwivedi, R.S. Remote Sensing of Soils; Springer: Berlin/Heidelberg, Germany, 2017. [Google Scholar]
  33. Li, D.; Zhou, X.; Wang, M. Analyzing and visualizing the spatial interactions between tourists and locals: A Flickr study in ten US cities. Cities 2018, 74, 249–258. [Google Scholar] [CrossRef]
  34. Chen, Z.; Yang, J.; Liu, X.; Guo, Z. Reinterpreting activity space in tourism by mapping tourist-resident interactions in populated cities. Tour. Recreat. Res. 2022, 1–15. [Google Scholar] [CrossRef]
  35. Statistics Portugal, Estatísticas do Turismo, 2020; Instituto Nacional de Estatística: Lisbon, Portugal; ISBN 978-989-25-0569-5.
  36. Pernet, C.R.; Wilcox, R.; Rousselet, G.A. Robust Correlation Analyses: False Positive and Power Validation Using a New Open Source Matlab Toolbox. Front. Psychol. 2013, 3, 606. [Google Scholar] [CrossRef] [PubMed]
  37. Lakens, D.; Fockenberg, D.A.; Lemmens, K.P.H.; Ham, J.; Midden, C.J.H. Brightness differences influence the evaluation of affective pictures. Cogn. Emot. 2013, 27, 1225–1246. [Google Scholar] [CrossRef]
  38. Harrison, W.J. Luminance and Contrast of Images in the THINGS Database. Perception 2022, 51, 244–262. [Google Scholar] [CrossRef] [PubMed]
  39. Ancora, L.A.; Blanco-Mora, D.A.; Alves, I.; Bonifácio, A.; Morgado, P.; Miranda, B. Cities and neuroscience research: A systematic literature review. Front. Psychiatry 2022, 13, 983352. [Google Scholar] [CrossRef] [PubMed]
  40. Chang, D.H.; Jiang, B.; Wong, N.H.; Wong, J.J.; Webster, C.; Lee, T.M. The human posterior cingulate and the stress-response benefits of viewing green urban landscapes. NeuroImage 2021, 226, 117555. [Google Scholar] [CrossRef]
  41. Olszewska-Guizzo, A.; Sia, A.; Fogel, A.; Ho, R. Can Exposure to Certain Urban Green Spaces Trigger Frontal Alpha Asymmetry in the Brain?—Preliminary Findings from a Passive Task EEG Study. Int. J. Environ. Res. Public Health 2020, 17, 394. [Google Scholar] [CrossRef]
  42. Vahidnia, M.H.; Vahidi, H. Open Community-Based Crowdsourcing Geoportal for Earth Observation Products: A Model Design and Prototype Implementation. ISPRS Int. J. Geo-Inf. 2021, 10, 24. [Google Scholar] [CrossRef]
  43. De Longueville, B. Community-based geoportals: The next generation? Concepts and methods for the geospatial Web 2.0. Comput. Environ. Urban Syst. 2010, 34, 299–308. [Google Scholar] [CrossRef]
Figure 1. Process for extracting 75,233 urban environment images from the Flickr website. We selected images based on the input parameters (left) and determined the output variables (right) for all images.
Figure 2. (A) Process for segregating 75,233 images into two groups: photographs uploaded by presumed residents or photographs uploaded by presumed tourists; (B) cell image density distribution in each 100 × 100 m cell for the entire set of 75,233 images; (C) cell image density distribution for images uploaded by presumed residents; (D) cell image density distribution for images uploaded by presumed tourists. The color bar denotes the cell image density, where blue represents the minimum values, and red represents the maximum values.
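The cell image density shown in Figure 2 (the number of images falling in each 100 × 100 m cell) can be illustrated with a minimal Python sketch. It assumes that image coordinates have already been projected to a metric coordinate system; the function names and the sample points are hypothetical and are not taken from the authors' pipeline.

```python
import math
from collections import Counter

CELL_SIZE_M = 100.0  # cell edge length in metres, as stated in the caption


def cell_of(x_m, y_m, cell_size=CELL_SIZE_M):
    """Return the (column, row) index of the grid cell containing a point.

    Assumes x_m and y_m are projected coordinates in metres (an assumption;
    raw longitude/latitude would need to be projected first).
    """
    return (math.floor(x_m / cell_size), math.floor(y_m / cell_size))


def cell_image_density(points, cell_size=CELL_SIZE_M):
    """Count how many image locations fall in each grid cell."""
    return Counter(cell_of(x, y, cell_size) for x, y in points)


# Hypothetical image locations in metres (illustrative values only).
points = [(12.0, 40.0), (99.9, 0.0), (100.0, 0.0), (250.0, 310.0), (260.0, 399.0)]

density = cell_image_density(points)

# Assign to every image the density of the cell it belongs to.
per_image_density = [density[cell_of(x, y)] for x, y in points]
print(per_image_density)  # [2, 2, 1, 2, 2]
```

Splitting the points by presumed photographer status before counting would give the separate resident and tourist distributions of panels (C) and (D).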
Figure 3. Process for selecting 160 urban environment images (80 each from the presumed resident and tourist categories). (A) Segregating images into bins based on cell image density, for both presumed residents and tourists; (B) selecting each bin sequentially and feeding it to an in-house algorithm that selects 10 images per bin that are orthogonal in popularity and cell green index. This process yielded images distributed across the cell image density variable and ensured that the images were not correlated with respect to our selected variables (popularity and cell green index).
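The in-house selection algorithm is not specified in the caption, so the following Python sketch is only one plausible way to reproduce the idea, not the authors' method. It assumes equal-count bins over cell image density (8 bins, inferred from 80 images per group at 10 per bin) and uses a simple random search that keeps the 10-image subset with the smallest absolute Pearson correlation between popularity and cell green index. All names and the synthetic data are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def split_into_bins(images, n_bins=8):
    """Sort by cell image density and cut into equal-count bins.

    Equal-count binning is an assumption; the caption does not say how
    bin edges were chosen. Exactly n_bins bins are returned only when
    len(images) is a multiple of n_bins.
    """
    ordered = sorted(images, key=lambda im: im["density"])
    size = math.ceil(len(ordered) / n_bins)
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def pick_uncorrelated(images, k=10, n_trials=2000, seed=0):
    """Random search for a k-image subset with minimal |r| between
    popularity and cell green index (a stand-in for 'orthogonal')."""
    rng = random.Random(seed)
    best, best_abs_r = None, float("inf")
    for _ in range(n_trials):
        subset = rng.sample(images, k)
        r = pearson([im["popularity"] for im in subset],
                    [im["green"] for im in subset])
        if abs(r) < best_abs_r:
            best, best_abs_r = subset, abs(r)
    return best, best_abs_r


# Synthetic stand-in for one photographer group (values are made up).
rng = random.Random(42)
group = [{"density": rng.randint(1, 500),
          "popularity": rng.random(),
          "green": rng.uniform(-1.0, 1.0)} for _ in range(400)]

bins = split_into_bins(group, n_bins=8)
results = [pick_uncorrelated(b, k=10, seed=i) for i, b in enumerate(bins)]
selected = [im for subset, _ in results for im in subset]
print(len(selected), max(abs_r for _, abs_r in results))
```

A random search is easy to reason about but not optimal; a greedy swap or an exhaustive search over small bins would be reasonable alternatives under the same assumptions.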
Figure 4. Pearson, Bend, and Spearman correlation plots with the 95% bootstrapped confidence intervals (shaded areas) for the entire set of 160 urban environment images. (A) Correlations between cell image density and popularity, (B) correlations between cell image density and cell green index, and (C) correlations between popularity and cell green index. For the bend correlation, red indicates data bent in the variable on the x-axis, green in the variable on the y-axis, and black indicates data bent in both variables.
Table 1. Pearson, Bend, and Spearman correlation values with significance levels for correlation among cell image density, popularity, and cell green index performed separately for the images uploaded by presumed residents and tourists. For the identical analyses performed on the full set of images, please see Figure 4.
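| Category | Correlation Variables | Pearson r | Pearson p | Bend r | Bend p | Spearman r | Spearman p |
|---|---|---|---|---|---|---|---|
| Images by Presumed Residents | Cell Image Density and Popularity | −0.069 | 0.546 | −0.061 | 0.588 | −0.039 | 0.731 |
| Images by Presumed Residents | Cell Image Density and Cell Green Index | −0.122 | 0.281 | −0.195 | 0.082 | −0.212 | 0.06 |
| Images by Presumed Residents | Popularity and Cell Green Index | −0.037 | 0.744 | −0.098 | 0.387 | −0.121 | 0.284 |
| Images by Presumed Tourists | Cell Image Density and Popularity | −0.136 | 0.228 | −0.149 | 0.188 | −0.175 | 0.120 |
| Images by Presumed Tourists | Cell Image Density and Cell Green Index | −0.092 | 0.419 | −0.107 | 0.345 | −0.040 | 0.724 |
| Images by Presumed Tourists | Popularity and Cell Green Index | −0.059 | 0.602 | −0.096 | 0.394 | −0.103 | 0.363 |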
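For readers who want to run comparable checks on their own variables, the sketch below computes Pearson and Spearman correlations with a 95% percentile bootstrap confidence interval using only the Python standard library. It is an illustration under stated assumptions rather than the analysis reported here: the percentile bootstrap is an assumed choice, the bend correlation is omitted, the data are synthetic, and all function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def bootstrap_ci(x, y, corr_fn, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample (x, y) pairs with replacement."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(corr_fn([x[i] for i in idx], [y[i] for i in idx]))
    estimates.sort()
    lo = estimates[int((alpha / 2) * n_boot)]
    hi = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Synthetic, independent variables for 80 images (values are made up).
rng = random.Random(1)
popularity = [rng.random() for _ in range(80)]
green_index = [rng.uniform(-1.0, 1.0) for _ in range(80)]

for name, fn in (("Pearson", pearson), ("Spearman", spearman)):
    r = fn(popularity, green_index)
    lo, hi = bootstrap_ci(popularity, green_index, fn)
    print(f"{name}: r = {r:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

A confidence interval that spans zero is in line with the non-significant correlations reported in Table 1, although the exact procedure used for the published values may differ.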
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Kaur, A.; Rodrigues, A.L.; Hoogstraten, S.; Blanco-Mora, D.A.; Miranda, B.; Morgado, P.; Meshi, D. An Urban Image Stimulus Set Generated from Social Media. Data 2023, 8, 184. https://doi.org/10.3390/data8120184
