On How Crowdsourced Data and Landscape Organisation Metrics Can Facilitate the Mapping of Cultural Ecosystem Services: An Estonian Case Study

Social media continues to grow, permanently capturing our digital footprint in the form of texts, photographs, and videos, thereby reflecting our daily lives. Therefore, recent studies are increasingly recognising passively crowdsourced geotagged photographs retrieved from location-based social media as suitable data for quantitative mapping and assessment of cultural ecosystem service (CES) flow. In this study, we attempt to improve CES mapping from geotagged photographs by combining natural language processing, i.e., topic modelling and automated machine learning classification. Our study focuses on three main groups of CESs that are abundant in outdoor social media data: landscape watching, active outdoor recreation, and wildlife watching. Moreover, by means of a comparative viewshed analysis, we compare the geographic information systemand remote sensing-based landscape organisation metrics related to landscape coherence and colour harmony. We observed the spatial distribution of CESs in Estonia and confirmed that colour harmony indices are more strongly associated with landscape watching and outdoor recreation, while landscape coherence is more associated with wildlife watching. Both CES use and values of landscape organisation indices are land cover-specific. The suggested methodology can significantly improve the state-of-the-art with regard to CES mapping from geotagged photographs, and it is therefore particularly relevant for monitoring landscape sustainability.


Introduction
Almost 50 years ago, in the 1970s, Philippe Saint-Marc interpreted the outdoor environment as a social service supporting a good quality of life and public well-being [1]. Ever since then, this logic has been elaborated upon with the concept of cultural ecosystem services (CESs) [2,3] and a geographic perspective connecting the ecosystem (landscape) structure and functions with benefits and values [4]. Accordingly, the capacity of landscapes to provide CESs among other ecosystem services is now considered a prerequisite for landscape sustainability in connecting the Earth's patterns and processes to individual values and preferences [5][6][7].
However, CESs have proven difficult to quantify, and consequently they are difficult to manage. Therefore, many authors have discussed CESs in the context of metrics, including economic assessment and quantitative mapping [8][9][10][11][12]. Currently, having a proper understanding, quantitative assessment, and an incorporation of CES into decision-making processes is considered crucial for achieving sustainable development goals and other policy targets [13][14][15][16]. The most advanced approach to the classification of the CES is being developed in the Common International Classification of Ecosystem Services (CICES) [17]. Our work examines, in the Estonian context, the following classes of CESs according to the CICES: a) characteristics of living systems that enable aesthetic experiences (experiencing landscape beauty, passive recreation); b) characteristics of living systems that enable activities promoting health, recuperation, or enjoyment through active or immersive interactions (active outdoor recreation); and c) characteristics of living systems that enable activities promoting health, recuperation, or enjoyment through passive or observational interactions (e.g., watching organisms: plants, animals and mushrooms).
The vast majority of CES assessments are based on surveys, interviews, participatory mapping, workshops, and other kinds of offline engagements with pre-selected individuals, such as local communities, key stakeholders, or experts [8,[18][19][20]. However, the last two decades have seen a growing trend towards crowdsourcing applications in this field. In particular, the use of publicly available location-based social media (LBSM) data-mainly geotagged photographs-stored in online photo repositories (Flickr and Panoramio), applications (Instagram and Strava), and social networks (VK.com and Twitter) has proliferated [21]. Passively crowdsourced digital footprint has been used for (a) the assessment of touristic place visitation rates [22], (b) mapping landscape values across spatial scales [23,24], (c) mapping landscape aesthetic flow [25], (d) analysing spatial CES distributions [12,26], etc.
However, the amount of geotagged data in the online repositories of varying and often non-relevant content poses an issue for content selection and classification. The most common approaches of content analysis include manual selection [25][26][27] or photo-user-days mapping within the InVEST ecosystem service models [22,28,29]. Therefore, image recognition services and machine learning models have been gaining attention more recently. For instance, machine learning algorithms provided by Clarifai (Clarifai Inc., New York, NY, USA) and Google Cloud Vision were recently reported to be very promising for CES recognition and mapping [30,31], and natural language processing was applied to categorise social media users in relation to outdoor recreation [32].
In our study, the objectives are to (a) identify and map CES use in Estonia by using a combination of automated content image recognition and topic modelling on photos from selected social media platforms, and (b) quantify the association between two types of landscape attributes reflecting subjective landscape organisation, i.e., the landscape coherence and colour harmony of land cover, and CES flow. Landscape coherence is a landscape attribute, which, according to existing reports, rather positively influences landscape preferences by generalising order and organisation of recognisable elements of landscape pattern [33]. It can be mapped with a geographic information system (GIS)-based indicator in relation to photographing preferences [34]. Colour harmony is also discussed as an important aesthetic variable of visual landscape [35] and is recently mapped with satellite imagery and textural metrics [36], but it has received much less attention in literature compared to landscape coherence. Suggested objective indicators of landscape coherence and colour harmony of land cover remain understudied in the context of CES use and require testing across various environmental settings and scales.
The paper is developed around a simple framework of CES use classification and its linkage to landscape attributes, assessable with remote sensing-and GIS-based indicators. Section 2 justifies the study area choice and introduces the methods used to extract knowledge on CES use and landscape attributes from geolocated photographs and GIS data, respectively. Section 3 presents the results of CES use mapping in relation to the GIS-and remote sensing-based indicators, as well as land cover types. Section 4 discusses the results in the wider context of added research value compared to existing research papers. Section 5 concludes with the main findings and directions of further work.

Study Area
According to DataReportal, 98% of Estonians are Internet users to some extent, and 57% are active users of social media [37]. This high level of Internet penetration, combined with a well-developed touristic policy and infrastructure, as well as the significant share of the Russian-speaking community in the total population (VK.com is based in Russia) render Estonia a good study area for social media and CES-related studies. Moreover, the diverse environmental conditions and numerous protected areas in Estonia enhance opportunities for analyses from geographic and nature conservation perspectives.

Mapping of Cultural Ecosystem Service (CES) Represented in Social Media in Estonia
To test the applicability of topic modelling for CES identification and classification, we used geotagged photo-series analysis [38], retrieving metadata by means of application programming interface (API) calls (including geographic coordinates, user and photo ID, date of taking, web-links to photographs) for publicly available images uploaded to Flickr.com and VK.com services from 2015 to 2018. Flickr and VK.com continue to provide access to their non-private geolocated content, while Panoramio discontinued its service and Instagram has not shared its data with third parties since 2015. We additionally used the GIS-data for buildings in Estonia [39] to remove the metadata for indoor photographs. In total, metadata for 21,242 geographically outdoor photographs were retrieved and combined into a single dataset. We then applied content image recognition to these photographs with automated Python API requests to Clarifai's service (Clarifai Inc., New York, NY, USA). We used the general model with a cut-off greater than 90% for the probability that the tag is correct.
We then tested topic modelling (Latent Dirichlet Allocation (LDA) algorithm) implemented in the Orange data mining software [40] to classify the tags into a number of topics and deleted the irrelevant ones (assuming that photographs sharing the same tags represent the same "topic"). As a result, the pre-processed dataset consisted of 9983 photographs. After some initial testing, we decided on three topics for the LDA analysis. The LDA algorithm was useful in two aspects: (a) identification of the non-relevant photographs (for example, we removed the photographs, sharing topics of tags related to driving and cars, indoor design, architecture, fashion and beauty, military service) and (b) identification of the relevant topics in the rest of the photographs' tags.
As the LDA algorithm calculates the probability score, indicating the likelihood of the set of tags for each photograph belonging to each topic, we assumed that the assigned photographs belong to the topic with the highest probability score. Owing to the potential overlap with this fuzzy distinction between the topics of each photograph, we decided to post-process the results manually by interpreting the context of each photograph in addition to its content. For instance, photographs of pets were transferred from the topic of wildlife watching to outdoor recreation, and photographs with a minor presence of people or their recreation-related equipment were moved from landscape watching to outdoor recreation. We devised an a priori hypothesis about the small number of relevant CESs, according to the CICES classes (3)(4)(5), and the very first test of LDA algorithm resulted in three relevant CES-related topics. In case we applied LDA with a higher number of intended topics, some minor subclasses of recreation appeared, but these minor classes of recreational CES are beyond the scope of this study. We identified the following topics corresponding to the groups of CESs: a.
Landscape watching. This consists of the following tags: nature, outdoors, landscape, tree, nobody, wood, sky, travel, water, and summer (6154 photographs; 17 manually transferred from topic 3).
b. Active outdoor recreation. This consists of the following tags: people, recreation, adult, fun, man, leisure, outdoors, one, sport, and action (2345 photographs; 770 manually transferred from topic 1, and 114 from topic 3). c.
Wildlife watching. This consists of the following tags: nature, outdoors, no one, flora, leaf, wild, wildlife, season, animal, growth (1484 photographs; 124 manually transferred from topic 1, and 2 from topic 2).
We mapped CES use from the photo locations to examine whether people (subconsciously) consider some selected aesthetic landscape attributes that represent landscape organisation [34,36], and these attributes can be derived from remote sensing data [41].

Impact of Landscape Organisation on CES Use
The colour harmony of land cover is a landscape attribute often neglected in landscape studies [41] but is potentially responsible for visual landscape quality [35] and is assessable using remotely sensed data. We used Landsat 8 OLI cloudless summertime mosaics for the territory of Estonia with a 5 km buffer zone pre-processed with the Google Earth Engine. The red (B4), green (B3), and blue (B2) bands, corresponding to the natural colours band combination, were converted into the hue-saturation-value (HSV) colour space to quantify colour harmony. We assumed that the hue and chroma (saturation in HSV space) similarity, which is listed among the universal principles of colour harmony [42], can be quantified for the hue and saturation raster datasets. Such assessment can be done using the grey level co-occurrence matrix (GLCM) homogeneity index (GLCMH, Equation (1)) [43], which measures the similarity of image pixel pairs [44] (Figure 1a,b): where P(i,j) is the probability of co-occurrence of pixels i and j, and N g is the number of distinct grey levels in the quantised image (64 in this study).
Land 2020, 9, x FOR PEER REVIEW 4 of 19 c. Wildlife watching. This consists of the following tags: nature, outdoors, no one, flora, leaf, wild, wildlife, season, animal, growth (1,484 photographs; 124 manually transferred from topic 1, and 2 from topic 2). We mapped CES use from the photo locations to examine whether people (subconsciously) consider some selected aesthetic landscape attributes that represent landscape organisation [34,36], and these attributes can be derived from remote sensing data [41].

Impact of Landscape Organisation on CES Use
The colour harmony of land cover is a landscape attribute often neglected in landscape studies [41] but is potentially responsible for visual landscape quality [35] and is assessable using remotely sensed data. We used Landsat 8 OLI cloudless summertime mosaics for the territory of Estonia with a 5 km buffer zone pre-processed with the Google Earth Engine. The red (B4), green (B3), and blue (B2) bands, corresponding to the natural colours band combination, were converted into the huesaturation-value (HSV) colour space to quantify colour harmony. We assumed that the hue and chroma (saturation in HSV space) similarity, which is listed among the universal principles of colour harmony [42], can be quantified for the hue and saturation raster datasets. Such assessment can be done using the grey level co-occurrence matrix (GLCM) homogeneity index (GLCMH, Equation (1)) [43], which measures the similarity of image pixel pairs [44] (Figure 1a-b): where P(i,j) is the probability of co-occurrence of pixels i and j, and Ng is the number of distinct grey levels in the quantised image (64 in this study). It should be mentioned that colour harmony depends on "how strongly an observer experiences the colours in the combination as going or belonging together, regardless of whether the observer It should be mentioned that colour harmony depends on "how strongly an observer experiences the colours in the combination as going or belonging together, regardless of whether the observer likes the combination or not", (p. 551, [45]). Therefore, it is rather a component of the formal landscape aesthetics and, in theory, does not necessarily reflect landscape preferences.
We interpret landscape coherence as a degree of order inherent to the landscape pattern that is composed of diverse and distinct landscape elements and features [46]. Landscape coherence is one of the classic subjective landscape attributes responsible for the emergence of landscape values [33]. Increasing the landscape coherence extent generally leads to a moderate increase in landscape preferences [34]. Therefore, we assess the vertical landscape coherence using the landscape coherence index (LCI, Figure 1c, Equation (2)) proposed by Karasov et al. [34], which is based on the concepts of the emergent theory of information, as presented by Lutsenko [47]. We calculate the LCI within a circular neighbourhood of seven pixels for the CORINE land cover model and the Topographic Position Index (TPI) landform classification, obtained with the respective SAGA GIS module [48].
where LCI is the landscape coherence index; I land cover and I landforms are the Hartley functions for the land cover/land use (LU/LC) model and the TPI-based landform classification based on the digital elevation model [43], respectively; and I landscape is the Hartley function for the parametric composite (digital landscape model) of the LU/LC model and TPI-based landforms. The landscape coherence index benefits from the feature of additivity of the Hartley function (Equation (3)), which is a particular case of Shannon's information entropy (Shannon diversity index): where m is the total number of observations (landscape or land cover classes, types of landforms), and n is the number of observations in neighbourhood i. The logic of landscape coherence calculation is based on the following assumption: for independent landforms and land cover, the algebraic sum of the amount of information, according to the Hartley function for landforms and land cover, will be equal for the amount of information for their parametric composite or digital landscape model. If the landforms and land cover models, which compose the digital landscape model, interact and are not independent, the summarised Hartley functions for these datasets will give a smaller value than the value of the Hartley function for the pixels of a digital landscape model. The ratio between Hartley functions for the digital landscape model and its components highlights the extent of systematic features of landscape and can be related to landscape preferences. Hypothetically, the increase in landscape coherence contributes to the visual landscape quality and therefore to CES use.
We then performed a viewshed analysis, identifying the visible surface from the set of observation points, namely the geolocations of the selected photographs from each group of CESs (see Section 2.2) and for the same number of randomly selected locations, which serve as pseudo-absence data ( Figure 2). We used the PixScape software [49] on the European Digital Elevation Model (EU-DEM), version 1.1 [50] with the maximum visible distance and observer height set to 5 km and 1.6 m, respectively. The median LCI, hue homogeneity, and saturation homogeneity were calculated for each viewshed and compared between the actual (presence) and random (pseudo-absence) geolocations using Wilcoxon's rank-sum test with continuity correction (see Appendix A for details), implemented in the Exploratory software (Exploratory Inc. (Delaware US) Sacramento, CA, USA).  Figure 3 presents the results of the CES mapping obtained with the application of topic modelling (geographical coordinates of the photographs from the combined Flickr and VK.com dataset, classified into three categories of CES groups). The clear linear patterns of the photographs highlight the main flows of people alongside the main roads and coastlines of Estonia. The exploratory buffer analysis for OpenStreetMap road data indicates that transport accessibility is extremely important for CES use. To be precise, 6,148 out of 6,153 landscape-watching photographs, 2,311 out of 2,345 outdoor recreation photographs, and 1,483 out of 1,484 wildlife-watching photographs have been taken no farther than 500 m from the roads and trails of all types. Although indoor photographs have been removed from the analysis (see Section 2.2), many photographs were taken in the main cities (Tallinn, Tartu, Narva, etc.), especially in their suburban zones. Additionally, the protected areas are conspicuous as approximately 59% of the total number of selected photographs were taken within these regions. A full list of the protected areas is presented in the Table S1 (Supplementary materials).  To be precise, 6148 out of 6153 landscape-watching photographs, 2311 out of 2345 outdoor recreation photographs, and 1483 out of 1484 wildlife-watching photographs have been taken no farther than 500 m from the roads and trails of all types. Although indoor photographs have been removed from the analysis (see Section 2.2), many photographs were taken in the main cities (Tallinn, Tartu, Narva, etc.), especially in their suburban zones. Additionally, the protected areas are conspicuous as approximately 59% of the total number of selected photographs were taken within these regions. A full list of the protected areas is presented in the Table S1 (Supplementary Materials).

Mapping of CES Represented in Social Media in Estonia
An exploratory analysis of land cover (CORINE land cover 2018, Figure 4) shows that most of the photographs were taken in coniferous forests, agricultural areas, mixed forests, and transitional woodland-shrub areas. All the CES groups under consideration are well represented in these land cover classes. On the contrary, water bodies and courses, sea, peat bogs, inland marshes, and natural grasslands are frequented more for landscape watching than for the other groups of CES. Outdoor recreation is present in complex cultivation patterns and green urban areas. Wildlife watching frequently occurs in broad-leaved forests and pastures. In this way, more "natural" land cover classes are much better represented in the study datasets of passively crowdsourced photographs. However, land, which is principally occupied by agriculture, is among the leaders in enabling CES use. An exploratory analysis of land cover (CORINE land cover 2018, Figure 4) shows that most of the photographs were taken in coniferous forests, agricultural areas, mixed forests, and transitional woodland-shrub areas. All the CES groups under consideration are well represented in these land cover classes. On the contrary, water bodies and courses, sea, peat bogs, inland marshes, and natural grasslands are frequented more for landscape watching than for the other groups of CES. Outdoor recreation is present in complex cultivation patterns and green urban areas. Wildlife watching frequently occurs in broad-leaved forests and pastures. In this way, more "natural" land cover classes are much better represented in the study datasets of passively crowdsourced photographs. However, land, which is principally occupied by agriculture, is among the leaders in enabling CES use.

Impact of Landscape Organisation on CES Use
As is clear from Figure 5, we see that the median hue and saturation similarity values are remarkably higher for the actual rather non-vegetated (median value of the normalized difference vegetation index (NDVI) lower than 0.1) viewsheds corresponding to landscape watching and outdoor recreation than for the pseudo-absence viewsheds. The indicators used exhibit similar behaviours for the landscape watching and outdoor recreation viewsheds, whilst colour harmony

Impact of Landscape Organisation on CES Use
As is clear from Figure 5, we see that the median hue and saturation similarity values are remarkably higher for the actual rather non-vegetated (median value of the normalized difference vegetation index (NDVI) lower than 0.1) viewsheds corresponding to landscape watching and outdoor recreation than for the pseudo-absence viewsheds. The indicators used exhibit similar behaviours for the landscape watching and outdoor recreation viewsheds, whilst colour harmony does not seem to influence wildlife watching. According to the Wilcoxon rank sum test with continuity correction, all the distribution differences except for colour harmony indices for wildlife watching are statistically highly significant, suggesting that most CES-related photographs were taken with consideration for land cover of higher According to the Wilcoxon rank sum test with continuity correction, all the distribution differences except for colour harmony indices for wildlife watching are statistically highly significant, suggesting that most CES-related photographs were taken with consideration for land cover of higher colour harmony and landscape coherence ( Figure 6). It is highly likely that colour harmony values affect landscape watching and outdoor recreation, while landscape coherence seems to have a clear positive influence on wildlife watching and a weaker positive influence on landscape watching and outdoor recreation.
Land 2020, 9, x FOR PEER REVIEW 10 of 19 colour harmony and landscape coherence ( Figure 6). It is highly likely that colour harmony values affect landscape watching and outdoor recreation, while landscape coherence seems to have a clear positive influence on wildlife watching and a weaker positive influence on landscape watching and outdoor recreation. A visual exploration of land cover with regard to LCI and colour harmony indices reveals a complementary character of the considered landscape organisation indices (Figure 7). Landscape coherence is the highest for culturally modified land covers-urban fabric, urban green areas, and agricultural areas-and lower for natural areas, the minimum being observed for peat bogs and water Figure 6. Density plots representing the results of the Wilcoxon rank sum test with continuity correction, applied to the medians of landscape coherence and harmony-based visual quality indices for each group of CESs within viewshed areas for actual geotagged photographs ("real") and randomly simulated locations ("random"): (a) landscape watching; (b) outdoor recreation; (c) wildlife watching. Significance levels: *** p-value less than 0.001; ** p-value less than 0.01; ns-not significant. Alternative hypothesis: two-sided. Confidence level: 0.95.
A visual exploration of land cover with regard to LCI and colour harmony indices reveals a complementary character of the considered landscape organisation indices (Figure 7). Landscape coherence is the highest for culturally modified land covers-urban fabric, urban green areas, and agricultural areas-and lower for natural areas, the minimum being observed for peat bogs and water bodies. Colour harmony, in contrast, is the highest for water forest and peat bogs. Therefore, colour harmony and landscape coherence extents are highly dependable on the land cover type: higher cultural modification of landscape results in the increasing orderliness and complexity, while colour harmony increases for homogeneous and predominantly natural (while often managed) land cover.
Land 2020, 9, x FOR PEER REVIEW 11 of 19 bodies. Colour harmony, in contrast, is the highest for water forest and peat bogs. Therefore, colour harmony and landscape coherence extents are highly dependable on the land cover type: higher cultural modification of landscape results in the increasing orderliness and complexity, while colour harmony increases for homogeneous and predominantly natural (while often managed) land cover.

Mapping of CES Represented in Social Media in Estonia
Our results contribute to addressing the challenge of CES mapping by studying the relationships among the three categories of outdoor geotagged photographs and remote sensing-based landscape characteristics [24]. The distribution pattern of the CES-related photographs is in line with previous findings [23] as we confirm transport accessibility and naturalness to be the main factors influencing the probability of taking outdoor photographs [51]. Photographs from different CES groups often overlap spatially, indicating landscape multifunctionality. Landscape multifunctionality is important for the overall distribution of landscape values; hence, our approach can contribute to evidence-based trade-off analyses and the detection of hotspots of cultural landscape functions through CES patterns [52,53]. There is also a synergy between our nationwide CES mappings and the ESMERALDA project [54]. Our cross-disciplinary approach, integrating bio-physical and socio-cultural methods, allows for CES mapping and assessment across various spatial and temporal scales and is applicable to both urban and non-urban environments.
Much of the CES use seems concentrated within nature protection areas, revealing the efficiency and efficacy of the nature conservation policy in Estonia as well as the potential for further expansion of protected areas, which can contribute to an increase in nature-based tourism [55]. Thereby, we confirm the LBSM data as a valuable source of data for nature conservation as well as for CES mapping [21,56,57].

Mapping of CES Represented in Social Media in Estonia
Our results contribute to addressing the challenge of CES mapping by studying the relationships among the three categories of outdoor geotagged photographs and remote sensing-based landscape characteristics [24]. The distribution pattern of the CES-related photographs is in line with previous findings [23] as we confirm transport accessibility and naturalness to be the main factors influencing the probability of taking outdoor photographs [51]. Photographs from different CES groups often overlap spatially, indicating landscape multifunctionality. Landscape multifunctionality is important for the overall distribution of landscape values; hence, our approach can contribute to evidence-based trade-off analyses and the detection of hotspots of cultural landscape functions through CES patterns [52,53]. There is also a synergy between our nationwide CES mappings and the ESMERALDA project [54]. Our cross-disciplinary approach, integrating bio-physical and socio-cultural methods, allows for CES mapping and assessment across various spatial and temporal scales and is applicable to both urban and non-urban environments.
Much of the CES use seems concentrated within nature protection areas, revealing the efficiency and efficacy of the nature conservation policy in Estonia as well as the potential for further expansion of protected areas, which can contribute to an increase in nature-based tourism [55]. Thereby, we confirm the LBSM data as a valuable source of data for nature conservation as well as for CES mapping [21,56,57].
Our results continue the methodological approach that has been initiated in previous research [30][31][32]. The LDA topic modelling algorithm significantly facilitated the process of LBSM data assigned with the content-related tags as a result of automated image recognition with Clarifai. Therefore, we confirm that the LDA method of topic modelling is highly relevant and valuable for a rapid assessment of cultural ecosystem services use over large areas [31]. When all the photographs representing non-relevant topics have been removed, we applied the LDA algorithm to the relevant tags only, and some testing showed that three topics of tags sufficiently represent their diversity and meet our needs. The automated character of topic modelling often results in a meaningless classification, so the exact number of relevant topics (3) was found by trial and error (while a priori we assumed that there are just a few major CES categories). Further analysis would result in accounting for minor CES categories, such as picnicking, cycling, or playing tennis outdoors, but such detailed classification was beyond the scope of our study.
Obviously, the proposed methodological combination is not a complete substitute for traditional visual content analysis; instead, it should be used as an initial procedure for CES use assessment and followed with a quick visual verification. For example, we transferred to the outdoor recreation category those photographs that were automatically selected for landscape watching if they contained minor presence of people or their equipment; since presence of pets was automatically interpreted as wildlife (the general machine learning model provided by Clarifai does not account specifically for this distinction), we also manually moved these photographs to the category for outdoor recreation. Photographs with minor presence of wild animals classified as related to landscape watching were also manually transferred to the wildlife watching category.

Impact of Landscape Organisation on CES Use
In line with previous studies, landscape coherence was found to have a positive but rather weak association with CES use in our countrywide study [34]. Vertical landscape coherence increases for places of significant cultural modification (more legible urban and agricultural areas). Thereby, we confirm LCI performance as indicative of the orderliness of the landscape pattern, but unexpectedly it has a rather small impact on CES use. Wildlife watching occurs in places with higher LCI. This is potentially because people are more likely to take photographs of animals, plants, and mushrooms near their homes (such as green urban areas) in some understandable settings rather than in a more natural environment. Other authors have additionally explored the hotspots of wildlife watching near cities [29]. As some photographed areas have higher LCI, compared with the values for random locations, signs of anthropic modification (parks, suburban areas, agricultural fields, and other elements of cultural landscape) can be additionally important for CES use, complementing pure naturalness [58].
Colour harmony indicators (HSV hue and saturation similarities, indicated with GLCM homogeneity) showed a larger difference between the photographed viewsheds and random background viewsheds, suggesting that people tend to take photographs with a preference for land covers of greater colour harmony. However, as there is an association between land cover classes (CORINE land cover 2018) and colour harmony, the bias may be caused by the effect of the land cover itself; for instance, sea, water bodies, forests, and peat bogs additionally have powerful intrinsic and other values. Therefore, our results should be treated with caution, and colour harmony mappings should contribute to the general understanding of landscape rather than perform as the standalone indicators of landscape preferences.

Other Sources of Bias
It is most likely that elderly persons and children are the least represented age strata in LBSM. However, Flickr and VK.com were launched in 2004 and 2006, respectively, and have become very popular among diverse user groups, while general Internet penetration in Estonia is growing also [59]. We can expect that in the coming years, LBSM social media will become more orientated towards elderly people owing to the regular ageing of active Internet-users. Unfortunately, the LBSM data-unlike surveys-provide little or no information on the individuals' sex and gender, age, education level, family status, ethnic origin, etc. Nevertheless, the LBSM data are free from some survey-specific issues, such as recollection and mind biases, which occur owing to intrusive surveying [52,60]. Therefore, in our opinion, social media data provide added value to CES studies.

Conclusions
Our results are based on photographs uploaded to the social media sites, Flickr and VK.com, which can be used to represent the actual use of some CESs (landscape watching, outdoor recreation, and wildlife watching), and are linked to spatial landscape indices in Estonia. Their spatial analysis enables a better understanding of the geographic organisation of the environment and its potential for providing CES and supporting nature appreciation in an urbanised society [61]. Evidence from our study suggests that social media users prefer taking photographs of landscapes and outdoor activities in areas with greater colour harmony, whilst landscape coherence is linked strongly only to wildlife watching and, to a lesser extent, other CESs.
Topic modelling significantly reduced the time needed for the content analysis of the photographs, and our CES mapping depends on the quality of this automated image content analysis. Therefore, future research could be targeted towards comparing different machine learning algorithms and including the temporal component. The suggested methodological combination of machine learning and natural language processing algorithms advances the existing common methods of CES assessment based on passively crowdsourced photographs, and it is sufficiently robust to be applied across the regional, continental, and global scales. In turn, the test of GIS-based landscape organisation metrics in relation to CES use shows that they can also facilitate the prospects of rapid and reliable landscape visual quality assessment up to the global scale, which does not depend on local subjective landscape evaluations and complements regional landscape character assessment. Drawbacks of the approach are related to the representativeness of the social media data as a source of knowledge about CES use and also to limitations of the GIS and remote sensing applicability for physio-gnomic landscape research. Notwithstanding, we have demonstrated that the combined usage of LBSM data, automated image recognition, natural language processing, satellite imagery, and GIS data is highly relevant for evidence-based ecosystem management and nature protection.  randomly generated geolocations. The landscape coherence index was rescaled (0; 1) to meet the scale of colour harmony estimations in Results.