Our findings generate multifaceted implications based on the analyses of SMD and conventional survey data. The results directly provide design recommendations for public parks in China and serve as a pilot effort to compare SMD with conventional survey data in landscape research.
4.1. Understanding the Similarity between SMD and Survey Data
The high degree of similarity between the results drawn from SMD and survey data suggests that the two overlap greatly in reflecting the attractiveness of large urban parks, such as the Beijing Olympic Green, as assessed by daily users. This similarity confirms the capacity and great promise of SMD in assessing landscapes, as large volumes of data are available online that implicitly reveal users’ attitudes and emotions. The value of SMD in landscape assessment can go far beyond attractiveness, because texts and images represent a wide variety of social factors such as aesthetics, security, social interaction, health and satisfaction, as suggested by the concept of social sensing, which captures human factors by analogy to remote sensing [47].
Both data sources endorsed attractiveness theories, though our findings generated combined results from both landscape and previous tourism research. The external factors identified from our study are similar to those from tourism research [23,24,26] and the internal factors are in the same vein as previous landscape research [28,29,31]. This suggests that large parks are considered both destinations and enjoyable places. Design strategies to enhance the attractiveness of large parks should address both external and internal features.
Natural elements, such as water and plants, are the major draws for everyday use. In a big city, people come to parks mainly because of the natural atmosphere and the activities they can perform there. The importance of naturalness to attractiveness has been suggested by many previous studies [48,49], though Gobster [50] and Williams and Cary [51] reported no clear relationship, or even negative relationships, between naturalness and attractiveness. In the context of a high-density city, naturalness appears to be needed to balance the artificial landscape and make a park more attractive.
Two reasons can be traced for the lower ranking of cultural and artificial elements. First, intangible elements tend to be less attractive. Cultural metaphors that inspired the designs and won the design competition were not reported as attractive, because these ideas were not converted into site-scale environments that could enrich human experiences. Park users either consider cultural elements less important than their immersive experiences in the natural atmosphere, or do not perceive these elements as important at all.
Second, artificial elements are less likely to be considered attractive. These include not only structures and facilities but also artificial programs that use natural elements, such as artificial waterfalls and flower beds. Artificial elements require large financial inputs to build and frequent maintenance afterwards. Their relatively low attractiveness and high cost may inform park decision-makers about such programs. For a big urban park where citizens may spend longer hours, attractiveness is more closely associated with specific experiences and activities. Cultural metaphors may still be important in the design process. However, they can be more attractive to citizens if converted into usable spaces. Future park renovation can better represent Chinese culture by adding more tangible components such as theatrical plays, music concerts and culturally relevant activities in open-air spaces.
4.2. Understanding the Disparity between SMD and Survey Data
While many scholars suggest the merits of emerging SMD, especially the availability of huge volumes of samples, this study provides a detailed multifaceted view of SMD and survey data on the same study subject, attractiveness. While recognizing SMD can serve as a suitable source as well as an efficient process for landscape social evaluation in much the same way as other data sources [15], we may also see limits based on the disparities.
Table 7 summarizes fundamental differences between SMD and conventional survey data as identified by our study and others. Surveys can be considered a small data source, as a sample must exceed 1000 unique data points to be considered large. SMD can easily provide volumes of data exceeding 10,000 data points. However, surveys may contain more profound and detailed information, while SMD has much lower information resolution and high redundancy. Even if future “deep learning” can solve the technology, analysis and methodology challenges of using big data [14], two intrinsic characteristics of SMD that may convey conflicting demands and opinions in the “real world” [15] deserve further attention.
First, SMD is selective data associated with social media users’ emotions and pride. The massive sample size of SMD does not change this data source’s selective generation mechanism. Social media users prefer to post information that is more likely to be read by friends and web viewers. Hence, SMD only partially reflects the phenomenon it records and should never be deemed a perfect data source. In this study, the fact that sunflowers were reported with higher attractiveness than lawns in SMD suggests that people might prefer to share more exciting scenery. Although survey data rate lawns as more attractive, these “normal” scenes may not stimulate social media users to mention them in their posts. For the same reason, running, as extracted from SMD, is evaluated as a more attractive activity, while the survey data suggest that walking is the more attractive activity in the park. In this selective process, attractiveness is represented by the excitement of particular programs.
At the same time, the emotionally selective process of SMD may conceal important information. In this study, survey participants reported that free admission was a highly attractive factor for a park visit. However, it is barely mentioned in SMD, possibly because mentioning free admission in a public post might be considered slightly embarrassing. The degree of excitement acts as a filter through which social media users unconsciously select from their experiences before posting. Hence, more commonplace factors can be underrated when SMD is extracted and analyzed. The emotional selection process through social media can induce an exaggerated and aggregated bias in SMD.
Second, SMD is unstructured and volunteered data. Compared with a participant's responses to a questionnaire survey, the generation of SMD lacks a process of comparison and reasoning. SMD is extracted from posts freely uploaded by social media users. When posting, they do not necessarily compare one activity or program with others.
By contrast, in an organized and categorized format, survey participants are always asked to compare and synthesize the given items. For example, in evaluating the attractiveness of running and walking, survey participants realize that walking is an activity they participate in more frequently than running when the questionnaire presents the two items simultaneously. Without the comparison framework provided by a survey, however, social media users are more likely to create posts about running, which better shows off a healthy lifestyle. In this sense, the appearance of a term in SMD is “isolated” from other terms, because terms are not purposefully compared and weighted. When SMD is used as a data source, for example, the importance of running becomes meaningful by itself. In the analysis process, factors with different “isolated meanings” are forced to be rated equally attractive when frequency is defined to represent attractiveness. Social meanings of SMD are always inferred.
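The frequency-as-attractiveness assumption described above can be illustrated with a minimal sketch. The posts, the keyword list and the function name `keyword_frequency` are hypothetical and chosen only for illustration; they are not the data or the extraction procedure used in this study.

```python
from collections import Counter

# Hypothetical posts and keywords, for illustration only (not the study's data).
posts = [
    "went running in the park this morning",
    "running by the lake, sunflowers everywhere",
    "a quiet walking route around the lake",
]
keywords = ["running", "walking", "sunflowers", "lake"]


def keyword_frequency(posts, keywords):
    """Count how many posts mention each keyword (at most once per post)."""
    counts = Counter()
    for post in posts:
        tokens = set(post.lower().replace(",", " ").split())
        for kw in keywords:
            if kw in tokens:
                counts[kw] += 1
    return counts


freq = keyword_frequency(posts, keywords)

# Frequency stands in for attractiveness; ties are broken alphabetically
# only to make the ordering deterministic.
ranking = sorted(keywords, key=lambda kw: (-freq[kw], kw))
```

In this toy input, terms with equal counts receive identical scores even though nothing in the posts compares them with one another, which is the “isolated meaning” problem discussed above.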
When used in social perception research, SMD is a data source that can be extracted rather than collected. Unlike survey, observation and experiment data, researchers cannot predetermine the format, amount and richness of the data source. SMD features a large sample size, ease of acquisition and no need for interaction with human subjects, all of which suggest efficient data acquisition. However, disregarding the generation mechanisms and the resultant limits can lead to bias, or even errors, which is risky when trying to draw reliable conclusions. Since SMD is passive, the researcher cannot alter the method by which it is generated. A better understanding of its generation process and of the research assumptions about the appropriateness of SMD is urgently needed [52]. On the other hand, despite its thinness in meaning, thickness in number and lack of control from researchers, SMD can still provide a large number of profiles of a phenomenon, which broadens the scope of other data sources.
4.3. A Combined Evaluation Method
This study investigates the characteristics of SMD as used for academic research. While fully recognizing the opportunities SMD offers for understanding social phenomena with massive amounts of data, we examined possible limits of SMD by analyzing the disparities between results from SMD and survey data. It is worth noting that, despite all these intrinsic limits of SMD, there are many similarities between the findings from SMD and survey data, particularly in identifying keywords.
Inspired by the framework to integrate crowd-sourcing data in urban planning [53], this study recommends a landscape assessment framework that combines data sources in places with abundant SMD, based upon the strengths and weaknesses of SMD and survey data. This would enable the researcher to: (1) Utilize information from SMD to identify keywords; (2) Build up an assessment structure based on keywords from SMD; (3) Conduct basic landscape assessment based on SMD; (4) Apply conventional surveys for supplementary and detailed information, particularly regarding prosaic yet important factors. With appropriate techniques, the ease and cost-effectiveness of SMD can deliver relevant and useful information quickly. Supplementing this with a more time-consuming survey would address the lower information resolution, questionable reliability and data dependency of SMD. Both SMD and surveys are suitable and irreplaceable data sources that offer different user perspectives yet provide complementary information.
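One way step (4) of the framework above could be targeted is sketched below: factors that rank much higher in a survey than in SMD are flagged as candidates for the supplementary survey. This is a minimal sketch under stated assumptions. The frequencies, the survey scores, the function names and the `min_gap` threshold are all hypothetical and are not results or procedures from this study.

```python
def rank_positions(scores):
    """Map each factor to its 1-based rank, highest score first."""
    ordered = sorted(scores, key=lambda k: (-scores[k], k))
    return {factor: i + 1 for i, factor in enumerate(ordered)}


def underrepresented_in_smd(smd_freq, survey_score, min_gap=2):
    """Return factors ranked at least `min_gap` places higher in the survey than in SMD."""
    # Factors that never appear in SMD are assigned a frequency of 0.
    smd_full = {factor: smd_freq.get(factor, 0) for factor in survey_score}
    smd_rank = rank_positions(smd_full)
    survey_rank = rank_positions(survey_score)
    return sorted(
        factor
        for factor in survey_score
        if smd_rank[factor] - survey_rank[factor] >= min_gap
    )


# Hypothetical numbers, for illustration only.
smd_freq = {"sunflowers": 120, "running": 95, "lawn": 40, "walking": 30}
survey_score = {
    "free admission": 4.6,
    "walking": 4.4,
    "lawn": 4.1,
    "running": 3.5,
    "sunflowers": 3.2,
}

flagged = underrepresented_in_smd(smd_freq, survey_score)
```

With these made-up inputs, the flagged factors are the prosaic ones that social media users rarely post about, which is where the survey adds the most information.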
It is also worth mentioning that this article compared the two data sources based on a population aged 19–35. This group constitutes the major users of social media in China. In the combined evaluation method, the absence of other population groups from SMD cannot be compensated for even by a traditional survey. Whenever this type of combined evaluation method is used, the limited representativeness of SMD in terms of user population should be noted.
4.4. Future Research
Though our research is carefully designed, the conclusions are still subject to some limitations that merit further research attention. The first challenge is the selection of appropriate coverage of social media including better representation of stakeholders and primary issues [15]. This study only investigates available and accessible social media networks. There are other social networks, such as Sina blog, not used in this research due to technical challenges in retrieving related data. In the future, comparative studies of SMD from different sources are urgently needed.
Second, the methodology used in analyzing SMD may also need to be further refined. This study assumes that texts with positive emotional inflections are indicators of attractiveness, which may hold most of the time but introduces some bias because SMD is unlabeled. For instance, the area surrounding the Bird’s Nest and the Water Cube is mentioned frequently and its mentions are associated with a positive mood. Yet it is difficult to tell whether people mention these landmarks because they are attractive or simply use them as a location reference. A stronger unsupervised selection technique is needed to analyze these unlabeled, unstructured and inherently linked datasets online [54]. This study focuses on positive feedback only, yet negative responses may provide much more instructive information about on-site problems and users’ unsatisfied desires, and thus outline needed changes in renovation or restoration. Negative emotions and descriptions deserve more scrutiny in future research.
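The ambiguity between praise and location reference can be seen in a minimal sketch of a naive positive-mention count. The word list, the English example posts and the function name `positive_mentions` are hypothetical stand-ins for illustration; they are not the lexicon, data or sentiment method used in this study.

```python
# Hypothetical positive-word lexicon, for illustration only.
POSITIVE_WORDS = {"beautiful", "lovely", "great", "relaxing"}


def positive_mentions(posts, term):
    """Count posts that mention `term` and contain at least one positive word."""
    count = 0
    for post in posts:
        text = post.lower()
        words = set(text.replace(",", " ").replace(".", " ").split())
        if term.lower() in text and words & POSITIVE_WORDS:
            count += 1
    return count


# Hypothetical posts: the first praises the landmark itself,
# the second only uses it as a location reference.
posts = [
    "The Water Cube looks beautiful at night.",
    "Lovely picnic on the lawn, just north of the Water Cube.",
]

n_water_cube = positive_mentions(posts, "Water Cube")
n_lawn = positive_mentions(posts, "lawn")
```

Both posts count toward the landmark here, although only the first one describes it as attractive. A co-occurrence count of this kind cannot separate the two cases without additional labeling or syntactic analysis.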
Last but not least, due to the challenge of extracting data, only textual data is used in this study. Some may argue that visual information is more representative in recording the perceptions of social media users. Moreover, visual information conveys not only the program names but also the means, angles and ways in which the programs are used. Therefore, more advanced extraction tools need to be developed in the future to satisfy this need.