Human Thermal Conditions and North Europeans’ Web Searching Behavior (Google Trends) on Mediterranean Touristic Destinations

This paper examines the relationship between outdoor thermal conditions in northern European countries and the frequency with which their residents search the web for summer holidays and touristic destinations. While previous studies have examined the impacts of biometeorological conditions by comparing statistics from police and hospital archives, this paper focuses on the Big Data gathered by Google and delivered by the Google Trends service as a new and promising type of data. The correlation of those two factors (thermal conditions and search frequency) indicates a connection between atmospheric thermal conditions and changes in human behaviour and desires. The analysis and visualization of time series longer than a decade, retrieved for five countries, reveal the anticipated seasonal covariance and a striking impact of thermal conditions on the search behaviour of individuals. Additionally, the paper introduces a new field: the combined use of web search activity and atmospheric data.


Introduction
The global economy, trade and other human activities are affected by the behaviour and psychology of individuals [1][2][3]. Knowing the factors that shape perception, behaviour and psychology could make a positive reinforcement of the economic cycle possible [2,4,5]. Human behaviour is affected by a wide set of factors, among them social and environmental conditions. Behaviour is frequently expressed as wishes, desires and preferences, which drive everyday choices and actions. Scientists have already described the influence of environmental conditions, especially weather, on psychology and behaviour via conscious and subconscious functions [6][7][8][9]. The essential datasets for analysing the influence of weather on human life are meteorological data, along with archives of medical registries, psychiatric hospital databases and police station records [9][10][11][12][13][14][15][16][17], or structured questionnaires completed by individuals in open public spaces [18][19][20]. The major drawback of those datasets is that they are fragmented and more or less biased by several known and unknown factors [9,21,22]; in the case of structured questionnaires, these include the unknown physical, mental and psychological condition of the interviewees and their level of acclimatization [7,23].
An essential part of individuals' lifestyle (as a part of their behaviour) is the choice of summer vacation destination. Tourism, especially summer tourism, can be described by the Triple S (Sun, Sea and Sand), which is an abundant feature of the Mediterranean region [24,25]. Many of the major tourism factors behind the Triple S depend on weather and local climate. Tourists' decisions about a destination can be supported by specific indices such as, among others, the Tourism Climate Index (TCI) developed by Mieczkowski [26], the Climate Index for Tourism [27] and the Climate-Tourism/Transfer-Information-Scheme [25,28,29].
Further, new tools such as Decision Support Systems (DSS) take distinct criteria into account in the decision-making process for a preliminary rating of destinations [30]. The term DSS usually refers to demand-oriented systems such as destination management or consumer-oriented travel-counselling systems [31]. A DSS is usually built to support the solution of a certain problem or to evaluate an opportunity, through the design of computer models and the simulation of real-life experiences [32]. DSSs continue to improve the quality of decisions by standardizing the process and logic of information managers' choices and by making the criteria for determining appropriate outcomes systematic [33].
To investigate the relation between outdoor thermal conditions and individuals' desires related to summer holidays, we examined the search frequency of specific keywords, as retrieved from the Google Trends service, together with the values of the human thermal index Physiological Equivalent Temperature (PET) over the same time period. The summer holiday terms, hereafter keywords, were a set of famous summer holiday destinations on the Mediterranean Sea and in northern coastal European regions, accompanied by some very common words linked with summer. The results indicate a clearly positive relation between the search frequency of the keywords and the PET values.

Data and Methodology
The calculations and analyses drew on two types of datasets: the atmospheric conditions dataset and the dataset of simultaneous web search records. To evaluate the outdoor thermal conditions to which individuals were exposed when they searched for the specific keywords, we used the METAR (Meteorological Terminal Air Report) data of the Helsinki, Berlin, Dublin, Oslo, Stockholm and London areas. The biometeorological conditions were assessed with the well-known human thermal index Physiological Equivalent Temperature (PET) [25,34,35,36]. This index considers the human energy balance and can adequately describe human thermal sensation under the recorded meteorological conditions. Additionally, the PET concept has no climatic restrictions, so the index is applicable under hot, cold, warm or dry atmospheric conditions alike [37][38][39]. The free, high-performance biometeorological software RayMan was used to estimate the PET values for this study. The main advantages of RayMan, apart from its wide acceptance in the biometeorological community, are the rapid calculation process, the user-friendly interface and the accuracy and efficiency of the results [40][41][42][43][44]. The inputs to RayMan for the calculation of PET were air temperature, humidity, wind speed, cloud cover and the geographic characteristics of the meteorological station location. The maximum and minimum PET values were estimated on a daily basis, and the mean PETmax and PETmin were then calculated on a weekly basis to obtain data comparable with Google Trends.
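The weekly aggregation described above can be sketched as follows. This is a minimal illustration and not the code used in the study: the function names and the sample values are hypothetical, and the assumption that weeks run from Sunday to Saturday is inferred only from the start date of the series (4 January 2004 was a Sunday).

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def week_start(day: date) -> date:
    """Return the Sunday that opens the week containing `day` (assumed convention)."""
    # date.weekday() gives Monday=0 ... Sunday=6, so a Sunday needs an offset of 0.
    return day - timedelta(days=(day.weekday() + 1) % 7)


def weekly_means(daily):
    """Map each week start to (mean PETmax, mean PETmin) of its daily records.

    `daily` is an iterable of (date, pet_max, pet_min) tuples.
    """
    buckets = defaultdict(list)
    for day, pet_max, pet_min in daily:
        buckets[week_start(day)].append((pet_max, pet_min))
    return {
        wk: (mean(v[0] for v in vals), mean(v[1] for v in vals))
        for wk, vals in sorted(buckets.items())
    }


# Hypothetical daily PET values in degrees Celsius, for illustration only.
daily = [
    (date(2004, 1, 4) + timedelta(days=i), 2.0 + i, -6.0 + i)
    for i in range(14)
]
weekly = weekly_means(daily)
for wk, (pet_max, pet_min) in weekly.items():
    print(wk.isoformat(), pet_max, pet_min)
```

Averaging the daily extremes, rather than taking the weekly extreme, follows the description in the text; other aggregations would need a different reducer in `weekly_means`.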
Internet search engines such as Google, Yahoo, etc., are part of the contemporary way to seek and find goods and/or services that fulfil our desires. The search behaviour of individuals was recorded by Google, the most popular search engine among internet users worldwide (according to statista.com [45] and Wikipedia.com [46]), which justifies our choice of Google Trends as the source of query data for this study. Google Trends returns the usage volume of a particular search term (keyword) for a specific region of the world over a defined period. Search-term hits are recorded at the spatial resolution of individual cities on a second administrative level area and at the temporal resolution of a week or month [47]. According to Google [48], "Search results are proportionate to the time and location of a query: Each data point is divided by the total searches of the geography and time range it represents, to compare relative popularity. Otherwise places with the most search volume would always be ranked highest. The resulting numbers are then scaled on a range of 0 to 100 based on a topic's proportion to all searches on all topics. Different regions that show the same number of searches for a term will not always have the same total search volumes". A query in Google Trends first returns a world map of the search-term hits per country and a monthly time series of the search-term hits dating back to 2004. By default, the results returned by Google Trends are rescaled by dividing the search-term hits obtained for a given week by the maximum number of hits obtained at any moment over the period of interest. Query results are accessed by logging into a Google account and downloading a csv file of the data. Manually downloading the many files generated by entering separate search-term queries is cumbersome [47]. Hence, we used an R package called "rgtrends", which allows rapid retrieval of Google Trends data and easier data manipulation.
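The rescaling described above can be illustrated with a short sketch. It assumes that raw weekly counts are available, which is not the case for Google Trends users (only the scaled values are published), so `term_hits`, `total_hits` and `relative_interest` are hypothetical names, and the rounding step is an assumption.

```python
def relative_interest(term_hits, total_hits):
    """Scale per-week search shares so that the busiest week equals 100."""
    # Step 1: divide each data point by the total searches of its week.
    shares = [t / n for t, n in zip(term_hits, total_hits)]
    # Step 2: rescale by the maximum share over the period of interest.
    peak = max(shares)
    if peak == 0:
        return [0 for _ in shares]
    return [round(100 * s / peak) for s in shares]


# Hypothetical raw counts for four weeks, for illustration only.
term_hits = [50, 120, 300, 90]
total_hits = [10_000, 12_000, 15_000, 9_000]
interest = relative_interest(term_hits, total_hits)
print(interest)
```

Because the values are relative to the peak within the queried period and region, series retrieved for different countries or periods are not directly comparable in absolute terms.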
The countries selected for this study of search behaviour are Finland, Germany, Ireland, Norway, Sweden and the United Kingdom. The selection criteria are the touristic preferences of their residents and their use of English as a common language in Google search keywords. The dataset retrieved from Google Trends covers the period from 4 January 2004 to 21 May 2016.
The first keyword group that was investigated consists of "Crete", "Ibiza", "Majorca", "Dubrovnik" and "Bodrum", representing famous Mediterranean summer touristic destinations. The second keyword group consists of northern summer destinations: "Normandy", "Galicia" and "Donostia/San Sebastián". The third keyword group consists of five popular keywords obviously connected to the summer period: "beach", "sun", "weather", "holiday" and "sea".
The estimation of the PET index from the METAR weather data and the retrieval of Google Trends frequencies led to a statistical analysis of the time series in pairs (the search frequency of a keyword along with the simultaneous PET values). The correlation between the thermal conditions (represented by PETmax and PETmin on a weekly basis) and the search behaviour was examined to draw conclusions about the impact of the thermal environment on search behaviour.
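A pairwise analysis of this kind can be sketched as below. The text does not state which correlation coefficient was used; Pearson's r is assumed here purely for illustration, and the series values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long series (assumed measure)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical weekly series, for illustration only.
pet_max = [1.0, 4.0, 9.0, 15.0, 22.0, 27.0]   # weekly mean PETmax, degrees Celsius
searches = {
    "Crete": [10, 14, 30, 41, 77, 100],       # Google Trends values, 0 to 100
    "beach": [22, 20, 35, 60, 81, 100],
}

# One coefficient per (PET series, keyword) pair.
coefficients = {kw: pearson_r(pet_max, hits) for kw, hits in searches.items()}
for kw, r in coefficients.items():
    print(kw, round(r, 3))
```

Pearson's r is unchanged by linear rescaling, so it gives the same value for raw and for linearly normalized series.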

Results and Discussion
The analysis of Google Trends along with the PET data reveals a striking influence of the weather conditions on the search frequency of specific popular Mediterranean destinations and related keywords. Apart from the obvious seasonality, the time series (Figures 1 and 2) demonstrate a covariance between the normalized PET values and the normalized search frequency of the keywords. The intra-annual variation of the presented pairs is typical of all 130 examined pairs. The most interesting characteristic is that the patterns of the lines show similarities during all seasons, not only during the summer. Especially during spring (the rising side of the bell curve for each year), the fluctuation of PETmax is accompanied by an analogous fluctuation of the keyword search frequency.
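One way to separate the shared seasonal cycle from the remaining covariance is to subtract a mean annual cycle from each series and compare the residuals. The sketch below is an assumption about how such a check could be done, not a description of the analysis in this paper; `anomalies` is a hypothetical helper, and a period of 52 weeks only approximates the length of a year.

```python
from collections import defaultdict


def anomalies(values, period=52):
    """Subtract the mean value of each within-year position from the series."""
    groups = defaultdict(list)
    for i, v in enumerate(values):
        groups[i % period].append(v)
    # Mean annual cycle: one average per position within the period.
    cycle = {pos: sum(vals) / len(vals) for pos, vals in groups.items()}
    return [v - cycle[i % period] for i, v in enumerate(values)]


# Hypothetical series: two "years" of four steps each, for illustration only.
series = [1, 9, 22, 10, -1, 11, 18, 10]
residual = anomalies(series, period=4)
print(residual)
```

If the residuals of a PET series and of a keyword series remain correlated, the relation is not explained by the common annual cycle alone.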
Regarding the calculated correlation coefficients (Table 1), the higher values are recorded between the normalized PETmax time series and the Mediterranean destinations. Considering the correlation matrix from the perspective of countries, the United Kingdom and Germany record the higher coefficients between PET values and the search frequency of the Mediterranean destinations. In contrast, the northern touristic destinations (Normandy, Galicia and Donostia/San Sebastián) are fairly correlated with the PETmax or PETmin values. The third keyword group (beach, sun, weather, holiday, sea) shows a mix of high (Germany PETmax vs. beach) and low (Norway PETmin vs. holiday) correlation coefficients. The normalized PETmax time series record higher correlation coefficients than PETmin for every keyword. The strength of the correlations differs considerably between PETmax and PETmin, which can most likely be attributed to the meaning of the keywords, which are clearly connected with warm thermal conditions.
Another interesting characteristic is that there is no time lag between the two lines. This indicates that the effect of a rising PET value on the search frequency is almost simultaneous. The selection of Google data gave the research some unique advantages: with this web service, individuals are able to search without any intervention, instructions or prompts, so the recorded responses and reactions are most likely spontaneous [47,49,50,51,52,53,54,55].
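The absence of a time lag noted above could be checked by correlating the two series at several weekly offsets and looking for the offset with the highest coefficient. The sketch below is an illustrative assumption, not the procedure used in this study: it uses Pearson's r, hypothetical function names and made-up data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def lagged_r(pet, hits, lag):
    """Correlate pet[t] with hits[t + lag]; a positive lag means searches follow PET."""
    if lag >= 0:
        return pearson_r(pet[:len(pet) - lag], hits[lag:])
    return pearson_r(pet[-lag:], hits[:len(hits) + lag])


def best_lag(pet, hits, max_lag=4):
    """Return the lag (in weeks) with the highest correlation in [-max_lag, max_lag]."""
    return max(range(-max_lag, max_lag + 1), key=lambda k: lagged_r(pet, hits, k))


# Hypothetical data: searches that copy the PET series two weeks later.
pet = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]
delayed_hits = [0, 0] + pet[:-2]
print(best_lag(pet, delayed_hits), best_lag(pet, pet))
```

With weekly data, a result of zero only shows that any delay is shorter than one week; it cannot resolve a response within days.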
Often, no other person is present to influence the searched terms (keywords), in contrast to conventional studies with structured questionnaires. The search usually takes place under the conditions that the user prefers and at the time each one chooses to use the Google search engine. Contemporary technology, such as smartphones and tablets, allows searching at any time and in any place [56]. Moreover, the search procedure often discloses subconscious mental activity, which may eventually be revealing of mood and psychological state. Finally, the spatial and temporal resolution is very high and the cost of usage is practically zero. All the aforementioned characteristics give Google Trends a considerable advantage as a research tool.
The limitations of this study are related to the characteristics of the data. The temporal resolution of Google Trends is weekly or monthly, so the influence of outdoor thermal conditions on search behaviour has to be investigated on a weekly rather than a daily basis. Moreover, METAR observations are made at airports, which are frequently located far from the urban fabric where the population is dense. The presented results are part of a preliminary study, the beginning of a forthcoming in-depth analysis that will include higher spatial resolution and more countries.

Conclusions
Outdoor thermal conditions, as quantified by the PET index, are correlated with the search frequency of keywords on summer tourism and destinations. Specifically, the analysis reveals that the PETmax values are more closely connected with the summer-related keywords than the PETmin values. Additionally, this study highlights the potential of this Big Data dataset (Google Trends) for biometeorological research as a connection between atmospheric conditions and the subsequent human decision-making and behaviour. Some striking conclusions are the following:
- According to the correlation coefficients, PETmax is more influential than PETmin on searches for keywords related to summer holidays.
- As the correlation coefficients and the graphs reveal, there is an obvious relationship between the thermal conditions and the search behaviour, but the strength of this relationship is not equal across all locations and keywords.
- The graphs and the statistical tests indicate that the correlation is not only a function of seasonality but a relation between thermal conditions and search behaviour.
- According to the analysis so far, there is no time lag between the prevalence of the thermal conditions and the searching of the keywords.
- The link between thermal comfort conditions and searching for the specific keywords is evidence of behavioural change due to biometeorological conditions.

Figure 1. The normalized Physiological Equivalent Temperature (PET) maximum values and the normalized searching frequency of the keyword "Crete" for the population of the United Kingdom from 4 January 2004 to 21 May 2016.

Figure 2. The normalized PET maximum values and the normalized searching frequency of the keyword "beach" for the population of Germany from 4 January 2004 to 21 May 2016.

Table 1. Correlation coefficients from the comparison of normalized Physiological Equivalent Temperature (PET)max and PETmin values and the selected keywords. Red colour indicates higher values and green colour lower values. * Donostia/San Sebastián.