Spatial Diversity of Tourism in the Countries of the European Union

Abstract: The aim of the article is to present the spatial diversity of tourism in the countries of the European Union (EU). The main objective can be divided into three immediate goals, each of which is to identify countries that are similar in terms of: (1) accommodation base; (2) tourism traffic; and (3) tourism-related expenditures and revenues. In order to group countries, Ward's cluster analysis method is used. The aim is verified with the use of 2017 United Nations World Tourism Organization (UNWTO) and Eurostat data. The analysis covers all EU member states. The research conducted confirms, inter alia, the key role of the accommodation base in the development of tourism in those countries.


Introduction
The issue of spatial diversity is a subject matter that often draws the attention of economists studying markets. This diversity is often analyzed in regions, macro-regions and countries. The spatial diversity of markets is one of the matters used to assess interrelations and spatial development by the identification of developed and underdeveloped areas. This also applies to the tourism market.
Travel and tourism make up the largest service industry in the world and it is continuing to grow. This industry stimulates Gross Domestic Product (GDP) growth in host countries and contributes substantially to government tax revenues [1]. Worth USD 7.6 trillion, the travel and tourism sector accounts for more than 10% of global GDP and represents 7% of all international trade and 30% of the world's export of services. Tourism receipts provide an important source of foreign exchange for countries around the world, enabling economic growth and investment in a multitude of other sectors. In 2016, tourism grew by 3.1%, outperforming the global economy's growth of 2.5% [2,3].
Tourism is the third largest socio-economic activity in the European Union (EU), and it makes an important contribution to the EU's gross national product and to employment [4]. Within the global sector, however, Europe is not the fastest-growing region and its market share, in terms of international tourist arrivals and receipts, is shrinking [5]. Europe is nevertheless ranked as the world's number one destination for international arrivals, with 713 million arrivals in 2018, over half the global total and 6% more than a year earlier. Early indications are that 2019 saw further growth, although at more modest levels than 2018. Tourism creates a surplus for the EU's economy, with international tourism receipts exceeding EU residents' spending on international tourism by USD 27 billion in 2016 [5,6].
Tourism businesses in the EU are confronted with a number of changes in tourists' profiles and behavior. Demographically, tourists in the EU are older than in previous decades. Geographically, a growing number of tourists travelling to the EU come from emerging countries, although the EU's source markets still provide the biggest share of tourists [5,7].
Tourism has an important territorial dimension, with uneven spatial distribution between and within countries, and it delivers localized impacts. The importance of the spatial dimension of tourism is also underscored by findings indicating that tourism growth in one region positively influences tourism in neighboring regions [8], or that public policy can impact the spatial patterns of tourism demand [9]. Therefore, an interesting research issue is the recognition of the spatial diversity of tourism in EU countries. This information may be useful for tour operators, owners of facilities and tourist attractions, and above all for EU and national government leaders. The conducted research will indicate areas with the weakest results, which may constitute valuable information for developing tourism in these countries.
The aim of the article is to present the spatial diversity of tourism in the countries of the EU by means of cluster analysis with the use of Ward's method. Ward's method was selected because of its tendency to produce more evenly sized clusters. Most other methods tend to produce one large cluster and numerous much smaller ones, which is less useful for studying spatial diversity in tourism [10]. As part of the main goal, the authors looked for answers to the following questions:
• What is the spatial diversity of the accommodation base in EU countries?
• What is the spatial diversity of tourist traffic in EU countries?
• What is the spatial diversity of expenditures and revenues in EU countries?
• Are countries with similarly developed accommodation facilities characterized by similar tourist traffic?
The article first presents the theoretical premises for the spatial diversity of tourism, followed by the research part containing the spatial analysis. The introduction sets out the initial issues of the article, the theoretical background and the research gaps. Section 2 presents an in-depth literature review on how to use data sets to generate tourist statistics, especially spatial relationships. Section 3 discusses the materials and methods. Section 4 deals with the results of the cluster analysis, and is divided into three parts: spatial diversity of the accommodation base, spatial diversity of tourism traffic and spatial diversity of tourism expenditures and revenues. The final part of the article concerns the discussion and applications.

Literature Review
Tourism is perceived as a spatial phenomenon that has a great impact on society and various sectors of the national economy, inter alia, the construction industry, transport and trade [11]. One of the elements of tourism development is the accommodation base. There are many studies covering analysis of the accommodation base of individual cities or small regions [12–14]. However, there is little research on the spatial diversity of the accommodation base across EU countries. Among the related studies, Batista e Silva et al. [9] created a map of tourism capacity in the group of 28 EU countries (EU-28) in 2017 using data from Booking.com and TripAdvisor. They also analyzed tourist density changes in selected months of the year in the EU-28 at a city level. Navrátil et al. [15] assessed the impact of various characteristics of the geographic space on the location of tourist accommodation facilities. According to them, hotels create spatial clusters situated mainly in urbanized areas. Hostels are strictly related to towns, and camps and resorts are situated primarily near water resources in warmer areas. Accommodation facilities are considered to be a core source for the sustainable competitiveness of a destination, and the lack of an accommodation base "acts as a constraint on overnight visitor numbers" [16]. Building up the accommodation capacity is one of the essential parts of the process of planning tourism development for destinations [17]. The location of hotels constitutes part of the development of the regions [18] and also influences tourism traffic [19].
Tourist traffic means the spatial movement of people, which is connected with voluntary and temporary changes of residence, environment and the rhythm of life. Within tourist traffic, one can distinguish leisure, sightseeing and specialist tourism [20]. In areas where tourist traffic is developing, in order to satisfy its needs, the natural environment is changed, transport infrastructure is built, and accommodation and catering bases are created [21,22]. As a result, tourism can become a positive and valuable element of the spatial order but can also contribute to the degradation of a given area's natural and cultural environment [23–25]. Shoval et al. [19] concluded that hotel location has a profound impact on tourist movements, with a large share of the total tourist time budget spent in the immediate vicinity of the hotel. Further, the study illustrated the impact of geomorphic barriers on tourist movements. Some research has examined the importance of location in hotel site selection, especially for urban destinations [18,26,27]. This has resulted in the creation of several models of hotel location [28,29].
Table 1 presents a list of the selected publications that are subject to analysis in order to identify methods of using databases to generate tourist statistics, and in particular indicators measuring spatial diversity dependencies and the use of cluster analyses. There is a lot of tourism research, but as Xiao and Smith [30] have pointed out, one of the major limitations of research in tourism is that, in most cases, it is concerned with a single case, location, nationality, etc. Examples of such research can be found in the work of, e.g., Soybali [31], Raun et al. [32], Peng et al. [33], Del Vecchio et al. [34] and Guilarte and Quintans [35].
The scientific publications presented in Table 1 indicate that work is focused on the use of databases in order to develop methods and tools demonstrating the spatial diversity of tourism. The authors use a number of variables to show the spatial diversity of tourism. Some authors define the specialization of tourist regions using indicator and taxonomic methods. Papulova et al. [36] analyzed the economic relations between sub-regions in a coastal area of Greece, the spatial concentration of economic activities and the socio-economic characteristics of communities, placing emphasis on the correlation between employment in the tourism sector and other economic activities. These authors consider the geographical allocation of tourist facilities to be a widely applied proxy for measuring spatial fluctuations in the tourist industry. This matters because the tourist base is one of the most significant and measurable elements of a tourist product, and data on its geographical allocation are informative about tourism and its spatial structure. Borzyszkowski et al. [37] carried out an analysis of the spatial diversity of the tourist function development based on the values of Defert's tourist function index (DTFI), which is one of the basic indicators used in tourism geography. The analysis showed considerable differences between the communes in the region examined. This confirms the assumption that the highest tourist function development is typical of seaside communes. DTFI compares the number of tourist beds available in a destination to the total number of residents or hosts in the region (in this article, variable X2). Gawroński et al. [40] presented an evaluation of the spatial diversity of tourism attractiveness. They believe that an assessment can be made based on the analysis of statistical data carried out using taxonomic methods (zeroed unitarization and the Wrocław taxonomic method).
Świstak and Świątkowska [39] presented an analysis of the spatial diversity of accommodation facilities in Poland and their use by tourists. The scope of the work included the presentation of the resources and structure of the tourist accommodation base and a general spatial analysis of this base on the basis of selected indicators. In the study, the authors used, among others, Defert's tourist function index and Charvat's accommodation density indicator, expressed as the number of tourist beds per 1 km² of land (in this paper, the variables X3 and X4).
There are also studies on the spatial diversity of tourism using cluster analysis. Lascu et al. [14] compared the level of tourism in the 17 major regions of Spain and identified the key natural, cultural, and dual attractions using a two-step cluster analysis to ascertain the relative importance of the three types of attractions. Rodriguez and Sanchez [41] claimed that the techniques provided by spatial analysis have become a great ally of tourist planning as they allow exhaustive territorial analyses to be carried out. Their study uses these techniques to examine the degree of equilibrium in the distribution of places and their level of occupancy in a region. Other authors perform cluster analysis with Ward's method. Navarro Chavez et al. [38] used cluster analysis to examine 14 competitive tourism factors for 20 member countries of the Asia-Pacific Economic Cooperation (APEC). Kolvekova et al. [42] grouped 54 regions of Central and Eastern Europe (Czech Republic, Slovakia, Hungary, Poland, Estonia, Lithuania, Latvia, Slovenia, Romania and Bulgaria) into clusters according to selected tourism indicators which Eurostat uses to evaluate tourism. The authors studied the capacity and occupancy of collective tourist accommodation using mainly absolute numerical data (except nights spent by residents and non-residents per thousand inhabitants and nights spent by residents and non-residents per km²). In the case of spatial analysis, this approach may give inconclusive results, because the population and area of regions differ. Therefore, all of the variables used in our article are indicators rather than absolute values. The literature offers much information about research using cluster analysis in tourism. The extensive use of Ward's method in tourism is summarized and discussed in Dwyer et al. [43].
Summing up, what is important from the point of view of the aim of the article is the issue of dependencies in the spatial diversity of tourism. The number of articles on the issue is still small (in particular ones providing a comparative analysis of countries). The issue discussed in the article is a new one and has not been fully recognized from the research point of view so far. The absence of an EU-wide study should be considered a research gap. Dividing the analysis into accommodation base, tourist traffic and economic factors is a novelty. This will allow the identification of specific tourist differences between EU countries.

Materials and Methods
Cluster analysis is a group of multivariate techniques whose primary purpose is to group objects based on the characteristics they possess. The resulting clusters should exhibit high internal (within-cluster) homogeneity and high external (between-cluster) heterogeneity. Cluster analysis has been used in every research setting imaginable. It can classify different objects: individual people; markets, including the market structure; and analyses of the similarities and differences among new products or countries [10]. Therefore, cluster analysis can be used in tourism research and to show the spatial diversity between countries. The spatial diversity of tourism was verified based on the cluster analysis with the use of Ward's method. It is one of the agglomerative hierarchical clustering methods and is based on the classical criterion of the sum of squares [44]. The division should be carried out in such a way that objects of one group (class) are as similar as possible and those of different classes as different as possible. The measures of similarities or differences are based on the distance between the units [45]. The starting point in this method is the matrix $D$ of Euclidean distances $d_{ij}$ between classified objects:

$$D = [d_{ij}] = \begin{bmatrix} 0 & d_{12} & \cdots & d_{1n} \\ d_{21} & 0 & \cdots & d_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ d_{n1} & d_{n2} & \cdots & 0 \end{bmatrix}$$

The algorithm procedure is as follows [10,46]:
(1) Each object $O_i$ ($i = 1, 2, \ldots, n$) is treated as a one-element group $A_i$;
(2) The minimum value is found in the distance matrix: $d_{pq} = \min\{d_{ij}\}$;
(3) The groups $A_p$ and $A_q$ (initially the one-element groups containing objects $O_p$ and $O_q$) are combined into one group $A_r$: $A_r = A_p \cup A_q$;
(4) The distance $d_{ir}$ of the newly formed group $A_r$ from all other groups $A_i$ is determined;
(5) Steps 2-4 are repeated until all objects form one group.
The general formula for the conversion of the distance matrix while combining groups $A_p$ and $A_q$ into the new group $A_r$ for hierarchical agglomerative methods based on the principle of the central agglomerative procedure takes the following form:

$$d_{ir} = a_p d_{ip} + a_q d_{iq} + b\, d_{pq} + c\, |d_{ip} - d_{iq}|$$

where: $d_{ir}$ is the distance between groups $A_i$ and $A_r$; $d_{ip}$ is the distance between groups $A_i$ and $A_p$; $d_{iq}$ is the distance between groups $A_i$ and $A_q$; $d_{pq}$ is the distance between groups $A_p$ and $A_q$; and $a_p$, $a_q$, $b$, $c$ are the transformation parameters.
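The agglomerative steps and the distance update can be illustrated with a minimal Python sketch. It assumes the commonly used Ward parameters for the general update rule d_ir = a_p·d_ip + a_q·d_iq + b·d_pq + c·|d_ip − d_iq|, namely a_p = (n_i + n_p)/(n_i + n_p + n_q), a_q = (n_i + n_q)/(n_i + n_p + n_q), b = −n_i/(n_i + n_p + n_q) and c = 0, applied to squared Euclidean distances. The function name `ward_clusters` and the sample points are hypothetical; the article's own calculations were made in SAS 9.4, so this is not the authors' implementation.

```python
from itertools import combinations


def ward_clusters(points, n_clusters):
    """Merge groups with Ward's distance update until n_clusters remain."""
    # Step 1: every object starts as its own one-element group.
    members = {i: [i] for i in range(len(points))}
    # Squared Euclidean distances between all pairs of groups (assumption:
    # Ward's update is applied to squared distances).
    dist = {
        frozenset((i, j)): sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
        for i, j in combinations(range(len(points)), 2)
    }
    next_id = len(points)
    while len(members) > max(n_clusters, 1):
        # Step 2: find the closest pair of groups (p, q).
        pair = min(dist, key=dist.get)
        p, q = sorted(pair)
        d_pq = dist.pop(pair)
        n_p, n_q = len(members[p]), len(members[q])
        # Step 4: distance from the merged group r to every other group i.
        for i in members:
            if i in (p, q):
                continue
            n_i = len(members[i])
            total = n_i + n_p + n_q
            d_ip = dist.pop(frozenset((i, p)))
            d_iq = dist.pop(frozenset((i, q)))
            dist[frozenset((i, next_id))] = (
                (n_i + n_p) / total * d_ip
                + (n_i + n_q) / total * d_iq
                - n_i / total * d_pq
            )
        # Step 3: replace p and q by the merged group r.
        members[next_id] = members.pop(p) + members.pop(q)
        next_id += 1
    # Step 5 is the loop itself; it stops early at n_clusters groups.
    return sorted(sorted(group) for group in members.values())


# Hypothetical two-dimensional objects (not data from the article).
sample = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
print(ward_clusters(sample, 3))  # [[0, 1], [2, 3], [4]]
```

Stopping the loop at a requested number of groups, rather than continuing until one group remains, corresponds to cutting the full hierarchy at a chosen level.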
Thus, Ward's method consists of combining those clusters $A_p$ and $A_q$ whose merger ensures the minimum sum of squares of the distance from the center of gravity of the new cluster they create. Ward's method aims to obtain rather small clusters and is believed to be very efficient. In this study, it ascertained the optimal classification better than the other methods considered (minimum, maximum and mean). To choose the number of classes, the Cubic Clustering Criterion (CCC) [47] and Pseudo F [48] were used. All the calculations were made with the SAS 9.4 software.
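As an illustration of how one of these criteria works, the sketch below computes a pseudo F statistic in its usual form, the ratio of between-cluster to within-cluster sums of squares, each divided by its degrees of freedom: F = (B / (k − 1)) / (W / (n − k)). This is assumed to match the Pseudo F reported by SAS; the CCC is not reproduced here. The function name `pseudo_f` and the sample values are hypothetical.

```python
def pseudo_f(points, labels):
    """Pseudo F = (between SS / (k - 1)) / (within SS / (n - k))."""
    n = len(points)
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    k = len(groups)
    dims = range(len(points[0]))
    grand = [sum(p[d] for p in points) / n for d in dims]
    between = 0.0
    within = 0.0
    for group in groups.values():
        centroid = [sum(p[d] for p in group) / len(group) for d in dims]
        between += len(group) * sum((centroid[d] - grand[d]) ** 2 for d in dims)
        within += sum((p[d] - centroid[d]) ** 2 for p in group for d in dims)
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical one-dimensional data with two obvious groups.
values = [(0.0,), (2.0,), (10.0,), (12.0,)]
good = pseudo_f(values, [0, 0, 1, 1])  # compact, well-separated clusters
poor = pseudo_f(values, [0, 1, 1, 1])  # a less natural split
print(good, poor)
```

Larger values indicate more compact and better separated clusters, so candidate numbers of clusters can be compared by looking for a high value of the statistic.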
In cluster analysis, the classification is usually required to be complete, disjunctive and non-empty. Completeness means that every object belongs to a class. Disjunction means that it belongs to only one class. Non-emptiness requires that each class should contain at least one object. A problem in cluster analysis may arise when ensuring completeness if the examined set contains distinct units that are dissimilar to the others [46].
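These three requirements can be checked mechanically. The following minimal sketch tests a proposed classification against them; the function name `is_valid_partition` and the country codes are illustrative only and do not reproduce the clusters reported in the article.

```python
def is_valid_partition(objects, clusters):
    """Check completeness, disjunction and non-emptiness of a classification."""
    # Non-emptiness: each class contains at least one object.
    non_empty = all(len(cluster) > 0 for cluster in clusters)
    assigned = [obj for cluster in clusters for obj in cluster]
    # Disjunction: no object appears in more than one class.
    disjoint = len(assigned) == len(set(assigned))
    # Completeness: every object belongs to some class.
    complete = set(assigned) == set(objects)
    return complete and disjoint and non_empty


# Hypothetical example with four country codes.
countries = ["AT", "CY", "EL", "MT"]
print(is_valid_partition(countries, [["AT", "CY"], ["EL"], ["MT"]]))
print(is_valid_partition(countries, [["AT", "CY"], ["EL"]]))
```

In the second call one object is left unassigned, so the classification is not complete; placing such an object in its own one-element class restores completeness.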
The simplest solution is the creation of one-element classes, which can in fact be interpreted as the specific exclusion of such objects (countries). However, such a situation may result in the erroneous classification of the remaining objects. This is why the article thoroughly analyzes the examined variables first, and then, when the distinct objects are recognized, they are eliminated in the course of clustering countries and treated as separate classes.
In order to verify the spatial diversity of tourism, the authors based the analysis on secondary, non-public data from the United Nations World Tourism Organization (UNWTO) [49] and public data from Eurostat [50]. The 2017 data were purposefully chosen because this was the latest year for which full data were available at the time of writing. The analysis covers all the EU member states as of 2018.
Seven variables were chosen to analyze the spatial diversity of tourism in the EU. These variables are as follows:
• X1: Average length of stay;
• X2: Available capacity (beds per 1000 inhabitants);
• X3: Accommodation for visitors (per 1000 km²);
• X4: Accommodation in hotels and similar establishments (per 1000 km²);
• X5: Overnight visitors (tourists) (per 1000 inhabitants);
• X6: Tourism expenditure over GDP (%);
• X7: Tourism receipts over GDP (%).
The descriptive statistics of variables in the EU countries can be found in Table 2.
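To show why relative indicators are preferred over absolute counts, the sketch below converts a raw capacity figure into per-population and per-area indicators of the kind listed above. The function names and all numbers are hypothetical, and the exact definitions used by UNWTO and Eurostat may differ in detail.

```python
def per_1000_inhabitants(count: float, population: float) -> float:
    """Scale a raw count by population (cf. variables X2 and X5)."""
    return 1000.0 * count / population


def per_1000_km2(count: float, area_km2: float) -> float:
    """Scale a raw count by land area (cf. variables X3 and X4)."""
    return 1000.0 * count / area_km2


# Hypothetical country: 50,000 beds, 2,000,000 inhabitants, 25,000 km2.
beds_per_capita = per_1000_inhabitants(50_000, 2_000_000)
beds_per_area = per_1000_km2(50_000, 25_000)
print(beds_per_capita, beds_per_area)  # 25.0 2000.0
```

Two countries with the same number of beds but different populations or areas obtain different indicator values, which is what makes countries of different sizes comparable in the clustering.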
The variables were selected on purpose so that it is possible to compare elements of the accommodation base, tourist traffic and economic factors. It should be mentioned that, apart from the substantive criterion, the choice of the variables also resulted from their low mutual correlation (correlation coefficient below 0.8) (Table 3). This supports the reliability and validity of the variables in the cluster analysis. In addition, all variables were standardized. There are two main benefits from standardization. First, it is much easier to compare variables because they are on the same scale. Second, no difference occurs in the standardized values when only the scale changes. Thus, using standardized variables eliminates the effects due to scale differences across variables and for the same variable as well [10].
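Both preprocessing steps can be sketched in a few lines of Python. The example standardizes a variable to z-scores, shows that changing only the scale leaves the standardized values unchanged, and checks a pairwise correlation against the 0.8 threshold. It assumes standardization with the sample standard deviation (divisor n − 1); the article does not state which variant was used, and all values below are hypothetical.

```python
from math import sqrt


def standardize(values):
    """Return z-scores using the sample standard deviation (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    std = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / std for v in values]


def pearson(x, y):
    """Pearson correlation computed from standardized values."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


# Hypothetical indicator values for five countries (not from the article).
stay_days = [1.9, 2.6, 3.4, 5.6, 5.9]
stay_hours = [24 * d for d in stay_days]  # same variable on another scale
other = [3.0, 1.0, 4.0, 2.0, 5.0]

# Standardized values are identical regardless of the measurement scale.
same_scores = all(
    abs(a - b) < 1e-9
    for a, b in zip(standardize(stay_days), standardize(stay_hours))
)
r = pearson(stay_days, other)
keep_both = abs(r) < 0.8  # threshold quoted in the text
print(same_scores, round(r, 3), keep_both)
```

A pair of variables whose absolute correlation reached the threshold would carry largely the same information, so one of them could be dropped before clustering.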

Spatial Diversity of the Accommodation Base
Accommodation facilities are basic elements of the material-technical base of tourism, since they facilitate the visitors' stay at a destination and constitute a basis for further development of the destination [35]. Figure 1 presents the outcomes of clustering the EU countries with regard to the level of similarity of the accommodation base. The use of Ward's method resulted in the differentiation of five groups of countries that are most similar in terms of accommodation base infrastructure. The first one contains Austria, Cyprus and Greece. This cluster can be labeled as "Very well developed accommodation base". In these countries at that time there was a very large number of beds (amounting to about 70 places per 1000 inhabitants) and the accommodation was mainly in hotels and similar establishments (Table 4).
The second cluster can be labeled as "Average developed accommodation base". The cluster is composed of Belgium, the Czech Republic, Germany, Luxembourg, the Netherlands, Portugal and the United Kingdom, where the number of beds per 1000 inhabitants ranged between 15.4 and 31.4. In addition, in these countries there was quite a high availability of accommodation in hotels and similar establishments (per 1000 km²), which on average was 100.
The third cluster contains Bulgaria, Croatia, Denmark, Estonia, Finland, France, Ireland, Latvia, Lithuania, Poland, Romania, Slovakia, Slovenia and Sweden. This cluster can be labeled as "Less developed accommodation base". In these countries, the availability of beds per 1000 inhabitants was at a similar level to the second cluster. However, the availability of accommodation in hotels and similar establishments (per 1000 km²) was lower; on average it was 16 places with a maximum of 30.
The fourth cluster consists of three countries: Hungary, Italy and Spain, in which the accommodation base is well developed. In these countries, the availability of accommodation for visitors (per 1000 km²) was 330 on average. The last cluster consists solely of Malta, in which the number of beds was as much as 92 per 1000 inhabitants and the main accommodation was in hotels: 733 places per 1000 km².

Spatial Diversity of Tourism Traffic
Figure 2 presents the grouping of EU countries in terms of tourist traffic. The use of Ward's method resulted in the distinction between three groups of countries. The first one contains Austria, Croatia, Cyprus and Malta. These countries were characterized by high tourist traffic and can be labeled as "Long-term travels". The average length of stay was 5.9 days and the number of overnight tourists per 1000 inhabitants was 3.5 persons (Table 5). The second cluster was made up of Belgium, the Czech Republic, Estonia, Finland, Germany, Hungary, Ireland, Latvia, Lithuania, Luxembourg, the Netherlands, Romania, Slovakia and Slovenia. This cluster can be labeled as "Short-term travels". In these countries the average length of stay was the lowest and ranged from 1.9 to 2.6 days. The same was true for the number of overnight tourists per 1000 inhabitants, which averaged 1.3.
The remaining EU countries formed a third cluster, which we described as "Long-term but dispersed tourist traffic". The number of overnight tourists per 1000 inhabitants was also low, as in the case of the second cluster. However, in these countries the average length of stay was 5.6 days.

Spatial Diversity of Tourism Expenditures and Revenues
The importance of tourism in creating the GDP of a given country is an interesting research issue. Figure 3 presents the grouping of EU countries in terms of tourism-related revenues and expenditure. The use of Ward's method resulted in the distinction between five groups of countries. The first cluster, which we described as "An important role of tourism in the country's GDP", comprises Austria, Bulgaria, Greece, Hungary, Portugal, Slovenia and Spain. The group included countries that were characterized by high revenues from tourism, on average 6.2% of GDP (Table 6), while expenditures in these countries were at 2% of GDP.
The second cluster was made up of the following countries: Belgium, Denmark, Estonia, Luxembourg and Sweden. This cluster can be labeled as "An average role of tourism in the country's GDP". In these countries, revenues and expenditures as part of GDP were at a similar level, on average about 4%.
The third cluster was made up of Croatia and Malta, where tourism revenues as part of GDP were among the largest and tourism expenditure over GDP was at a low level. This cluster can be labeled as "Tourist countries".
A separate cluster was created by Cyprus, in which revenues accounted for 14.1% and expenditures for 6% of GDP. The other EU countries formed the fifth cluster. In these countries, both revenue and expenditure for tourist purposes accounted for about 2% of GDP. This cluster can be labeled as "Non-tourism countries".
The groups of countries obtained in the research can be linked to research on measuring the tourism efficiency of European countries by using Data Envelopment Analysis [51]. According to this research, the third and fourth cluster countries were considered effective. In addition, effective countries include: Estonia, Finland, France, Greece, Hungary, Ireland, Latvia, Luxembourg, Poland, Portugal and Spain. Other studies confirm the relations between the income generated by the inhabitants and the competitiveness of tourist destinations [24,25,52].

Table 6. Cluster descriptive characteristics of tourism expenditure over GDP (%) and tourism receipts over GDP (%).

Discussion and Conclusions
The tourism sector is one of the largest and fastest growing industries in the world. By generating employment, export revenues, investments and infrastructure developments, the tourism sector contributes substantially, both directly and indirectly, to socio-economic development. The issues concerning the accommodation base, tourist traffic, and expenditures and revenues analyzed in the work cover the main aspects of tourism development. Other authors also pay attention to conducting separate analyses of individual elements of tourism [42]. It should be noted that the accommodation base is the most important element of tourism management and is of key significance for tourist traffic. First of all, it serves to satisfy the need to sleep, rest and reside. Confirmation of the importance of accommodation facilities can be seen in the fact that hotel services belong to the basic services satisfying tourists' material needs connected with the change of the place of residence and travel as well as a series of other needs of travelers [53]. Therefore, without appropriate tourism management, the main element of which is the accommodation base, there are no grounds for the development or even the occurrence of the tourist traffic that determines revenues from tourism; this is confirmed in the present research.
The development of tourism is a consequence of complex natural factors, forms of spatial organization, and the effects of human activities [54]. One of the development directions is the use of tourism attractiveness to build competitive advantage and attract tourists [55]. In many regions, tourism has become the sole or key determinant of income, as well as of economic and social changes. Countries such as Malta, Croatia and Cyprus, which often form separate clusters in the spatial diversity analyses herein, are examples of this. It is worth noting that the accommodation base in EU countries is more spatially diverse (five clusters) than tourist traffic (three clusters). This may indicate that other factors that have not been studied in this paper also have an impact on tourist traffic. The identification of areas attractive for possible tourism is based on the assessment of the occurrence of, inter alia, the tourism attractions that constitute the aim of tourists' arrivals, and the tourism infrastructure that makes it possible to use those attractions [14,40].
Data on clusters, as presented in this paper, can be used in the effective planning and decision-making for a destination [56] to support sustainable tourism development in a specific country. The data can also be used as a relevant base for potential future cooperation between various countries from one cluster to support tourism competitiveness and sustainable development [57,58]. In some countries, tourism has a good opportunity for joint development, e.g., by introducing a joint offer for tourists. Especially if the countries are located close to each other, there is a chance for tourists to use common tourist assets, such as tourist routes.
Moreover, the results presented in the article may be of application significance, both in scientific and practical terms. They can be useful for:
• National authorities: to develop a strategy for tourism development in the country;
• Universities, research institutes and scientists: comparison of obtained results; implementation of projects on tourism development in EU countries; looking for dependencies in the spatial development of regions;
• Organizations (e.g., national, regional and local tourist organizations, tourist associations, tourist clusters) and institutions (e.g., the Ministry of Tourism): use of the results during trainings, courses and scientific conferences on the development of tourism and its spatial conditions; comparing elements of tourism in different countries;
• Tourist service providers (e.g., hotels, hostels, guesthouses): defining perspectives for tourism development and tourist traffic (e.g., in areas of strong tourist competition).
Tourism is considered to be an activity that perfectly expresses spatial interaction [36]. Tourism has a heavy impact on local development [59]. Based on the clustering results, the following can be determined:
• The spatial diversity of the accommodation base may indicate countries in which overuse of resources (e.g., laundry, electricity or cosmetics) can be expected (countries from clusters 1 and 4, and Malta). Other authors also pay attention to this [42];
• The spatial diversity of tourist traffic may indicate an increase in the use of tourist and associated infrastructure, such as the transport infrastructure in individual countries [60]. This can also show the level of the impact on sustainable tourism development. On the other hand, countries with more tourist traffic should have this infrastructure more developed (countries from the clusters "Long-term travels" and "Long-term but dispersed tourist traffic");
• The analysis of the spatial differentiation of revenues and expenditure indicated countries specializing in tourism, which at the time of crises or epidemics may suffer very large losses in the budgets of the state and of inhabitants (objects from the cluster "Tourist countries").
The spatial diversity of tourism in terms of the factors analyzed in the article has a great impact on the presentation of tourism development in a given country. Spatial diversity is fundamental for characterizing and carrying out research into tourism in a given area [11]. However, some limitations of the study should be acknowledged:
• The more countries or regions the research covers, the more differentiated it is likely to be and the more necessary it becomes to obtain detailed and comparable spatial-temporal data concerning tourism [9]. This is why the presented issue should be recognized as very broad, with research into it not being fully exhausted. The more limited the statistical data available from official European sources on tourism are in terms of spatial and temporal resolution, the more this curbs potential analyses and applications relevant for tourism management and policy.
• Use of more tourism indicators in the cluster analysis may result in more accurate outputs. However, in some cases, it may lead to changes within clusters and/or in the number of clusters. The limitation in this case is also the lack of comparable data for all countries.
The literature on the subject lacks studies on the spatial diversity of tourism including the variables used (accommodation, tourism, expenditures and revenues from tourism). There is also a lack of spatial studies containing other factors such as tourist seasonality or the recently fashionable issue of innovation in tourism [61–63]. Dynamic changes are taking place in tourism, so it is worth continuing to study the issue and conducting similar research, e.g., covering spatial relationships at lower territorial units (NUTS 2: basic regions for the application of regional policies, or NUTS 3: small regions for specific diagnoses).