Influence of Service Valuation and Package Cost on Market Segmentation: The Case of Online Demand for Spanish and Andorra Ski Resorts

Abstract: Ski resorts are important tourist resources in mountain areas. They have a high impact on the environment but also on the development of the territory. Corporate social responsibility management plays an important role in influencing consumer purchasing behavior. This research seeks to understand the behavior of ski tourists and to classify them. Approximately 50,000 online purchases of tourist packages to ski resorts in Spain and Andorra are analyzed in order to describe the different segments of demand in these resorts through a latent class model. The tourists’ age and previous experience, the type of accommodation, and the season, among other variables, are considered to clarify the different classes. Six different segments were found. Some relevant results for the online ski-package market are highlighted, such as the influence of skier’s expenditure on service valuation. Managerial implications, limitations of this study and recommendations for future research are also discussed.


Introduction
Corporate social responsibility (CSR) is a phenomenon that has multiple definitions and is applied in the specific literature on the tourism sector. According to the European Commission, CSR is a "concept whereby firms decide voluntarily to contribute to a better society and a cleaner environment" and its implementation is possible by integrating "social and environmental aspects into business operations and their interaction with the stakeholders" [1] (p. 6).
Tourism is a service that involves offering a pleasant experience, transportation, comfort and entertainment [2]. Tourism companies are responsible for the environment of the territories in which they operate. Good management in the tourism industry involves not only taking into account the tastes and preferences of consumers, but also ecological and sustainable development [2,3]. CSR initiatives are part of the sustainability practices adopted by companies in tourism in general and mountain tourism in particular [4].
Recent research applied to the hospitality industry has suggested that CSR practice can improve the performance of companies in different ways: consumer preferences and attitudes [5,6], employee well-being [7,8], and corporate financial matters [9][10][11] in restaurants, heritage sites, hotels, and also in ski resorts.
Skiing is a tourist activity based on the exploitation of a natural resource that requires snow of suitable quantity and quality as well as a stable and favorable climate to offer skiers baseline conditions of visibility and security [12][13][14]. It is a type of tourist activity that can presently be considered as being in a growth phase in Spain and Andorra. However, the   [22]. Several authors have identified mountain areas as regions that are especially vulnerable to the effects of climate change [23,24]. A slight increase in temperature and a decrease in the average thickness of the snow layer have been detected in the last decade. This has led to a scarcity of snow at low altitudes, and ultimately to a decrease in the number of skiable days, as well as to a progressive delay of the opening of the ski season [25,26]. One of the most important adaptive measures to counter this trend is the large-scale production of artificial snow. However, this measure entails higher maintenance costs and lower profitability, thus negatively impacting the bottom line of ski resort managers [27]. Moreover, artificial snow production also involves environmental externalities which affect the local flora and fauna, and require special consideration in a future scenario of greater water scarcity [14,25,28]. Ecological considerations are thus important in themselves, since counteracting measures can cause further ecological damage, while at the same time jeopardizing the economic profitability of the ski resorts. For this reason, in order to optimize the ski resorts' performance and to ensure their ecological and economic sustainability, it is important to better understand the behavior of visitors to these ski resorts.
The geographic and other natural conditions obviously vary greatly from country to country, and it is beyond Spain's capacity to attract the same volume of skiers as the big four skiing countries mentioned earlier [29]. In spite of these differences and of the effects of climate change, the number of ski resorts and skiers in Spain has increased consistently in previous seasons, and there is still room for growth [21]. Therefore, it is key to study who these skiers are and how they behave, both at the time of booking and during their actual holiday, in order to design a suitable product for each type of consumer [30].
The demand of winter tourism depends on several factors, such as skier income, flat-rate prices, transportation and other costs, timing of Easter holidays, and climate change [31]. The decision to travel to a ski station in winter involves both internal aspects, such as individual characteristics and personal motivations, as well as external aspects, including the characteristics of the destination [32]. In particular, the traveler wishes to carry out a series of activities in order to enjoy certain experiences and benefits. The range of available activities, therefore, affects the choice of destination. Travelers' motivations and the destination's perceived capacity to satisfy expectations are two strategic variables in destination marketing [33]. Therefore, knowing and classifying the motivations of different segments of visitors to ski resorts is key to managing this type of installation.
Traditional segmentation research does not usually consider the effect of satisfaction or expenditures on the probability of class membership [34]. However, satisfaction and expenditures are relevant features to help businesses (in this case, ski resorts) design marketing strategies and focus organizational efforts on the segments where they could achieve the best possible commercial results. Concretely, concentrating on consumer groups that present a better fit with the company's goals can lead to higher economic profitability. At the same time, such a segmentation framework can help in the design of public policies which guarantee a sustainable environmental management without abandoning any of the services offered by the tourism sector [35]. It can also be useful to stimulate rural and mountain economies with a high dependence on one specific industry, or prone to structural problems such as depopulation or low economic profitability [36].
The purpose of this study is to analyze skier behavior, and to classify it into distinct segments so that the ski tourism industry can optimally address the demand. Several studies have widely analyzed skier behavior in countries like the United States [37][38][39], Canada [40,41], Finland [42], Greece [43], Australia [44] or the Alps [30,45,46]. However, to the best of our knowledge, no previous studies have examined ski tourist segmentation in Spain or Andorra. We thus intend to fill this gap with the present study. Two characteristics of the ski markets in Spain and Andorra that should be kept in mind are the following. First, as mentioned above, the average price of a ski pass in Spain lies well below the European average. Second, the mean length of stay is also shorter than the European average [20]. In Andorra, for example, the average length of stay is slightly below 4 nights.
To sum up, the present study focuses on a segmentation of skiers visiting Spanish and Andorran ski resorts. We will see that most of these visitors are themselves Spaniards. The variables taken into account consist of behavioral variables as well as trip characteristics, while service valuation and package cost are used as segmentation covariates. Although Spain and Andorra are not among the most visited European ski destinations, they are relevant not only because of their own intrinsic interest, but also as a paradigmatic example of a second category of skiing destination countries below the European top-4, including Spain, but also other countries such as Poland, Slovenia or the Czech Republic. These countries are characterized by lower prices, shorter lengths of stay, and a larger proportion of domestic visitors, and have been scantily studied so far [21].
This paper is organized as follows. The present section has introduced the importance of CSR with respect to companies' performance, explained the importance of snow tourism, and discussed relevant elements of sustainability in ski resorts. The following section presents a literature review about market segmentation in skiing tourism. The methodology and the data used are described in Section 3. Section 4 is devoted to a presentation of the main results. In Section 5, the results obtained are discussed in relation to previous studies in the literature. Finally, Section 6 contains conclusions and plans for future research.

Literature Review
Within the context of any positioning strategy, a first crucial step consists in determining a profitable target market by classifying the potential customers. This classification involves segmenting a large market into separate groups that may require different marketing strategies [47]. This process is thus called market segmentation, and it allows a company to identify key consumer groups and to adapt its marketing strategies to the needs of each group [48].
Within the tourism industry, a wide variety of tourist behaviors as well as a great diversity of choice motivations can be found. These include the search for unique experiences, the influence of environmental issues, service flexibility, innovation, or the quest for high-quality products [49]. Research has shown that ski tourists spend more than the average tourist, seek new experiences and trends, and are interested in discovering destination authenticity through the practice of snow sports [50]. Market segmentation thus becomes particularly relevant in ski tourism marketing, both in the academic field and in practical business management. Segmentation can ultimately contribute to a more efficient allocation of resources, through a selection of strategic target groups and a determination of the most appropriate products, prices, distribution, and communication policies [51].
The authors of [46] find that consumer satisfaction in the context of skiing is positively related to cross-selling, word-of-mouth recommendation, "upgrading", reduced price sensitivity, and intention to repurchase. They further mention that customer satisfaction generates loyalty, and determine three factors of influence in this relationship: lifestyle, level of spending in the ski center, and ability to ski. They identify two different types of loyalty. "False" loyalty is consistent buying behavior due to a lack of alternatives, whereas "true" loyalty essentially consists in a favorable attitude towards the seller. True loyalty implies that the consumer is likely to recommend the product or destination to third parties. In this sense, repurchase intentions and word-of-mouth recommendations are crucial aspects that must be promoted by improving consumer satisfaction. Influential elements that affect consumer satisfaction, according to [52], include the number of tracks, flat-rate prices, snow quality, or the distance to the resort.
Different segmentation approaches seek to understand the reasons why consumers choose one winter-sports destination over another. Some approaches in the literature attempt to establish a segmentation based on the attributes that characterize a particular destination [40,42,53,54,55]. Items of interest include the type of activity available at the destination, such as alpine skiing or cross-country skiing, as well as the difficulty and variety of slopes, but also the offer of social activities or spa services. Such attributes of the destination have, for instance, been used in [42] to perform a segmentation of Finnish ski resort visitors. Six different customer segments were identified: passive tourists, cross-country skiers, want-it-all, all-but-downhill skiing, sports seekers, and relaxation seekers.
Other approaches study motivations, needs, and consumer characteristics [46,56,57,58]. As an example of this latter approach, skiers visiting Greek resorts have been classified based on the restrictions experienced by these visitors in [59,60]. In [45], the attributes prioritized by different tourist segments when choosing a winter sports tourism destination were identified, and the degree of satisfaction with the services provided by the ski resort was analyzed. Six groups of consumers were found, based on the following five factors and the consumers' degree of satisfaction with respect to them: accommodation, restaurants and social life; resort facilities and services; track quality; number of tracks; and proximity, access, and price. For many ski resorts, the number of skiers who are likely to spend the night is in fact relatively small compared to the number of single-day skiers. Hence, visiting frequency can also be used as a tool to segment ski resort customers [61].
To sum up, most studies about snow tourist behavior have focused either on the destination and the services it offers, or on psychographic characteristics and other aspects related to the consumers themselves and their trip [62]. As a final example that combines both strategies, in [41], recreational skiers with a tendency to travel for skiing were studied. The factor analysis based on service characteristics (offers, entertainment, quality, service, difficulty, and variety) as well as on lifestyle characteristics led to the emergence of seven different skier profiles: the beginning skier; the party looking for après-ski entertainment; the extreme skier; the demanding professional; the family veteran; the passionate skier; and the beginner/saver. Regardless of the type of data and the research strategy, it is recommended that each ski resort identifies the relevant market segments that it wishes to target, in order to focus on achieving customer satisfaction and therefore a high level of loyalty [46].
The present research focuses on consumers visiting ski resorts in Spain and Andorra. While we do not study psychographic variables as such, we instead have a large sample of behavioral data and trip characteristics of actual consumers who have booked and consumed a ski package through the specialized tour operator esquiades.com. These are combined with the tourists' valuation of the services provided by the destination, as well as the expenditures incurred. All these elements are combined into a latent class analysis in order to discover the existing customer segments in these ski resorts.

Method
The current analysis is based on a latent class model (LCM) methodology implemented in R [63] using the poLCA package [64]. LCM analysis seeks to classify the measured values of observed or manifest categorical variables through an unobserved or latent variable. The different values of this latent variable thus correspond to latent classes, which produce conditional expectations for the values of the manifest variables. Such models have been used for market segmentation in tourism by several authors [65][66][67]. We briefly summarize the central idea of the LCM methodology here, and refer to Appendix A for a mathematical synopsis and to [64] for a more detailed introduction.
As just described, the central idea in LCM is to define an additional numerical variable which represents the different classes, groups or segments. This variable is called latent since it is not obtained directly from the observations. However, it is connected to the observed variables, which are typically categorical, through a statistical optimization procedure. This procedure consists in, first, assigning a numerical value to each possible answer or category for each of the questions or observation variables.
Second, the number of classes should be determined. This is a somewhat delicate operation, since increasing the number of classes would in principle always fit the data better. However, an excessive number of classes leads to an increase in computational complexity and, more importantly, to a loss of explanatory power. A balance should therefore be struck between goodness of fit (higher number of classes) on the one hand, and computational simplicity (lower number of classes) and the avoidance of over-fitting on the other, while always keeping explanatory power as a main objective. Two widely used criteria to execute this step are the Akaike information criterion (AIC) [68] and the Bayesian information criterion (BIC) [69].
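As a minimal illustration of how these two criteria trade goodness of fit against model size, the sketch below computes both from a log-likelihood and a parameter count, using their standard definitions. It is written in Python rather than R, and the log-likelihoods, parameter counts and sample size are made-up numbers, not values from this study.

```python
import math

def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical fits: number of classes -> (log-likelihood, number of parameters).
fits = {2: (-1050.0, 9), 3: (-1000.0, 14), 4: (-996.0, 19)}
n_obs = 500  # hypothetical sample size

best_by_aic = min(fits, key=lambda r: aic(*fits[r]))
best_by_bic = min(fits, key=lambda r: bic(*fits[r], n_obs))
```

In this toy comparison the fourth class improves the log-likelihood only slightly, so the penalty term outweighs the gain under both criteria.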
The third step is the main LCM step, namely the determination of the actual distribution of the classes. This consists in an iterative maximization procedure of the likelihood function which embodies the goodness of fit of the overall model with respect to the original, complete set of data. Note that this step does not imply assigning individual respondents or measurements to concrete classes. Rather, an overall pattern of association is created between the observed characteristics, thus determining the probability of each category within the observed characteristics to belong to each of the classes. In other words, the output of this step gives the composition of each class in terms of the proportions within that class of each answer or category to each of the observation variables.
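The iterative maximization can be sketched with a generic expectation-maximization (EM) loop for a basic latent class model. This is an illustrative pure-Python sketch under simplifying assumptions (complete data, no covariates, a fixed number of iterations); it is not the poLCA implementation, and the function name `fit_lcm` and the toy data are hypothetical.

```python
import math
import random

def fit_lcm(data, n_levels, n_classes, n_iter=200, seed=0):
    """Fit a basic latent class model by EM.

    data      : list of observations, each a tuple of 0-based category indices
    n_levels  : number of categories of each manifest variable
    Returns (class shares, conditional response probabilities,
             log-likelihood evaluated in the last E-step).
    """
    rng = random.Random(seed)
    n_vars = len(n_levels)
    shares = [1.0 / n_classes] * n_classes
    # Random starting values for P(variable j takes level k | class r).
    probs = []
    for _ in range(n_classes):
        per_class = []
        for k in n_levels:
            raw = [rng.random() + 0.1 for _ in range(k)]
            total = sum(raw)
            per_class.append([v / total for v in raw])
        probs.append(per_class)

    log_lik = float("-inf")
    for _ in range(n_iter):
        # E-step: posterior class probabilities for every observation.
        posteriors, log_lik = [], 0.0
        for obs in data:
            joint = [
                shares[r] * math.prod(probs[r][j][obs[j]] for j in range(n_vars))
                for r in range(n_classes)
            ]
            total = sum(joint)
            log_lik += math.log(total)
            posteriors.append([v / total for v in joint])
        # M-step: update class shares and conditional probabilities.
        for r in range(n_classes):
            weight = sum(post[r] for post in posteriors)
            shares[r] = weight / len(data)
            for j in range(n_vars):
                for k in range(n_levels[j]):
                    hits = sum(
                        post[r] for post, obs in zip(posteriors, data) if obs[j] == k
                    )
                    probs[r][j][k] = hits / weight
    return shares, probs, log_lik

# Toy data: three binary variables with two dominant response patterns.
toy = [(0, 0, 0)] * 40 + [(1, 1, 1)] * 40 + [(0, 0, 1)] * 5 + [(1, 1, 0)] * 5
shares, probs, log_lik = fit_lcm(toy, n_levels=[2, 2, 2], n_classes=2)
```

The returned `probs[r][j]` is the distribution of the categories of variable `j` within class `r`, which corresponds to the class composition described above; individual observations are never hard-assigned to a class.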
Finally, the previous model can be extended into a latent class regression model (LCRM) by the inclusion of a number of external factors or so-called covariates. These covariates do not participate in the determination of the classes, but serve as predictors for the class membership probability of each sample element. Concretely, the values of these covariates for a given sample element permit us to calculate the probability that the sample element in question belongs to each of the classes, without knowing the actual answers of the sample element to the original categorical variables that determined the classes in the first place.
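One common way to link covariates to class membership, and the one assumed in this sketch, is a multinomial logit in which one class serves as reference. The Python function below maps the two covariates used later in the paper (daily expenditure and service valuation), plus an optional cross-effect, to membership probabilities. The coefficient values are purely hypothetical and are not estimates from this study.

```python
import math

def class_membership_probs(ddet, sv, coefs):
    """Prior class probabilities given the two covariates.

    coefs holds one (intercept, b_ddet, b_sv, b_cross) tuple per class
    other than the reference class, whose coefficients are fixed at zero.
    """
    linear = [0.0]  # reference class
    for b0, b_ddet, b_sv, b_cross in coefs:
        linear.append(b0 + b_ddet * ddet + b_sv * sv + b_cross * ddet * sv)
    top = max(linear)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in linear]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical coefficients for a 3-class model (not estimates from the paper).
example_coefs = [(-1.0, 0.02, 0.10, 0.0), (0.5, -0.03, 0.05, 0.001)]
probs = class_membership_probs(ddet=60.0, sv=8.0, coefs=example_coefs)
```

Evaluating such a function over a grid of expenditure and valuation values yields membership curves of the kind used to interpret the covariate effects.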

Data
The data consist of 50,706 observations obtained from esquiades.com, an online company specialized in snow tourism that offers tourist packages to ski resorts in Andorra, the Sierra Nevada, the (Spanish and French) Pyrenees and the French Alps. Data were provided in July 2019, in Excel format, and there was no need to pre-treat them. The online company is popular mainly in Spain and Andorra. Apart from the Spanish and Catalan versions, it also has English and French versions, with a dedicated customer service phone number in the UK and in France. The website focuses on accommodation (hotel and tourist apartments), with or without ski pass. The time range of the data is from December 2013 to November 2018. Various types of data are recorded about each purchase on this website, including the (real) cost of the travel package. All data have been collected by the company in accordance with the data protection laws of the European Union (EU).
A 3-dimensional summary of the data is given in Figure 2, where each point represents a monthly mean. One aspect which stands out from this figure is that the number of travelers per group and the length of stay tend to be larger during the winter months than in the rest of the year. Table 1 gives an overview of the manifest variables that have been studied, the different levels into which the possible answers have been grouped, together with their description and the number of observations within each level. Note that the age indicated is of the person actually making the booking through the platform. In the accommodation category, "apartment keys" represents a Spanish level of classification of apartment services and luxury, similar to hotel stars. The other descriptions in the table are meant to be self-explanatory.

Covariates
The esquiades.com database also includes information about the valuation of the service and the cost per package booked. This allows the definition of two additional variables, the service valuation (SV) and the deflated daily expenditure per tourist (DDET). These will be associated to the obtained segments after these segments have been established, in order to predict class membership. The service valuation is recorded on a numerical scale from 0 to 10. The cost of the purchased packages was recorded continuously over a period of 4 years. These costs have been deflated according to the hotel price index (HPI) to allow comparison. The monthly HPI index was obtained from the Spanish National Institute of Statistics [70] with 2008 as base year. The DDET was thus obtained as:
$$\mathrm{DDET} = \frac{\mathrm{DET}}{\mathrm{HPI}} \times 100$$
with DET the (undeflated) daily expenditure per tourist. Both covariates are summarized in Table 2.
The evolution of the SV and DDET is decomposed into 3 additive terms (trend, seasonal and random part):
$$X_i = T_i + S_i + R_i$$
Here, $X_i$ represents the observed values in the $i$th month, $T_i$ the trend, $S_i$ the seasonal part, and $R_i$ the random part. This time series decomposition is shown in Figure 3.
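As a small worked example of the deflation step, the Python sketch below divides a nominal daily expenditure by the price index relative to its base value. It assumes that the index is expressed with the base year equal to 100, which is the usual convention but is not stated explicitly in the text; the numbers are invented.

```python
def deflate(det: float, hpi: float, base: float = 100.0) -> float:
    """Deflated daily expenditure: nominal value divided by (index / base)."""
    return det * base / hpi

# Hypothetical month: 66 euros per tourist and day, with the index at 110.
ddet = deflate(66.0, 110.0)
```

With an index 10% above its base level, a nominal 66 euros corresponds to 60 euros in base-year prices.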
Note that the DDET shows a pronounced positive trend, with an increase of roughly 20 euros over the considered period, and a strong seasonal character, with higher expenditure in winter. The positive trend is genuine, since the values plotted here are already deflated for comparison. The seasonal character, on the other hand, is unsurprising since most packages in winter include a ski pass, whereas in the remainder of the year, only accommodation is purchased.
The service valuation shows no pronounced trend, and although it has a seasonal component, this is small in magnitude (±0.2 on a scale of 10) and, therefore, not very significant.
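The additive split into trend, seasonal and random parts can be illustrated with a classical moving-average decomposition. The pure-Python sketch below follows that general idea for monthly data; it is an assumption-laden illustration and not a reproduction of the R routine used by the authors, and the toy series is synthetic.

```python
def decompose_additive(series, period=12):
    """Classical additive decomposition X_i = T_i + S_i + R_i.

    Returns (trend, seasonal, remainder); trend and remainder are None
    where the centred moving average is undefined (both ends of the series).
    The series should cover at least two full periods.
    """
    n, half = len(series), period // 2
    trend = [None] * n
    for i in range(half, n - half):
        window = series[i - half:i + half + 1]
        if period % 2 == 0:
            # Even period: the two end points get half weight each.
            total = sum(window) - 0.5 * (window[0] + window[-1])
        else:
            total = sum(window)
        trend[i] = total / period
    # Average detrended value for each position in the cycle.
    means = []
    for pos in range(period):
        vals = [series[i] - trend[i] for i in range(pos, n, period)
                if trend[i] is not None]
        means.append(sum(vals) / len(vals))
    centre = sum(means) / period  # force the seasonal part to average zero
    seasonal = [means[i % period] - centre for i in range(n)]
    remainder = [None if trend[i] is None else series[i] - trend[i] - seasonal[i]
                 for i in range(n)]
    return trend, seasonal, remainder

# Synthetic 5-year monthly series: linear trend plus a zero-mean seasonal pattern.
pattern = [10, 8, 5, 0, -4, -6, -7, -6, -4, -2, 1, 5]
series = [50 + 0.5 * i + pattern[i % 12] for i in range(60)]
trend, seasonal, remainder = decompose_additive(series)
```

For this noise-free series the recovered trend is the straight line and the seasonal part reproduces the pattern, so the remainder is essentially zero wherever the trend is defined.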
In R, the plot3D package, the decompose function and the corrplot package are used to plot the variables, decompose the temporal patterns and display the correlation matrix, respectively. This correlation matrix is shown in Figure 4. The main feature of this matrix is that all the correlation coefficients are quite low (see the color scale on the right-hand side of Figure 4). In particular, none of the manifest variables are strongly correlated. More surprisingly, the DDET and SV also do not present a high correlation with features that might have seemed strongly related, such as accommodation type or country of origin. To sum up, all the variables, including the covariates, can be interpreted as approximately independent.
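For readers who want to reproduce this kind of check, a pairwise correlation matrix can be computed as in the following Python sketch. It assumes Pearson correlation on numerically coded variables, which is one plausible reading of the text; the variable names and values are invented and do not come from the esquiades.com data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Pairwise correlations for a dict mapping variable name -> numeric codes."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Made-up numeric codes for three variables (not the paper's data).
toy = {
    "ddet": [40.0, 55.0, 62.0, 85.0, 50.0, 71.0],
    "sv": [7.0, 8.5, 6.5, 8.0, 9.0, 7.5],
    "length_of_stay": [2, 3, 2, 5, 4, 3],
}
matrix = correlation_matrix(toy)
```

Coefficients close to zero off the diagonal would support treating the variables as approximately independent.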

Results
The manifest variable used in the model is a vector, constructed based on the different categories of data, namely $Y_i$ = (Accommodation (Hotel/Apartment category), Age, Group size, Origin, Package (with or without ski pass), Previous experience, Season, Length of stay) for each observation $i$. The covariate vector is defined as $X_i$ = (DDET, SV).
First, a specific LCM model must be selected, together with the number of classes. Then, within the selected model, the different classes are described and interpreted. Finally, the influence of the covariates is discussed. Note that the model requires complete data in every row. The records of 5,988 buyers do not satisfy this condition, so the database effectively used drops to 44,718 observations.
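The complete-row requirement amounts to a simple filter. The Python sketch below shows one way to express it; the field names and the three example bookings are hypothetical.

```python
def complete_cases(rows, fields):
    """Keep only the rows in which every listed field has a value."""
    return [row for row in rows
            if all(row.get(f) not in (None, "") for f in fields)]

# Hypothetical bookings; the second and third have a missing entry.
bookings = [
    {"age": "31-40", "origin": "Spain", "ddet": 62.5, "sv": 8},
    {"age": None, "origin": "Spain", "ddet": 48.0, "sv": 7},
    {"age": "41-50", "origin": "France", "ddet": 80.1, "sv": None},
]
usable = complete_cases(bookings, ["age", "origin", "ddet", "sv"])
```

Applied to the figures in the text, removing 5,988 incomplete records from 50,706 leaves 50,706 − 5,988 = 44,718 observations.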

Model Selection and Number of Classes
Five different models are considered, as summarized in Table 3, starting from a simple LCM without external factors up to an LCM where both covariates as well as their cross-effects are considered. Note that the most complete (and also most involved) model does not necessarily give the best results, because of the penalization for a higher number of parameters. From the log-likelihood maximization criterion (Figure 5, top) it is immediately seen that the basic LCM model without covariates can be discarded. The remaining models are all quite similar, especially in the optimal region of 4-7 classes. The single best model seems to be the LCM with 2 covariates and cross-effects, and 6 classes. To confirm this, and to take into account the penalization for over-fitting mentioned earlier, the AIC and BIC minimization criteria are studied for the 4 remaining models (see Figure 5, middle and bottom). Both criteria indeed confirm that a model with 6 classes and cross-effects between both covariates gives the best results. Note that the cross-effects only become relevant for 6 classes. Up to 5 classes, the model with cross-effects is virtually indistinguishable from the model with only additive effects, or even with only the DDET as covariate. Also note that these results are consistent with those obtained from Pearson's $\chi^2$ goodness of fit and likelihood ratio chi-square ($G^2$) statistics. However, these discriminate less clearly between the different models and are, therefore, neither shown here nor discussed further. In the following, the model with 6 classes and cross-effects will be discussed further.
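To make the penalization concrete, the Python sketch below counts the free parameters of a latent class model with and without covariates, using the usual count for this model family: one set of conditional response probabilities per class, plus either the class shares or a set of logit coefficients per non-reference class. The numbers of levels per manifest variable are placeholders, since Table 1 is not reproduced here.

```python
def n_parameters(n_levels, n_classes, n_covariate_terms=0):
    """Free parameters of a latent class (regression) model.

    n_levels          : number of categories of each manifest variable
    n_covariate_terms : covariate terms in the membership model, excluding
                        the intercept (0 gives the basic model)
    """
    response = n_classes * sum(k - 1 for k in n_levels)
    membership = (n_classes - 1) * (n_covariate_terms + 1)
    return response + membership

# Placeholder level counts for eight manifest variables (assumed, not from Table 1).
levels = [4, 5, 3, 2, 2, 2, 2, 3]
basic = n_parameters(levels, n_classes=6)
with_cross = n_parameters(levels, n_classes=6, n_covariate_terms=3)  # DDET, SV, DDET x SV
```

Each added class contributes a full set of response probabilities, which is why the information criteria can favor a smaller model even when the log-likelihood keeps improving.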

Description of Classes
For R = 6 classes, the LCM model gives the results shown in Figure 6, with statistical parameters given in Table 4. The 6 classes can be summarized as follows. The first four classes travel during ski season and have bought accommodation with a ski pass. They are differentiated mainly by the following characteristic:
Note that classes 1-3 are roughly of equal size, namely 22-26% of the overall tourists each, while class 4 stands for 14% (see Table 4). The last two classes represent 5-6% each. These are tourists who did not book a ski pass, and are spread throughout the year. They are very similar in general terms, with the following differentiating characteristics:
Table 4 shows that all the coefficients in the LCM results are highly significant (p < 0.05). Therefore, the external variables can be used as predictors for the class membership, as illustrated in Figure 7. Both the service valuation and the expenditure have a strong influence on the probability of class membership. Specifically, the main features observed are the following.

Probability of Class Membership
With respect to the influence of valuation for a given expenditure range:
• Travelers with a low expenditure have a roughly 0.5 probability of belonging to class 1, and this is relatively independent of the valuation they give to the travel product. The remaining 0.5 probability is more or less evenly divided over the remaining classes.
• For average and high daily expenditure, very similar results are found. In particular, travelers who give a low valuation have a high probability (0.4-0.6) of belonging to class 1. As their valuation increases, they tend more and more (up to 0.3-0.4) to belong to classes 2 or 3. There is also a non-negligible probability of belonging to class 4, especially at intermediate valuations.
With respect to the influence of expenditure for a given valuation range:
• For low valuation, at very low daily expenditure there is a probability of roughly 0.25 of belonging to class 1 and likewise for class 6; the remaining 0.5 is equally divided over the remaining classes. As the expenditure increases, the probability of belonging to class 6 decreases rapidly in favor of class 1.

Discussion
Throughout the period analyzed, a clear increase in spending per reservation was observed. This expenditure has a clear seasonal component, since most winter reservations include a ski pass. These results are in line with other works [71,72]. A segmentation process was carried out based on different categories of data, such as: accommodation, age, group size, origin, package, previous experience, season and length of stay.
The choice of the number of classes was based on objective criteria, in particular the log-likelihood, BIC, and AIC, as well as on subjective criteria, in particular the usability and consistency of the obtained segments. As a result of this analysis, six clearly differentiated segments were obtained. These segments can be considered substantial and profitable in terms of daily expenditure, and actionable and accessible in terms of service valuation [73]. In other words, the obtained segmentation ensures that a concrete ski resort can perform the adequate marketing actions to effectively attract and serve the consumer segments of its interest.
Figure 8 shows violin plots, which combine the traditional boxplots of both covariates per class with the group width (the number of individuals for the corresponding y-axis values). Groups 1 and 6 are the groups with the lowest spending (around 50 and 40 euros per day, respectively), while group 3 has the highest daily expenditure (almost 85 euros per day). Even though skiing is usually considered an expensive recreational activity, it should be kept in mind that the average price of ski passes in the destinations analyzed is 22.9€ in Spain and 40€ in Andorra, far below the European average. Thus, the previous expenditure figures are quite reasonable, and illustrate that skiers visiting Spain and Andorra, and users of the portal esquiades.com in particular, behave quite differently in terms of expenses from tourists visiting other European ski resorts. There is some research in the literature which studies skier behavior as a function of price levels. For example, the authors of [74] examined how prices affect the number of season tickets sold for different types of skiers. In their models, these authors included snowfall, weekends, and time of season. The obtained results suggested that most skiers will ski on sufficient occasions to receive at least a 50% discount on the daily season ticket price.
Another study analyzed how changes in a company's pricing strategy can influence its profitability [75]. This research showed that dynamic pricing induces higher demand and thus can increase revenues. In addition, skiers were found to have a strong preference for good weather conditions, and a strong willingness to pay a higher price on days of good weather for skiing.
On top of price considerations, and contrary to the references just mentioned, our work also takes into consideration other variables related to the overall package booked (such as the type of accommodation or length of stay), as well as the service valuation. In terms of the service valuation, we observe the following differences among the groups. Groups 3 and 5 give a higher valuation of the services (8.15 and 8.09 out of 10, respectively), while group 1 and group 6 give a lower services valuation (7.08 and 7.46 out of 10, respectively). Thus, based on these results, it can be said that the skier segments who spend less per day also value the services less.
This classification, and especially the influence of valuation and expenditure, can be useful for tourist segmentation and targeting. In terms of variables used as predictors for class membership, low expenditure is (unsurprisingly) strongly related to booking a low-to-average accommodation type (i.e., class 1). The most salient feature in this regard is that valuation has little influence on class membership for this low-expenditure category. In contrast, tourists with a high daily expenditure profile who also give a high valuation to the service are most likely to belong to classes 2 and 3, i.e., young Spanish couples, and travelers of middle to older age, mainly of Spanish origin, who stay at upper-level accommodations. These segments should, therefore, be priority targets for tourism promotion and attraction, since they are economically beneficial and are likely to return and to make positive recommendations to their friends and relatives.

Conclusions
The purpose of this paper was to segment tourists who had booked packages to ski resorts in Andorra, Sierra Nevada, and the (Spanish and French) Pyrenees, based on behavioral tourist and trip characteristics. The results obtained from the latent class model show a segmentation consisting of six classes of tourists. These segments are determined by accommodation type, age, group size, origin (Spanish or non-Spanish), package (accommodation with or without skiing flat rates), previous experience with the purchasing portal, traveling season, and length of stay.
One of the main contributions of this research is to classify and observe the behavior of the snow tourist in these destinations. Moreover, this classification includes not only local tourists but also international tourists, since more than 18,700 responses were obtained from visitors from other countries, mainly France, Germany and the United Kingdom, but also from America and Asia. Ski resort managers and travel agencies, both in the destination and in the countries of origin of these international visitors, can use the information obtained about each of the target groups to design their products and package their services more effectively.
The usefulness of the LCM methodology for the study of ski tourist behavior has also been demonstrated. This methodology had been applied previously in other fields of tourism such as cultural consumption [76], nature tourism [77] or tourist time management during holidays [78], but not in the context of snow tourism. The implications of this research have both general aspects, valid for all fields of tourism, and specific aspects for ski tourism in particular. In general terms, understanding the behavior of a concrete tourist segment allows a better adaptation of the services offered. It has been shown that, for most groups of interest, there exists a relationship between the valuation of the services provided and the amount of money spent by the tourists. Therefore, it is necessary to elaborate active policies aimed at improving the perceived quality as well as at establishing an optimal pricing policy.
In more concrete terms, the destinations analyzed (Spain and Andorra) show relatively low average spending in comparison with other European ski destination countries. The present research might contribute to establishing appropriate mechanisms for higher profitability, and thus higher efficiency, of the (winter) tourism sector, by focusing on tourism segments with a high spending pattern. Ski resort managers should develop products aimed at these segments. Possible elements in this regard include not only a high-quality hotel offer, but also complementary activities ranging from "après-ski" to adventure activities such as snowshoe excursions, snowmobiles or dog sledding. Highly segmented campaigns should be designed for these groups of clients through digital marketing and social networks. Also, the high-end accommodations preferred by tourists belonging to these segments could offer package discounts or extra services in coordination with the ski resorts themselves.
The previous observations are valid not only from the point of view of a direct optimization of the income from tourism, but also in relation to CSR policies. An effective implementation and communication of CSR policies necessarily starts with an accurate knowledge of the profile of ski resort users. Developing and communicating CSR actions and strategies is crucial to engage and maintain customers [79]. CSR has a strong and direct impact on services valuation. Given the importance of the services valuation and its correlation with the tourist expenditure that have been emphasized in this research, the results obtained might stimulate the tourism industry to evolve towards a more efficient and more sustainable exploitation of resources, leading to higher customer satisfaction and thus ultimately to higher economic profitability as well [80].
From a more general methodological perspective, LCMs are useful and flexible tools for a variety of segmentation purposes. In particular, there is an ever-growing need for rigorous quantitative analyses of tourism demand and segmentation, which could benefit from a wider application of techniques such as the one applied here. The same method was used previously in [66] to classify tourists based on their budget distribution, and the importance of such a solid quantitative analysis was stressed. However, to the best of our knowledge, very few studies in the field of tourist segmentation currently use such techniques.
The main limitation of the present research is related to the data. Even though the database is very large in terms of the number of individuals (N ≈ 50,000), the variables included in the database were limited and not determined by the present researchers. In particular, while the database does report the total package cost, it does not distinguish between the different services included in the package, such as the cost of the accommodation or the ski rates. Furthermore, the database does not report separately the valuation of the services offered at the ski destination, but only the valuation of the service offered by the portal website esquiades.com. Also, the data do not record psychographic characteristics. On the other hand, the advantage of the data used in this research is that they are real behavioral data, directly recorded at the time of online booking.
Further research should include an analysis of the cost of each of the separate services included in the package, in order to observe differences between tourists who spend more on accommodation and those who spend more on skiing. Along the same lines, an analysis of the valuation of each service offered (accommodation and ski rate) would be interesting, in order to relate these valuations to the other variables studied. An analysis of the differences between Spanish and non-Spanish tourists, in terms of subjective experience (service valuation) and expenditure, or depending on the length of stay, could also be very relevant. Finally, a survey collecting psychographic characteristics could be carried out in order to further detail the heterogeneity of the obtained segments in terms of lifestyle, values or interests.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Latent Class Model
Denote by $Y$ the manifest variable, which is constructed from $J$ polytomous categorical variables, where variable $j$ has $K_j$ possible outcomes, observed for each respondent $i$ out of a total of $N$ respondents. Thus $Y_{ijk} = 1$ if respondent $i$ selects answer $k$ for variable $j$, and $Y_{ijk} = 0$ otherwise. Furthermore, the probability within class $r$ that an observation of variable $j$ results in answer $k$ is written as $\pi_{jrk}$.
With these definitions, the overall cross-class probability density function for each respondent $i$ can be written as Equation (A1), where $p_r$ represents the proportion of respondents in class $r$. The latent class model calculates the estimators $\hat{p}_r$ and $\hat{\pi}_{jrk}$ by maximizing the log-likelihood function with respect to $p_r$ and $\pi_{jrk}$.
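For reference, under the definitions above the standard latent class formulation of this mixture density and of the corresponding log-likelihood takes the following form. The symbol $f(Y_i; \pi_r)$ for the class-conditional density and $R$ for the number of classes are assumed here to match the notation of Equation (A5); this is a sketch of the usual textbook form, not a reproduction of the typeset equation.
$$P(Y_i \mid \pi, p) = \sum_{r=1}^{R} p_r \, f(Y_i; \pi_r), \qquad f(Y_i; \pi_r) = \prod_{j=1}^{J} \prod_{k=1}^{K_j} (\pi_{jrk})^{Y_{ijk}}$$
$$\ln L = \sum_{i=1}^{N} \ln \sum_{r=1}^{R} p_r \prod_{j=1}^{J} \prod_{k=1}^{K_j} (\pi_{jrk})^{Y_{ijk}}$$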
With these estimators, the probability that a sample element $i$ with manifest variable values $Y_i$ belongs to class $r$ is given by Equation (A2).
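The posterior class membership can be illustrated numerically. The following is a minimal sketch, assuming the standard Bayes-rule form in which the posterior of class $r$ is $\hat{p}_r f(Y_i; \hat{\pi}_r)$ divided by the sum of the same product over all classes. The array layout, the function names and the toy numbers are illustrative assumptions and are not taken from the study.

```python
import numpy as np

def class_conditional_density(Y, pi):
    """f(Y_i; pi_r) = prod_j prod_k pi_jrk ** Y_ijk, for every i and r.

    Y  : (N, J, K) one-hot indicators Y_ijk (assumed layout)
    pi : (J, R, K) class-conditional response probabilities pi_jrk
    Returns an (N, R) array.
    """
    # Work in the log domain; clipping avoids log(0) for padded categories.
    log_pi = np.log(np.clip(pi, 1e-300, None))
    return np.exp(np.einsum("ijk,jrk->ir", Y, log_pi))

def posterior_membership(Y, p, pi):
    """Posterior probability that respondent i belongs to class r (Bayes' rule)."""
    weighted = class_conditional_density(Y, pi) * p  # p_r * f(Y_i; pi_r)
    return weighted / weighted.sum(axis=1, keepdims=True)

def log_likelihood(Y, p, pi):
    """Sum over respondents of the log of the mixture density."""
    return float(np.log(class_conditional_density(Y, pi) @ p).sum())

# Hypothetical toy model: J = 2 binary variables, R = 2 equally sized classes.
p = np.array([0.5, 0.5])
pi = np.array([
    [[0.9, 0.1], [0.2, 0.8]],  # variable 0: rows are class 0 and class 1
    [[0.9, 0.1], [0.2, 0.8]],  # variable 1
])
Y = np.array([
    [[1, 0], [1, 0]],  # respondent 0 selects answer 0 for both variables
    [[0, 1], [0, 1]],  # respondent 1 selects answer 1 for both variables
], dtype=float)

posterior = posterior_membership(Y, p, pi)
print(posterior)
```

In this toy case respondent 0 is assigned almost entirely to the first class and respondent 1 to the second, which mirrors how each booking in the sample would be allocated to its most probable segment.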
The previous model can be extended into a latent class regression model (LCRM) by the inclusion of a number $S$ of external factors or covariates $X_i$, which serve as predictors for the class membership probability of each sample element. To this effect, the class proportions $p_r$ are replaced by probabilities $p_{ri}$ for each element $i$ to belong to class $r$, depending on the covariate values for that element. These $p_{ri}$ can be written as Equation (A3).
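For reference, the usual multinomial-logit form of these probabilities, which is consistent with the log-ratio in Equation (A4), is sketched below. The notation $p_r(X_i; \beta)$ is assumed here to match Equation (A5), and $R$ denotes the number of classes.
$$p_{ri} = p_r(X_i; \beta) = \frac{\exp(\beta_r X_i)}{\sum_{q=1}^{R} \exp(\beta_q X_i)}$$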
Here, $\beta_r$ is a vector of $S + 1$ coefficients (one constant or intercept, and one term per covariate) corresponding to class $r$. The coefficients are expressed relative to the reference class $r = 1$, whose coefficient vector is fixed at zero ($\beta_1 = 0$); the remaining coefficients can be obtained from Equation (A4).
$$\ln(p_{ri}/p_{1i}) = \beta_r X_i \quad (r = 2, \ldots, R) \tag{A4}$$
The LCM now calculates the estimators $\hat{\beta}_r$ and $\hat{\pi}_{jrk}$, and the probability that a sample element $i$ with manifest variable values $Y_i$ belongs to class $r$ is obtained as a straightforward extension of the basic case, see Equation (A2), namely
$$\hat{P}(r_i \mid X_i; Y_i) = \frac{p_r(X_i; \hat{\beta}) \, f(Y_i; \hat{\pi}_r)}{\sum_{q=1}^{R} p_q(X_i; \hat{\beta}) \, f(Y_i; \hat{\pi}_q)} \tag{A5}$$
The LCM, with or without covariates, does not itself determine the number of classes. To select this number, a balance must be found between goodness of fit (higher number of classes) on the one hand, and computational simplicity (lower number of classes) and the avoidance of over-fitting on the other, while always keeping explanatory power as a main objective. Two widely used criteria that penalize over-fitting are the AIC [68] and the BIC [69]:
$$\mathrm{AIC} = -2\Lambda + 2\Phi, \qquad \mathrm{BIC} = -2\Lambda + \Phi \ln N,$$
where $\Lambda$ is the maximum log-likelihood and $\Phi = R \sum_j (K_j - 1) + (R - 1)$ is the number of parameters to be estimated by the model. Both criteria estimate the information lost by the model relative to the full, original data and are therefore minimization criteria, contrary to the log-likelihood criterion described above.
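The selection of the number of classes can be sketched in a few lines. The example below is a minimal illustration that uses the usual definitions of both criteria (spelled out in the docstrings) and the parameter count $\Phi$ given above for a model without covariates. The function names and all log-likelihood values are hypothetical and serve only to show the mechanics.

```python
import math

def n_parameters(R, K):
    """Phi = R * sum_j (K_j - 1) + (R - 1) for an LCM without covariates.

    R : number of latent classes
    K : list with the number of outcomes K_j of each manifest variable
    """
    return R * sum(k - 1 for k in K) + (R - 1)

def aic(loglik, n_params):
    """Akaike information criterion: -2 * Lambda + 2 * Phi."""
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: -2 * Lambda + Phi * ln(N)."""
    return -2.0 * loglik + n_params * math.log(n_obs)

def best_by(criterion_values):
    """Return the number of classes with the smallest criterion value."""
    return min(criterion_values, key=criterion_values.get)

# Hypothetical maximum log-likelihoods for models with 1, 2 and 3 classes,
# fitted to N = 500 respondents and two manifest variables (K = [3, 2]).
K, N = [3, 2], 500
logliks = {1: -1200.0, 2: -1100.0, 3: -1095.0}

aic_values = {R: aic(ll, n_parameters(R, K)) for R, ll in logliks.items()}
bic_values = {R: bic(ll, n_parameters(R, K), N) for R, ll in logliks.items()}

best_aic = best_by(aic_values)
best_bic = best_by(bic_values)
print("AIC selects", best_aic, "classes; BIC selects", best_bic, "classes")
```

With these made-up numbers the two criteria disagree: the BIC penalizes each additional parameter by ln N rather than by 2, so it tends to favor the more parsimonious model when the gain in log-likelihood is small.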