1. Introduction
Tourism can be a significant driver of economic and social development worldwide, but its benefits depend heavily on how it is managed. Cultural heritage tourism offers valuable opportunities for medium-sized cities to foster sustainable development (
Ottaviani et al., 2024;
Środa-Murawska et al., 2021). However, if not managed properly, it can also pose serious risks to the preservation of cultural assets and the well-being of local communities (
Hargrove, 2017). Uncontrolled tourism growth may lead to overcrowding, degradation of heritage sites, loss of cultural authenticity, and social tensions, making the pursuit of sustainable and responsible tourism practices imperative (
Kosmas & Vatikioti, 2024). In this context, the role of tourists themselves is crucial: tourism management is not only about attracting visitors, but also about understanding the type of tourists a destination receives—and consciously shaping the kind of visitors it wishes to attract.
Understanding tourists’ motivations, backgrounds, and behaviours is essential for any destination aiming to manage tourism responsibly. Tourists are far from a homogeneous group; they range from casual sightseers to individuals who view certain destinations as deeply connected to their personal or cultural identity. This diversity is particularly evident in heritage tourism, which encompasses a wide spectrum of visitors, from those who casually visit cultural sites to those deeply motivated by a search for authenticity, cultural appreciation, or a personal sense of connection and belonging to the place (
Chen et al., 2025;
Poria et al., 2013;
Yan & Morrison, 2008). Recognizing and understanding this diversity is crucial for developing management and marketing strategies that align visitor experiences with heritage conservation and support for local communities. Among destinations that rely on cultural heritage tourism, World Heritage Cities (WHCs) stand out due to their exceptional value recognized by UNESCO. The World Heritage designation not only protects cultural legacy but can also serve as a global branding mechanism that attracts tourists seeking meaningful experiences (
Chen et al., 2025;
Timothy, 2025).
However, sustainable tourism requires more than visibility: it requires a strategic understanding of visitors’ sociodemographic profiles, value systems, motivations, and behavioural patterns, particularly regarding their spending habits and their interaction with the cultural and social environment. Tourist expenditure not only reflects economic impact but also informs decisions on resource allocation and policy-making. Prior studies have shown that expenditure patterns are influenced by a wide range of factors, including demographic attributes (
Nicolau & Más, 2005;
Pulido-Fernández et al., 2021), travel motivations (
Kim et al., 2011), and even environmental attitudes (
Nickerson et al., 2016). While many of these studies use linear or parametric models, which may overlook complex interactions, others have started exploring more flexible approaches. As tourism behaviour becomes increasingly diverse and data availability improves, segmentation techniques that capture non-linear, interaction-based, and hierarchical patterns are becoming more relevant.
In response to this need, the present study applies a Conditional Inference Tree (CTree) model to analyse tourist expenditure in medium-sized World Heritage Cities. This method identifies visitor profiles based on combinations of sociodemographic variables and their interactions. Rather than focusing on causal explanations, the analysis reveals which demographic groups are consistently associated with higher or lower spending, and under what conditions.
The ultimate goal is not merely to identify high-value tourist segments, but to prompt a broader reflection on the type of tourism that World Heritage Cities are fostering—and the type they should intentionally prioritize. This study seeks to assist public and private stakeholders in formulating tourism strategies that transcend passive adaptation to current demand, aiming instead to proactively shape destinations in line with long-term sustainability goals.
Beyond its empirical results, this study also contributes to the literature in two related ways. On the one hand, it applies a Conditional Inference Tree approach to expenditure analysis, making it possible to identify hierarchical and interaction-based profiles that are difficult to capture through more conventional models. On the other hand, it focuses on medium-sized World Heritage Cities, a setting that remains less examined in expenditure segmentation research despite its relevance for heritage-based destination management.
2. Theoretical Framework
Cities with cultural heritage assets often face a paradox: while tourism offers pathways to economic revitalization and international visibility, it also entails risks of commodification, social disruption, and environmental stress (
Caust & Vecco, 2017;
Środa-Murawska et al., 2021). The designation of a site or city as a UNESCO World Heritage destination often leads to significant economic benefits, such as increased local income and rising property values, particularly in luxury real estate and commercial sectors. However, this designation can also trigger gentrification processes, resulting in social challenges like housing affordability issues and displacement of vulnerable local populations (
Bertacchini et al., 2024). These risks are especially pronounced in medium-sized cities, which, unlike major heritage capitals with global recognition and robust infrastructure, often have limited capacity to absorb and manage tourism pressures.
In smaller cities, historical and cultural heritage plays a central role in shaping both urban identity and development, often exerting a stronger influence than in bigger cities, where heritage tends to be one element among many (
Kern et al., 2021). For many of these smaller destinations, their value does not lie in possessing globally “outstanding” heritage, but rather in their authenticity, human scale, and cultural proximity. Yet, even modest increases in tourism can lead to unsustainable dynamics if local policies are uncoordinated or resident engagement is insufficient.
In contrast to mass tourism that is typically driven by high volume and low-cost experiences, heritage tourism tends to attract visitors motivated by a search for authenticity, cultural enrichment, and a deeper connection with history and place (
Chen et al., 2025;
Poria et al., 2006,
2013;
Timothy, 2025). This presents an opportunity: if well managed, cultural tourism can support not only economic development but also the long-term preservation of local identity and community wellbeing. Therefore, understanding both the economic opportunities and socio-spatial impacts of UNESCO status is essential for sustainable heritage tourism management.
Tourist expenditure has long been a central variable in tourism economics due to its direct relationship with the economic impact of tourism on destinations (
Mehran & Olya, 2019;
Meleddu et al., 2024;
Wanga & Davidson, 2010). Among the most consistently examined determinants are sociodemographic variables, such as age, income, education level, nationality, and employment status, which have been shown to influence both the amount and distribution of spending (
Nicolau & Más, 2005). Sociodemographic variables have been primarily tested through classical regression techniques, including OLS, quantile regression, Tobit models, two-step procedures, and logistic regressions (
Brida & Scuderi, 2013). The most frequently used explanatory variables in the analysis of tourism expenditure are income, sociodemographic characteristics, and trip-related factors. For example, higher income and educational attainment are often associated with increased discretionary spending and a greater willingness to engage in cultural or recreational activities.
In addition to demographic characteristics, a range of contextual and behavioural factors have also been shown to influence tourist expenditure. Variables such as travel motivation, travel party composition, length of stay, accommodation type, and overall satisfaction contribute to shaping spending patterns (
Alegre et al., 2011;
Bernini & Galli, 2022;
D’Urso et al., 2020;
Pulido-Fernández et al., 2021;
Smolčić-Jurdana & Soldić-Frleta, 2017). Psychological, attitudinal, and lifestyle factors have gained relevance in recent years as segmentation models move beyond purely descriptive profiling (
Sánchez González et al., 2025;
Štefko et al., 2022). Also, measuring expenditure by category (accommodation, food, transport, and recreation) provides nuanced insights into consumption behaviour and local economic integration (
Disegna & Osti, 2016).
Although this body of research has substantially improved our understanding of tourist expenditure, much of it still relies on approaches centred on net effects or average relationships between variables. This is useful for identifying general expenditure determinants, but it may be less effective when spending behaviour depends on specific combinations of sociodemographic characteristics rather than on isolated factors considered independently.
Importantly, not all spending is equally beneficial. High expenditure concentrated in foreign-owned businesses or external booking platforms may result in significant economic leakage, limiting the benefits for the host community (
Chaitanya & Swain, 2024). Moreover, large-scale spending can coincide with environmental stress, as seen in destinations suffering from overtourism (
Butler & Dodds, 2022). For this reason, understanding who spends more, in what ways, and under what conditions is essential for identifying segments that contribute to a sustainable tourism economy.
While most studies offer valuable insights, many rely on parametric or clustering methods that may overlook interaction effects or non-linear relationships between variables. This limitation highlights the need for methodological approaches that can better capture the complexity of tourist behaviour in heritage destinations, especially in contexts where strategic planning requires identifying not just average profiles but the specific combinations of traits that define high-value segments.
This limitation is particularly relevant in heritage destinations, where expenditure patterns often emerge from the interaction of several visitor attributes rather than from the isolated effect of a single variable. In cities such as Úbeda and Baeza, understanding expenditure therefore requires more than identifying who spends more on average; it also involves examining how nationality, income, employment status, or education combine to shape distinct visitor profiles. From this perspective, tree-based approaches provide a useful analytical alternative, as they make it possible to identify conditional and hierarchical structures that are not easily captured through more conventional models.
4. Territorial Context
The empirical analysis focuses on two medium-sized World Heritage Cities in southern Spain: Úbeda and Baeza, both located in the province of Jaén, within the autonomous community of Andalusia. In 2003, these cities were jointly inscribed on the UNESCO World Heritage List for their exceptional Renaissance architecture and urban layout, reflecting a period of cultural and artistic flourishing in 16th-century Spain. The historical development of Úbeda and Baeza reflects several successive cultural and urban phases. Their origins are linked to the medieval Islamic period, while later transformations followed the Christian conquest and the consolidation of Castilian power in the area. The most significant urban and architectural renewal occurred during the sixteenth century, when both cities incorporated Renaissance planning principles and artistic influences from Italy. This process gave Úbeda and Baeza a distinctive humanist character and connected their architectural legacy with broader Renaissance currents that later influenced urban and architectural developments in Latin America.
Despite their relatively small populations (approximately 35,000 in Úbeda and 16,000 in Baeza), both cities have developed robust cultural tourism sectors. Their proximity to each other (less than 10 km apart) allows for combined visitation, often marketed as a twin heritage destination. Visitors are drawn to their historical city centres, religious architecture, civil buildings, and cultural events linked to Andalusian identity (
Carrillo-Hidalgo et al., 2019).
As is typical of many heritage cities, tourism in Úbeda and Baeza is predominantly cultural in nature and highly seasonal, with peaks during holidays and festivals. The local economy benefits significantly from visitor spending, particularly in accommodation, gastronomy, and guided visits. However, the cities also face challenges related to maintaining the quality of the tourism offer, distributing economic benefits equitably, and avoiding overcrowding during peak periods.
The selection of Úbeda and Baeza as case studies is therefore highly relevant for analysing tourist expenditure patterns in heritage contexts. Their shared designation as World Heritage Cities, comparable cultural profiles, and strong reliance on tourism revenue make them ideal settings for testing segmentation models based on sociodemographic and behavioural variables. Medium-sized World Heritage Cities often face the challenge of leveraging their cultural status for economic benefit without exceeding their infrastructural or social capacities, which also makes them a natural setting for exploring targeted, sustainable segmentation strategies.
5. Results
5.1. Descriptive Analysis of Socioeconomic and Sociodemographic Variables
This section presents a descriptive overview of the main socioeconomic and sociodemographic characteristics of the tourists surveyed in Úbeda and Baeza. The aim is to profile the typical visitor through key variables such as age, gender, place of origin, income, education level, employment status, and professional category. These variables provide a foundational understanding for the subsequent analysis of spending behaviour and tourist segmentation.
The age distribution shows that the majority of tourists (72%) are aged 45 and above, with nearly 60% falling within the 45 to 65-year-old bracket. Conversely, younger tourists under 30 represent less than 7% of the total sample, a figure that decreases to just 2.9% among foreign visitors. Gender distribution is almost evenly split, with males accounting for 49.55% and females 50.45%, a difference that statistical testing confirms as non-significant. Regarding place of origin, domestic tourists make up almost 90% of the sample, with the remainder mainly from France, the United Kingdom, and Italy. Within Spain, visitors predominantly come from major cities such as Madrid, Barcelona, and Sevilla.
Income levels follow a classic pattern, with the majority of tourists falling into middle-income brackets. For analytical clarity, income was grouped into three categories: low (up to €1200), medium (€1201–€2100), and high (above €2100), with nearly 60% in the medium range. Educational attainment is relatively high, with approximately 70% holding university-level degrees. To facilitate analysis, education levels were categorized into basic, medium, and higher education groups. Employment status reveals that nearly 79% of tourists are employed, 12.7% are retired, and the remaining segment includes students and homemakers. Lastly, the professional category aligns closely with education level, with most tourists being university-qualified professionals or civil servants, and only a small proportion falling into lower-skilled categories.
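The three-way income grouping described above can be written as a simple rule. The following is a minimal sketch rather than the procedure actually used: the function name `income_band` is hypothetical, the cut-offs are taken from the text, and amounts falling between €1200 and €1201 are assumed to belong to the medium band.

```python
def income_band(income_eur: float) -> str:
    """Map a reported income (in euros) to the three bands used in the text.

    Hypothetical helper: the cut-offs come from the text (low: up to 1200,
    medium: 1201-2100, high: above 2100); the function itself is an
    illustrative assumption, not the authors' code.
    """
    if income_eur < 0:
        raise ValueError("income must be non-negative")
    if income_eur <= 1200:
        return "low"
    if income_eur <= 2100:
        return "medium"
    return "high"


# Example: band labels for a few illustrative (made-up) incomes.
for amount in (950, 1200, 1201, 2100, 2500):
    print(amount, income_band(amount))
```

Coding the bands as ordered labels keeps the variable usable both in rank-based tests and as a splitting variable in a tree.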
Table 1 summarizes the main socioeconomic and sociodemographic characteristics of the surveyed tourists.
These descriptive results highlight a clear profile of tourists visiting Úbeda and Baeza: predominantly middle-aged to older adults with a balanced gender distribution, mostly from domestic origins but with a notable international presence from neighbouring European countries. The concentration of tourists within middle-income and higher-education brackets suggests that these destinations attract a relatively affluent and educated demographic. The high proportion of employed visitors suggests that tourism in the area may be driven by discretionary spending among working individuals, while the significant share of retirees points to the destination’s appeal to older tourists, possibly seeking cultural or leisure experiences.
5.2. Expenditure Analysis
This section explores the composition and distribution of tourist expenditures, based on nine spending categories collected in the survey. While accommodation and meals are universal expenses incurred by all visitors, the remaining categories, such as vehicle rental, organized tours, and entertainment, represent optional expenditures, offering insight into individual preferences and spending behaviour.
Understanding the structure and distribution of tourist expenditure is crucial for assessing the economic impact of tourism and identifying potential areas for strategic development. The survey collected detailed information across nine distinct spending categories, encompassing both essential (e.g., accommodation and dining) and optional expenditures (e.g., leisure, shopping, or organized activities). These categories collectively constitute the total tourist spending, allowing for a nuanced analysis of consumption behaviour.
Table 2 presents descriptive statistics for each expenditure category, including their average amounts, standard deviations, and their relative contribution both to the total spending and to tourist participation rates. As shown, all tourists incurred costs for accommodation and dining, while participation in other expenditure categories was more selective.
Accommodation represents the largest share of total spending (37.43%), followed by meals in restaurants (23.21%) and local transport (13.84%). Together, these three categories account for 74.48% of overall tourist spending. Other relevant categories include organized excursions (9.67%), while shopping and leisure activities account for smaller proportions.
Notably, certain categories show very limited participation: only 4.5% of tourists reported spending on miscellaneous purchases, and just 7.1% rented vehicles—likely related to foreign visitor profiles, as discussed later.
A correlation matrix among the spending items reveals generally weak associations, suggesting that most spending categories behave independently. However, some moderate positive correlations emerge between accommodation, meals, and food shopping, on one hand, and between vehicle rental and local transportation, on the other. Interestingly, a negative correlation is observed between participation in organized excursions and miscellaneous spending, implying that tourists who choose structured activities tend to spend less on non-categorized items.
To explore underlying patterns, an exploratory factor analysis with Varimax rotation was conducted. Four distinct components emerged, as shown in
Table 3.
The first factor groups accommodation, restaurant meals, and grocery shopping, and accounts for the largest share of explained variance (19.57%). The second factor includes vehicle rental and local transportation (18.93%). The third contrasts participation in organized excursions with miscellaneous spending (15.01%), while the fourth is characterized by other types of shopping and leisure activities (12.38%).
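Adding up the variance shares reported for the four factors gives the cumulative variance explained by the solution. The short sketch below only performs that arithmetic on the figures quoted in the text; the helper name `cumulative_shares` is an illustrative assumption.

```python
# Share of variance explained by each rotated factor, as reported in the text (%).
explained_variance = {
    "Factor 1": 19.57,
    "Factor 2": 18.93,
    "Factor 3": 15.01,
    "Factor 4": 12.38,
}


def cumulative_shares(shares):
    """Return the running total of the shares, rounded to two decimals."""
    running = 0.0
    result = {}
    for name, value in shares.items():
        running += value
        result[name] = round(running, 2)
    return result


cumulative = cumulative_shares(explained_variance)
for name, total in cumulative.items():
    print(f"{name}: {total:.2f}% cumulative")
```

On these figures the four factors jointly account for about 65.9% of the variance, so roughly one third of the variation in the spending items is not captured by the retained components.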
To aid interpretation, the most relevant loadings (above 0.6 or below −0.6) are highlighted below:
- Factor 1 groups Accommodation (0.71), Meals (0.76), and Grocery Shopping (0.76), reflecting core travel necessities.
- Factor 2 associates Vehicle Rental (0.88) and Local Transport (0.85), representing mobility-related expenses.
- Factor 3 contrasts Organized Excursions (0.79) with Miscellaneous Expenses (−0.80), indicating that tourists engaging in excursions tend to spend less on other unclassified items.
- Factor 4 is dominated by Leisure (0.69) and Other Purchases (0.77), reflecting discretionary spending on non-essential activities.
These findings indicate that tourist expenditure can be effectively summarized through a limited number of behavioural patterns. Most spending is concentrated on core services—accommodation, food, and transport—represented by the dominant first factor. The low incidence and independence of some optional categories further justify treating total expenditure as a unified, coherent dimension. This aggregated measure will be used in subsequent analyses to evaluate spending behaviour across tourist profiles and to facilitate robust segmentation and modelling.
5.3. Study of Total Expenditure per Tourist
The variable “G1 Total Expenditure” was analysed in relation to each of the considered factors (see
Table 4). Initially, tests for normality were conducted on the “Total Expenditure” variable within each expenditure category, which revealed that the data did not follow a normal distribution in any case, even after applying a logarithmic transformation to normalize the data. Consequently, the results obtained from non-parametric tests (Wilcoxon or Kruskal–Wallis, as appropriate) should be prioritized. Nevertheless, the following table presents a summary of findings derived from both parametric and non-parametric analyses.
An initial analysis of the data (see
Table 5) reveals that the total expenditure of foreign tourists is substantially higher than that of national tourists. The expenditure distribution for national tourists exhibits positive skewness, indicating the presence of relatively few observations with exceptionally high values. To validate this observation, both a parametric test (after data transformation to address skewness) and a non-parametric Mann–Whitney (Wilcoxon) test were conducted. In both cases, the resulting
p-values were below the 0.05 threshold, indicating statistically significant differences between the two groups. Therefore, it can be concluded that foreign tourists spend significantly more, on average, than national tourists.
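For readers unfamiliar with the rank-based comparison used here, the Mann–Whitney (Wilcoxon rank-sum) test can be sketched in a few lines. This is a simplified illustration, not the software used in the study: it relies on a normal approximation without tie or continuity correction, and the expenditure values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """U statistic of sample x and a two-sided p-value (normal approximation)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_sided


# Hypothetical total expenditures in euros, for illustration only.
national = [90, 110, 120, 135, 150, 160]
foreign = [180, 210, 240, 260, 300, 410]
u, p = mann_whitney_u(national, foreign)
print(f"U = {u}, two-sided p = {p:.4f}")
```

Because the test works on ranks rather than raw amounts, a handful of very high spenders has limited influence on the result, which is why it suits the right-skewed distributions described above.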
As shown in
Table 6, the average total expenditure is slightly higher among women (€137.62) than among men (€135.90), although the difference is minimal. Both distributions show very similar dispersion and skewness indicators: standard deviations of approximately €61–€64, interquartile ranges of €55, and right-skewed distributions (skewness ≈ 1.46–1.47). The coefficient of variation (CV) is also nearly identical for both groups (around 0.45–0.46), indicating comparable relative variability. To statistically assess the presence of differences, a test of means was conducted after data transformation to correct for skewness, along with a non-parametric Mann–Whitney (Wilcoxon) test. In both cases, the
p-value was above 0.05, indicating that no significant differences exist in total expenditure between male and female tourists.
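The dispersion indicators quoted in this comparison follow standard definitions. The sketch below shows one way to compute the coefficient of variation and a moment-based skewness coefficient; the function names and the sample values are illustrative assumptions, and the source does not state which skewness estimator was used.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def moment_skewness(values):
    """Skewness as m3 / m2**1.5, using population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical expenditures in euros: one large value produces right skew.
spending = [95, 110, 120, 130, 140, 150, 330]
print(f"CV = {coefficient_of_variation(spending):.2f}")
print(f"skewness = {moment_skewness(spending):.2f}")
```

A positive skewness value corresponds to the pattern reported in the text: most visitors spend moderate amounts, while a few spend much more.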
There is a noticeable right-skewness in the distribution of total expenditure across all educational levels (see
Table 7), indicating the presence of high but infrequent spending values that are far from the group means. Descriptive statistics suggest that tourists with higher education levels tend to spend more than those with basic or intermediate education. To validate these observations, both a classical mean comparison test (after data transformation to address skewness) and the non-parametric Kruskal–Wallis test were conducted. In both cases, the resulting
p-values were below the 0.05 threshold, indicating statistically significant differences in total expenditure among the three educational groups.
To further identify which groups differ significantly, a post hoc multiple comparison procedure (Wilcoxon rank-sum test) was applied. The results revealed that tourists with higher education spend significantly more than those with basic or intermediate education. However, no statistically significant difference was found between the basic and intermediate education groups (p-value = 0.08). It is worth noting that among foreign tourists, these differences were not statistically significant. This is likely due to the limited number of foreign respondents with only basic education (n = 7), which may have reduced the power of the statistical tests in this subgroup.
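The Kruskal–Wallis comparison across the three education groups can be illustrated with a short sketch. It is a simplified version under stated assumptions: no tie correction is applied, the p-value uses the closed form of the chi-square upper tail for two degrees of freedom (valid only for three groups), and the data are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H statistic (without tie correction) for a list of samples."""
    pooled = [value for group in groups for value in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


def p_value_three_groups(h):
    """Chi-square upper tail with 2 degrees of freedom, i.e. exp(-h / 2)."""
    return math.exp(-h / 2)


# Hypothetical expenditures in euros by education level, for illustration only.
basic = [80, 95, 110]
intermediate = [100, 120, 130]
higher = [140, 160, 210]
h = kruskal_wallis_h([basic, intermediate, higher])
print(f"H = {h:.3f}, p = {p_value_three_groups(h):.4f}")
```

A significant H only indicates that at least one group differs, which is why pairwise rank-sum comparisons are needed afterwards to locate the differences.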
As shown in
Table 8, a noticeable right-skewness is present across all age groups, indicating occasional high expenditure values that deviate substantially from the central tendency but occur infrequently. Descriptively, total expenditure appears to increase with age. This observation is statistically supported by both a classical analysis of means applied after data transformation to address skewness, and the non-parametric Kruskal–Wallis test. In both cases, the
p-value was below 0.05, indicating significant differences in total expenditure among age groups.
To identify the specific groups with significant differences, multiple comparisons using Wilcoxon rank-sum tests were conducted, revealing statistically significant differences between all age categories (p-values < 0.05). When analysed by nationality, these patterns hold true for national tourists. However, among foreign tourists, those aged 30 to 44 years (27 out of 175) and those over 65 years demonstrate significantly higher total expenditures than those aged 45 to 65 years. It is worth noting that foreign tourists under 30 years old constitute a very small subgroup (only 5 out of 175), limiting the robustness of conclusions for this category.
Regarding occupational status within the foreign tourist population, only two categories are meaningful for analysis: employed (110 tourists) and retired (61 tourists). The unemployed group is negligible, comprising only two individuals with exceptionally high average expenditures (€352), and thus, they were excluded from the combined analysis.
Significant differences in total expenditure were found between these occupational groups (p-value = 0.008), with retired tourists spending more on average than employed tourists.
As with previous variables, all income groups exhibit a right-skewed distribution, characterized by infrequent but notably high total expenditure values that deviate from the central tendency. Descriptively, total expenditure appears to increase with income level, suggesting that individuals with higher incomes tend to spend more during their trips. This pattern is statistically confirmed through a classical mean comparison test applied after transforming the data to correct for skewness, and the non-parametric Kruskal–Wallis test. In both cases, the p-values were below 0.05, indicating significant differences in total expenditure across income levels.
To determine which specific groups differed, pairwise comparisons were conducted using the Wilcoxon rank-sum test. Results revealed statistically significant differences between all income groups (p-values < 0.05), reinforcing the observation that higher income is associated with higher tourism expenditure.
However, when disaggregated by tourist nationality, these differences were only significant among national tourists. For foreign tourists, the Kruskal–Wallis test yielded a non-significant p-value (p = 0.4396), suggesting that total income level does not significantly influence total expenditure within this group.
5.4. Decision Tree Analysis
Decision trees are a form of supervised learning algorithm widely used for classification and regression tasks involving complex datasets. Capable of handling both categorical and continuous variables, they are especially valued for their interpretability and visual intuitiveness. In this study, the decision tree functions as an econometric tool to investigate the relationship between total tourist expenditure (log-transformed, continuous dependent variable) and a set of sociodemographic predictors.
Specifically, we employ a Conditional Inference Tree (
Hothorn et al., 2006,
2015;
Levshina, 2021), a statistically robust method that determines variable selection and split points based on significance tests, rather than relying solely on impurity-based criteria. This approach mitigates risks of overfitting and biased variable selection—especially important in datasets with mixed data types or multicollinearity.
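The significance-based selection step can be illustrated schematically. The sketch below is not an implementation of a conditional inference tree: it takes per-predictor p-values as given (the values shown are hypothetical), applies a plain Bonferroni adjustment as a simplifying assumption, and returns either the predictor to split on or `None` when no association is significant, which corresponds to stopping the growth of that branch.

```python
def select_split_variable(p_values, alpha=0.05):
    """Choose the predictor with the smallest adjusted p-value, or None to stop.

    p_values: mapping from predictor name to the p-value of its association
    test with the response (assumed to be computed elsewhere).
    """
    if not p_values:
        return None
    n_tests = len(p_values)
    # Plain Bonferroni adjustment; an illustrative simplification.
    adjusted = {name: min(1.0, p * n_tests) for name, p in p_values.items()}
    best = min(adjusted, key=adjusted.get)
    return best if adjusted[best] < alpha else None


# Hypothetical p-values at the root node, for illustration only.
root_p_values = {"nationality": 0.0001, "income": 0.004, "gender": 0.60}
print(select_split_variable(root_p_values))

# Hypothetical node where nothing survives the adjustment: splitting stops.
leaf_p_values = {"gender": 0.60, "age": 0.03}
print(select_split_variable(leaf_p_values))
```

Tying both variable selection and stopping to a hypothesis test is what distinguishes this family of trees from impurity-based trees, which typically rely on pruning after the tree has been grown.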
Figure 1 illustrates the resulting tree structure. The first and most influential split is based on nationality (
p < 0.001), revealing markedly different expenditure patterns between foreign and national tourists. Among foreign tourists, employment status is the next most important predictor, with retired individuals showing notably higher expenditure levels. For employed foreign tourists, education level further distinguishes expenditure patterns, with university-educated individuals spending more on average.
On the other branch, among national tourists, the initial split is driven by income level (p < 0.001), indicating a strong positive association between income and spending. Subsequent divisions based on employment status and income further refine the classification, highlighting that both economic capacity and labour market position are key factors influencing expenditure behaviour in this subgroup.
This model highlights the hierarchical and interactive nature of sociodemographic variables in shaping tourist spending. The tree structure suggests that policies aiming to stimulate tourism expenditure could benefit from targeting specific subgroups, such as high-income nationals or retired foreign visitors, who demonstrate greater economic impact. A total of eight missing values (“NA”) were identified in the variable A6 Employment Status, and these were excluded from the analysis. Additionally, two unemployed foreign tourists and one unemployed national tourist with a total income above €2100 were removed due to their outlier status and negligible representation.
As a result, the final sample comprises 1654 individuals grouped into seven distinct tourist profiles, defined by combinations of nationality, employment status, income level, and educational attainment:
Cluster 1: Foreign—Retired
Cluster 2: Foreign—Employed—University Education
Cluster 3: Foreign—Employed—No University Education (residual group)
Cluster 4: National—Retired—Total Income > €2100
Cluster 5: National—Employed—Total Income > €2100
Cluster 6: National—Total Income between €1201 and €2100
Cluster 7: National—Total Income up to €1200
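The seven profiles amount to a small set of nested rules. The sketch below encodes them as a function; the argument names and category labels (`"foreign"`, `"retired"`, `"university"`, `"high"`, and so on) are hypothetical encodings chosen for illustration, and records that fall outside the listed profiles return `None`, mirroring the exclusions described in the text.

```python
def assign_profile(nationality, employment, education, income_band):
    """Return the cluster number (1-7) for a tourist record, or None.

    All labels are assumed encodings, not the survey's actual codes:
    nationality in {"foreign", "national"}; employment such as "retired"
    or "employed"; education such as "university"; income_band in
    {"low", "medium", "high"}.
    """
    if nationality == "foreign":
        if employment == "retired":
            return 1
        if employment == "employed":
            return 2 if education == "university" else 3
        return None  # e.g. unemployed foreign tourists, removed from the sample
    if nationality == "national":
        if income_band == "high":
            if employment == "retired":
                return 4
            if employment == "employed":
                return 5
            return None  # e.g. the unemployed high-income national that was removed
        if income_band == "medium":
            return 6
        if income_band == "low":
            return 7
    return None


print(assign_profile("foreign", "retired", "basic", "medium"))
print(assign_profile("national", "employed", "university", "high"))
```

Note that education only matters for employed foreign tourists and employment only matters for high-income nationals, which is the conditional structure that the tree makes explicit.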
Table 9 presents the distribution of individuals across clusters, in terms of both frequency and relative percentage:
Table 10 presents descriptive statistics for each cluster and confirms substantial heterogeneity in spending:
Consistent with prior findings, right-skewed distributions are observed in nearly all clusters (except Cluster 4), suggesting the presence of high but infrequent expenditure outliers. The descriptive pattern again highlights that foreign retirees (Cluster 1) and foreign workers with university education (Cluster 2) are the groups with the highest average total expenditures.
To confirm these differences, both a mean comparison test (after log-transforming the data to correct for skewness) and the non-parametric Kruskal–Wallis test were conducted. In both cases, the results showed statistically significant differences in total expenditure across clusters (
p < 0.05). As shown in
Table 11, post hoc Wilcoxon rank-sum tests revealed that most pairwise comparisons between clusters were also significant:
Notably, there were no significant differences between the following cluster pairs:
- Cluster 1 vs. Cluster 2 (foreign retirees vs. foreign employed with university education).
- Cluster 2 vs. Cluster 4 (foreign employed with university education vs. high-income retired nationals).
- Cluster 3 vs. Cluster 5 (foreign employed with no university education vs. high-income employed nationals).
These findings reinforce the key role of nationality, employment status, education, and income as primary factors shaping tourist spending patterns.
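For readers unfamiliar with the Kruskal–Wallis test used above, its H statistic can be computed from pooled ranks in a few lines. This is a minimal sketch under stated assumptions: the data are made up, only three groups are used rather than seven, no tie correction is applied, and the function names are hypothetical. It is not the procedure or software used in the study.

```python
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (without tie correction) for a list of samples."""
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Made-up expenditures (euros) for three illustrative groups
group_a = [310, 295, 420, 380, 350]
group_b = [190, 210, 175, 230, 205]
group_c = [120, 140, 95, 160, 130]

h_stat = kruskal_wallis_h([group_a, group_b, group_c])

# 5% critical value of the chi-square distribution with 2 degrees of freedom
CHI2_CRIT_DF2 = 5.991
significant = h_stat > CHI2_CRIT_DF2
```

Under the null hypothesis H is approximately chi-square distributed with k − 1 degrees of freedom for k groups, so a value above the critical threshold points to a difference between at least two groups; pairwise rank-sum tests would then identify which ones.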
6. Discussion and Conclusions
This study examined the relationship between sociodemographic variables and tourist expenditure in World Heritage Cities, identifying seven distinct visitor profiles through a CTree model. The results confirmed that nationality is the most influential predictor of expenditure, followed by employment status, income, and education. Foreign visitors, especially retirees and university-educated professionals, exhibited the highest average spending (€316), significantly above that of national visitors (€199). Among domestic tourists, income level and employment status were the main determinants. The analysis also revealed asymmetries in spending patterns across categories, with the highest average expenditures concentrated in accommodation, food, and shopping. These findings suggest that foreign tourism contributes disproportionately to local economies and that certain combinations of demographic attributes define high-value segments more accurately than single variables.
The strength of the CTree model lies in its ability to detect hierarchical and conditional relationships that are often overlooked in traditional segmentation approaches. Unlike linear models or conventional cluster analysis, decision trees identify how specific combinations of characteristics (e.g., foreign + retired + highly educated) influence outcomes. From a methodological perspective, our research contributes by applying a non-parametric, interpretable segmentation technique (conditional inference trees) that complements existing clustering and regression-based approaches. This aligns with a growing body of work that leverages tree-based models to understand complex tourist behaviour (Gardella et al., 2021; Kim et al., 2011).
Our findings echo those of Serrano López et al. (2019), who applied CART and Random Forest models to profile visitors in the World Heritage city of Cuenca (Ecuador). Their study, like ours, emphasizes the value of demographic and expenditure variables in differentiating between tourist types. In their case, the models distinguished a younger, low-budget “backpacker” segment from an older, higher-spending cultural visitor. These results confirm that decision trees are effective not only for prediction but also for providing clear and actionable segmentation schemes that can guide public and private planning, especially in heritage destinations where resource allocation is critical. Our work extends this approach by focusing explicitly on spending behaviour, revealing hierarchical relationships between variables and highlighting expenditure as both an outcome and a profiling criterion.
Pulido-Fernández et al. (2019, 2021) similarly emphasized that spending patterns are shaped by factors such as length of stay, nationality, and type of activities, although their models relied more on correlational methods. Our study advances this line of research by offering an explanatory model that captures interactions among variables, not just associations. Specifically, the CTree method helps destination managers identify not only which attributes matter, but in what order and combination, providing clear segmentation logic.
Our findings align with those of Park et al. (2020), who also identified education, occupation, nationality, and travel purpose as key predictors of tourist expenditure. While their study includes additional behavioural factors, both analyses underscore the importance of sociodemographic variables in explaining spending patterns. Complementarily, Aparicio et al. (2022) showed that tourist expenditure tends to be spatially concentrated in specific zones, forming local economic clusters. While their focus was geographic, our profile-based approach adds a sociodemographic lens to expenditure concentration. Together, both dimensions (who spends and where) are essential to understanding and managing tourism’s economic impact in urban heritage settings.
These insights have practical implications. The resulting segmentation provides a data-driven foundation for tourism planning and marketing, allowing local authorities to prioritize visitor types based on their potential economic contribution. This is particularly relevant for medium-sized heritage cities with limited resources, where selective targeting of high-value segments may optimize returns while minimizing pressure on infrastructure and residents. However, while identifying high-spending segments is valuable, it should not be conflated with sustainability.
Nickerson et al. (2016) suggest that sustainable tourists often exhibit higher spending patterns, signifying a potential compatibility between environmental responsibility and economic return. Yet this association should not be assumed: high expenditure does not inherently indicate sustainability. Several studies highlight that much of tourist expenditure may not remain within the local economy due to economic leakage, particularly in developing destinations where tourism infrastructure is often foreign-owned or reliant on imports (Chaitanya & Swain, 2024). Moreover, destinations experiencing high levels of tourism income also report significant environmental degradation, strain on infrastructure, and declining resident well-being, all symptoms of overtourism (Butler & Dodds, 2022; Claudio et al., 2018).
These findings underscore the risk of equating high visitor expenditure with sustainable tourism. A more nuanced approach is needed, one that balances economic indicators with social and environmental metrics to evaluate tourism’s long-term viability. Additionally, recent evidence suggests that tourists choosing sustainable or eco-oriented options may do so not solely for environmental reasons, but also for ego-enhancing motives such as identity signalling or social status (Beall et al., 2021). This further complicates assumptions about the relationship between spending and sustainability.
7. Limitations and Future Research
While this study offers a robust and interpretable model for understanding tourist expenditure in heritage destinations, several limitations should be acknowledged. First, expenditure data were self-reported by respondents, which may introduce recall bias as well as some degree of under- or over-reporting across spending categories. Second, specific visitor subgroups, particularly foreign tourists with lower education levels, were underrepresented in the sample, which may constrain the robustness of some subgroup comparisons and the broader generalizability of the findings. Third, the empirical analysis is based on two medium-sized World Heritage Cities with specific territorial and tourism characteristics, which limits the external validity of the results beyond this case study. Lastly, although this issue is discussed further below, the dataset reflects a limited time frame and therefore does not fully capture potential seasonal or temporal variation in visitor behaviour.
Beyond these data constraints, although decision tree models such as those used here have proven effective in tourism segmentation (Gardella et al., 2021; Jackman & Naitram, 2023; Kim et al., 2011), our analysis was limited to sociodemographic and economic predictors. Future research could extend this framework by incorporating psychographic, motivational, and lifestyle variables, which are increasingly central to understanding tourist behaviour (Sánchez González et al., 2025; Vargas et al., 2021). Such variables are particularly relevant in heritage tourism, where visitors’ identities, cultural motivations, or personal values may strongly shape both experience and expenditure.
Additionally, digital behaviour variables, such as the use of mobile applications, online booking platforms, and social media engagement, were not included. Recent work highlights the significant role these digital touchpoints play in shaping travel decisions, satisfaction, and expenditure patterns (Minazzi, 2015; Ooi et al., 2023; Vayghan et al., 2023). Incorporating these factors could enhance both the explanatory depth and predictive accuracy of future models.
Finally, the dataset reflects a limited time frame and does not capture potential seasonal or temporal variation in visitor behaviour. Tourist expenditure patterns and their determinants may shift depending on the time of year, external events, or changes in destination conditions. Future research should adopt longitudinal or multi-seasonal designs to examine the temporal consistency of observed patterns and test the stability of identified segments over time. Future research could also benefit from comparative analyses between UNESCO and non-UNESCO cities, in order to assess whether heritage designation is associated with distinctive visitor profiles and expenditure patterns. Collectively, these future directions point to the value of multidimensional, dynamic, and technologically informed approaches to segmentation. Enhancing the current framework through richer variable sets and broader temporal scopes can offer even more actionable and sustainable insights for tourism policy and planning.
Understanding tourist profiles and spending behaviour is key to managing tourism in a way that protects cultural heritage while benefiting local communities. This includes knowing not only where visitors come from or how much they spend, but also the underlying factors that shape those behaviours: demographic traits, motivations, and cultural values.
This knowledge helps answer essential questions: What kind of tourists do we currently attract, and what kind of tourists do we want to attract? Rather than simply adapting to existing demand, destinations should define their desired visitor profiles and strategically develop offerings to align with long-term sustainability goals. It is not enough to tolerate tourists who bring economic returns if their presence harms local society or the environment. Sustainable tourism management must prioritize visitors who respect cultural and natural heritage, balancing competitiveness with social and environmental responsibility.