1. Introduction
In recent years, the relationship between citizens and cultural heritage (CH) has become more and more complex. On the one hand, different socioeconomic barriers are increasingly reducing cultural participation [
1]; on the other hand, ecological activism has identified museums and heritage sites as settings for new forms of protest. These latter events, in particular, have raised the provocative question of whether modern society is more concerned about the “protection of a painting, or the protection of our planet and people” [
2]. These phenomena suggest how a sentimental approach to heritage, rooted in “care”, can help in understanding citizens’ access to cultural heritage from a novel perspective [
3].
In this sense, care is deeply intertwined with key concepts in heritage discourse. Groys, for example, opens his monograph
Philosophy of Care by underscoring the close connection between care, historical significance, identity, memory, and values [
4]. Heritage values, in turn, inform “decisions about what to conserve, how to conserve,” and ultimately what to care for [
5]. However, these values are not universal or inherent to objects, sites, or practices per se; instead, they are attributed by institutions, authorities, and communities [
6]. Moreover, they are multiple [
6] and diverse [
7], as different stakeholders recognise and prioritise different aspects of heritage based on their own perspectives [
6]. Consequently, caring for cultural heritage depends on what is identified as such, a process grounded in individual and community values.
While these contributions help clarify the “subjective” vision of heritage [
7], systematic analyses of visitors’ caring attitudes have received limited attention from heritage literature. Interdisciplinary contributions, particularly from philosophy and moral psychology, may help bridge this conceptual gap.
In fact, the debate on “caring attitudes” emerged in the second half of the 20th century in moral and political philosophy. One of its pioneering scholars is Carol Gilligan, who interrogates the dichotomy between caring for oneself and caring for other people and formalises it in a three-stage framework [8]. This model culminates in a post-conventional stage, in which a balance between care for oneself and care for others is achieved: individuals develop an ethical perspective that integrates personal well-being with the needs of others, emphasising interdependence. However, Gilligan’s “care theory” focuses solely on the role of care in everyday life and is mainly limited to interpersonal relationships.
Although philosophy did not provide a theoretical framework tailored to cultural heritage, Gilligan’s contribution paved the way for other scholars who sought to extend “care theory” to the relationship with objects (though not necessarily cultural artefacts). In
Moral Boundaries, Tronto proposes a model incorporating objects and introducing an essential intergenerational ethical dimension, which is also crucial for the preservation of CH [
9]. At its core, caring represents a vital activity encompassing all efforts to “sustain, enhance, and restore our ‘world’ for optimal living. This world comprises our bodies, identities, and surroundings, which we strive to interconnect within a rich, life-nurturing web”.
In this view, caring transcends the notion of being merely a disposition or “perspective” and is redefined as a collection of practices. The author introduces a pertinent framework for understanding the dynamic between caregiver and care recipient, rooted in the essential idea of responsibility. This framework includes a phase of “care-receiving”, wherein the care recipient actively responds to the support provided. For instance, a “well-tuned piano produces beautiful sounds again, and a patient experiences improved health”. This phase is the sole indicator that the caring needs have been fulfilled.
Further studies have emphasised how caring attitudes are shaped by a range of factors, including attachment, recognition of personhood and uniqueness, a sense of wonder, compassion, awareness of human fragility, and a commitment to justice. Noddings [
10], for example, argues that moral behaviour is grounded in an ethics of care, with empathy playing a central role. Importantly, research shows that empathy is not fixed but can evolve over time, influenced by various triggers [
11,
12]. Storytelling, for instance, can foster empathy and emotional engagement by creating a “feeling of resonance”, which helps, in turn, in hooking visitors’ curiosity [
13]. Similarly, museum experiences [
14] and visual arts themselves have also been shown to enhance empathetic capacity [
15,
16].
Building on this foundation, a recent study by Woodhead proposed a new framework that scrutinises, from a legal and ethical perspective, caring behaviours towards cultural objects, sites, or practices enacted not only by public bodies, scholars, and scientists but also by individuals and communities [17]. In her study, the author cites Tronto and her definition of “care” as both a “disposition, in the sense of being a motivation, based on the strong feeling towards cultural heritage, and [a] process, which reflects the fact that communities may care about cultural heritage and respond to this through caring to it” [17]. On this basis, she highlights (i) the strong relationships between CH and individuals, communities, nations, or humanity, since CH shapes part of their identity [17], and (ii) that every caring behaviour (or “caring attitude”) stems from the recognition of a need and the assumption of the responsibility to meet this need [17].
In addition, the scholar recalls the intimate connection between caring and empathy, noticing that “empathy and sympathy are central to stewards acting with an ethics of care approach” [
17]. This thesis draws on the study by Held [
18] and resonates with recent studies in the cultural heritage domain. In particular, scholarship in this field has delineated a distinct form of empathy known as historical empathy, characterised as a cognitive and affective process involving engagement and identification with figures of the past [
11]. This form of involvement can deepen visitors’ sense of belonging and potentially foster political engagement [
19].
With regard to possible evaluation tools, despite the lack of ad hoc solutions, care theory in cultural heritage also benefits from research in ecology and care for environmental heritage. Kaiser adopted the Rasch model to structure a
General Measure of Ecological Behavior [
20]: this scale is composed of forty items with a yes/no response. In this study, attention is also paid to the observation of volunteering and financial contributions to the environmental cause. The presence of specific associations for the preservation of CH, which also rely on the support of private citizens as volunteers (e.g., in Italy, FAI, Fondo Ambiente Italiano), suggests how these aspects may be included in the measurement of caring behaviour towards CH.
The intimate connection between environmental protection and care for cultural heritage is also stressed in a more recent study by Buonincontri and colleagues [
21]. The authors introduce a framework and the first scale (though not following the Rasch model and not specific to CH caring) for the assessment of sustainable behaviours in heritage sites. Their hypothesis argues that visitors’ experience in these sites and their place attachment may result in more sustainable behaviour, i.e., “specific actions undertaken to reduce or help to recover the damage caused at a specific destination” [
21]. Their scale stems from a robust conceptualisation of sustainable behaviours at heritage sites. Although it does not capture all the elements that emerged in the literature (such as historical empathy, see supra), it is a valid starting point for our research. It differentiates between the following:
General behaviours:
- Civil actions: e.g., paying more taxes, volunteering, and signing petitions for CH protection;
- Educational actions: e.g., reading publications, watching programs, and attending meetings and seminars about heritage issues;
- Financial actions: e.g., donation of money;
- Persuasive actions: e.g., talking with friends and relatives about issues related to CH protection;
- Legal actions: e.g., judicial actions for the enforcement of CH legislation and regulations.
Site-specific behaviours: e.g., adoption of a work of art, engagement in voluntary work or donations to specific heritage sites, and respect for local traditions.
To the best of our knowledge, a similar endeavour is lacking in the domain of curatorship and exhibition design. While participatory museology provides tools and metrics to assess visitor engagement and the impact of cultural experiences (see, e.g., [
22]), both theoretical frameworks and evaluation protocols for assessing visitors’ sense of care remain unexplored. Addressing this gap could prove beneficial for museum practitioners, as visitors’ relationship with CH is also influenced by how they care for cultural objects or sites. By understanding this “latent dimension” (see
infra) of their visitors, cultural institutions may segment their audiences with more precision and define more tailored engagement and participation strategies to guide their actions.
This absence concerns not only theoretical frameworks, which may in part be derived from contributions in other domains, but also robust evaluation tools and specific strategies.
Embracing the evaluative and interdisciplinary vocation that historically characterises the visitor studies domain [
23], this paper presents the first validation study of a new scale on a sample of young adults. The experimental design is based on the Rasch model, which is conceived to understand the latent ability of a population (see
Section 2). Building on this perspective, we reinterpret care theory within the context of cultural heritage by focusing on visitors’ caring attitudes, as expressed through observable behaviours and critical reflection.
In this sense, our study draws substantially on Tronto’s conceptualisation of care as a “collection of practices.” Accordingly, our framework focuses on concrete actions, such as preserving, engaging with, or advocating for heritage, as well as on visitors’ ability to recognise and reflect upon areas of concern within the heritage domain. Moreover, since Tronto’s theory explicitly refers to objects (the off-tune piano in her example), our scale is primarily focused on caring for tangible heritage, particularly art collections in museums. At the same time, items of the scale can also be applied to heritage sites and intangible heritage (primarily performing arts).
This study, therefore, presents the methodology adopted for the definition of the items, recalling the most relevant theoretical contributions on “care theory” for cultural heritage and similar assessment tools proposed in the past, and then analyses the data collected in the first validation (Section 3).
2. Materials and Methods
2.1. Participants
This validation study was conducted on young adults using the first version of the scale. Items were uploaded and answered in Qualtrics, and the questionnaire was given to participants in the context of a broader study on the impact of personal navigational styles on the visit experience to CH sites [
24]. Administration took place on the premises of the University of Bologna (Italy) during late spring–early summer 2024.
Participants in the randomised sample were recruited through social networking sites, word of mouth, and snowball sampling, for a total of 52 surveys. This value is consistent with the guidelines for the definition of Rasch scales for polytomies [
25]. Respondents were university students and young adults (average: 23.31 years old; q0: 18, q1: 21, median: 24, q3: 25, q4: 34 years old); both males (16 participants) and females (36 participants) participated.
They were Italian and had different educational backgrounds, defined here by the last degree achieved: for only 4% of them, the last education cycle completed was middle school, whereas for 36.5%, it was high school. Another 36.5% of the participants had received a bachelor’s degree, while 19% had received a master’s degree. Only 4% of the survey respondents had been awarded a doctorate.
The study was approved by the Ethics Committee of the Department of Psychology of Bologna University (Prot. No. 0008598), in line with the guidelines on human research of the Declaration of Helsinki (1964). All participants gave informed consent before completing the study. Participation was voluntary and without compensation.
2.2. Item Definition
The construction of the scale for the assessment of citizens’ sense of care for cultural heritage was carried out on the basis of the Rasch model. This mathematical model assumes that the likelihood of the response to an item in a survey is influenced by the difficulty of the item itself and the ability of the person in the reference domain. By analysing these responses, it is therefore possible to assess latent ability. A possible workflow for the construction of the scale [
26] is organised in four essential steps:
The identification of the latent variable, which is assumed to be unidimensional;
The creation of observable indicators or items to describe the variable at stake (observation design);
The codification of the responses to the different items (scoring rules, e.g., dichotomous “yes”–“no” or Likert scale answer format);
The elaboration of a model “to link the observed responses to items and people based on their locations on a latent variable scale” [
26].
The definition of possible items corresponding to different difficulty levels was thus the first step of our research. Nonetheless, the two main challenges were the absence of previous evaluation tools specific for the sense of care in CH and the limited literature available in this domain. We identified, as a possible solution, the retrieval of specific aspects connected to care from existing studies on the topic and comparison with other scales for similar purposes (e.g., ecological behaviour and sustainability in CH sites) (see
Section 1). For each dimension, a set of items was put forward and discussed within a focus group of cultural heritage experts. This process allowed us to prepare a first survey, which was submitted for the first validation (see
Section 2.3 and
Section 3 for the analysis).
These resources laid the foundation for the definition of a first set of items, which was discussed with conservators, heritage scientists, and researchers in interaction design and digital cultural heritage, all of whom were from Italy and primarily focused on tangible heritage, in particular movable objects in museum collections. This first validation was conducted as a focus group: as it involved seven participants, it can be classified as a mini-group [27]. It had a double goal.
Firstly, participants were asked to analyse these items and provide feedback about them, as well as suggest new possible caring behaviours. The results of the focus group stress the importance of these three classes of behaviours: (i) visiting cultural sites and institutions and engaging in participatory and/or social activities (e.g., workshops); (ii) “educational actions”, e.g., listening to podcasts dealing with CH issues; and (iii) “civil actions”, in particular volunteering, and “financial” actions.
Secondly, the construction of a Rasch scale requires the definition of difficulty levels associated with each item. From this perspective, focus group participants were asked to complete two different tasks. The first one consisted of organising, in increasing order of difficulty, seven classes of behaviours that emerged in the literature. The results were the following: (i) worrying about CH; (ii) spending time for CH; (iii) visiting at least five cultural institutions (museums, sites, etc.) per year; (iv) spending money on CH; (v) when visiting a cultural institution, looking for additional material to better understand what you are seeing; (vi) devoting your free time to deepening topics related to CH and/or that emerged during a cultural visit; (vii) being an “ambassador” of CH, i.e., inviting people to learn about and visit cultural institutions around you.
These results are aligned with the insights that may be extracted from the existing literature, except for groups (ii) and (iv): personal involvement in terms of effort, time, or money (defined in existing studies as “civil actions” and “financial actions”) is traditionally identified as the most difficult action to put into practice. Participants were therefore asked to fill in a pre-test on the first prototype of the scale. To limit biases connected to self-report, the questionnaire did not take into account visitors’ future intentions or beliefs [28]; instead, the questions (see project’s repository [29]) exclusively investigated behaviours shown in a limited and recent time span.
The limited sample does not make it possible to draw sharp statistical inferences; nonetheless, some general observations can be made. The vast majority of the participants agreed that identifying CH as part of their own identity, visiting CH sites or institutions, and worrying about CH and its preservation are seen as groups of behaviours requiring less effort. More demanding are traditional “educational actions,” both during the visit (e.g., using either free or paid educational material) and, in particular, after the visit. This group is followed by “persuasive actions,” whereas “financial” and “civil” actions emerge as the most difficult to put into practice. On this basis, a first set of items was prepared and underwent the first statistical validation, as described in the following sections.
2.3. Materials
The final survey consists of a core section of sixteen questions. The items are provided here in English (the validation was conducted in Italian), and each of them required an answer on a Likert scale (from 1 to 6). This set is integrated with an informed consent form and an additional set of questions concerning demographic data on age, gender, and educational level (quantified in years).
The core part of the Scale for the Assessment of Caring for Cultural Heritage (CHARE) is described in
Table 1: items are associated with an identifying key, used in the validation analysis (
Section 3), and grouped in classes of behaviours, ordered in a hypothesised increasing order of difficulty. The last column specifies the source supporting the inclusion of this item (the literature is referred to with direct bibliographic citation, while focus group conclusions are mentioned with the acronym “FG”).
In this first validation survey, additional questions were introduced to later assess the concurrent validity of the CHARE questionnaire. Given the scope of our analysis, we hypothesised a possible correlation between a sense of care and artistic interest. To this end, the questionnaire on art interests by Chamorro-Premuzic and Furnham (translated into Italian and adapted to the Italian school system) was used [
30].
This questionnaire consists of three main sections. The first one focuses on respondents’ education and participation in cultural activities and, in the version adapted for this study, consists of ten questions with a dichotomous “yes”–“no” answer format. The second section includes three questions on cultural participation, in which survey respondents describe their habits with a semantic differential scale. Lastly, fourteen questions make up the last section: here, respondents are asked to report if they would manage to recognise a picture from different art-historical periods with a “yes”–“no” answer.
2.4. Rasch Model as a Measure of Caring Attitude
In order to estimate caring attitudes, we used a dichotomous Rasch model for scale measurement. The Rasch model constitutes a simple form of Item Response Theory (IRT). As such, it assumes that the probability of a respondent’s answer to an item can be described as a function of the location of the person on the latent trait and the item difficulty parameter ($\beta_i$). In this way, starting from dichotomised responses, it is possible to estimate person and item locations on a linear scale that represents the latent variable of interest. The difference between the person and item locations can then be used to estimate the probability of the participant answering 1 or 0 to a specific item, thus representing the probability of engaging or not engaging in a specific behaviour. Formally, the dichotomous Rasch model can be expressed in log odds units (logits) as follows:
$$\ln\left(\frac{P_{ni1}}{P_{ni0}}\right) = \theta_n - \beta_i$$
where $P_{ni1}$ represents the probability of a person n positively engaging in item i, $P_{ni0}$ represents the probability of a person n negatively answering item i, and their log odds are given by the difference between the person’s ability on the latent trait ($\theta_n$) and item difficulty ($\beta_i$), expressed on the logit scale. In general, the main assumptions for estimating a Rasch model are as follows:
Unidimensionality: The model estimates a single latent variable, which is sufficient to explain most of the item response variations.
Local independence: There is no substantial association between responses to individual items apart from the latent dimension.
Differential Item Functioning (DIF) and subgroup invariance: Item estimates must be independent from the persons (i.e., the sample) used to estimate them. A test item shows DIF if individuals from different subgroups (e.g., gender, age, urban or rural place of residence, etc.) have different probabilities of answering the item “correctly”, even though they have the same overall ability according to the construct being measured.
The steps we employed to assess the validity of these assumptions are explained below.
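As an illustration of the model described in this section, the following minimal Python sketch computes the probability of a positive response from a person location and an item difficulty, and checks that the corresponding log odds equal their difference. The function names, the symbols theta (person ability) and beta (item difficulty), and the numerical values are illustrative assumptions; they are not taken from the study’s analysis code, which was written in R.

```python
import math


def rasch_probability(theta: float, beta: float) -> float:
    """Probability of answering 1 under the dichotomous Rasch model.

    theta: person location on the latent trait (logits, assumed notation)
    beta:  item difficulty (logits, assumed notation)
    """
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def log_odds(p: float) -> float:
    """Log odds (logit) of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


# Hypothetical person and item, one logit apart.
theta_n = 0.5
beta_i = -0.5

p_positive = rasch_probability(theta_n, beta_i)
print(round(p_positive, 3))            # 0.731
print(round(log_odds(p_positive), 3))  # 1.0, i.e. theta_n - beta_i
```

When the person and the item share the same location, the probability is 0.5; the further the person lies above the item, the more likely the behaviour is to be reported.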
2.5. Statistical Analyses
Analyses were conducted using R [
31], and Rasch model estimation was conducted using the eRm package [
32]. Data and analyses are available on the project’s repository [
29]. In the first step, polytomous responses were dichotomised (Likert responses 1–3: 0; 4–6: 1), following standard procedures [
33,
34]. We started by evaluating the model fit of the initial set of items resulting from the item definition process described above. Items with acceptable fit measures were maintained, and the model was refitted, following an iterative process. Here, we describe model fit and Rasch model testing steps.
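The dichotomisation rule stated above (Likert responses 1–3 recoded as 0, and 4–6 as 1) can be sketched in a few lines of Python. The function name and the example responses are hypothetical and serve only to make the recoding explicit.

```python
def dichotomise(response: int, lowest: int = 1, highest: int = 6, cut: int = 3) -> int:
    """Recode a 6-point Likert response: 1-3 become 0, 4-6 become 1."""
    if not lowest <= response <= highest:
        raise ValueError(f"response {response} is outside {lowest}-{highest}")
    return 0 if response <= cut else 1


# Hypothetical responses of one participant to six items.
likert_row = [1, 3, 4, 6, 2, 5]
binary_row = [dichotomise(r) for r in likert_row]
print(binary_row)  # [0, 0, 1, 1, 0, 1]
```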
2.5.1. Model Fit
Infit and outfit. In the first step, we assessed inlier-sensitive, or information-weighted, fit (infit), as well as outlier-sensitive fit (outfit) statistics. Specifically, we considered both mean-square (MS) and z-standardised infit and outfit statistics to assess item validity. The former represents the amount of distortion of the measurement system and is based on $\chi^2$, with 1 representing the model’s expected value. The latter expresses the corresponding t-test values, answering the question of whether the data fit the model perfectly and therefore representing the improbability of the data under the model, with 0 being the expected value. We followed Wright and Linacre’s guidelines to assess fit measures [35], according to which MS fit values should range between 0.6 and 1.4 for rating scales; values higher than 2 distort the measurement system, while values lower than 0.5 indicate that the corresponding items are less productive for measurement, but not degrading. Similarly, z-standardised values higher than 3 indicate that the values are highly unexpected, therefore indicating that the corresponding items do not fit the model. Conversely, values lower than −2 indicate that the items are too predictable and may indicate the presence of an additional dimension.
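To make the two mean-square statistics concrete, the sketch below computes them for a single item under their usual textbook definitions: outfit as the unweighted mean of squared standardised residuals, and infit as the sum of squared residuals divided by the sum of the model variances. It assumes that person and item locations are already known; the function names and data are hypothetical, and this is not the eRm implementation used in the study.

```python
import math


def rasch_probability(theta: float, beta: float) -> float:
    """Probability of answering 1 under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def item_fit(responses, thetas, beta):
    """Return (infit MS, outfit MS) for one item.

    responses: 0/1 answers of each person to the item
    thetas:    person locations (assumed already estimated)
    beta:      item difficulty (assumed already estimated)
    """
    squared_residuals = []
    variances = []
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, beta)
        squared_residuals.append((x - p) ** 2)
        variances.append(p * (1.0 - p))  # model variance of a 0/1 response
    # Outfit: plain average of squared standardised residuals.
    outfit = sum(r / w for r, w in zip(squared_residuals, variances)) / len(responses)
    # Infit: residuals weighted by the information each person provides.
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit


# Hypothetical data: five persons answering one item of difficulty 0.
thetas = [-1.0, -0.5, 0.0, 0.5, 1.0]
responses = [0, 0, 1, 1, 1]
infit_ms, outfit_ms = item_fit(responses, thetas, beta=0.0)
print(round(infit_ms, 3), round(outfit_ms, 3))
```

Because outfit is unweighted, a single unexpected answer from a person located far from the item inflates it much more than it inflates infit.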
Discrimination. We assessed items’ discrimination by calculating point-biserial correlations between each item and each person’s raw score sum on every other item. Point-biserial correlations are an adaptation of Pearson’s correlation for dichotomous responses [36]. Specifically, we used eRm’s $T_{pbis}$ method, which tests whether one-sided correlations between one item and every other item are too low, possibly suggesting that the item discriminates differently compared to other items.
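The descriptive quantity behind this check can be sketched as follows: for a given item, the point-biserial coefficient is the Pearson correlation between the 0/1 item responses and the raw score on all remaining items. The code below computes only this coefficient on hypothetical data; it does not reproduce the sampling-based significance test performed in eRm, and all names are illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def item_rest_correlation(data, item):
    """Point-biserial correlation between one item and the rest score.

    data: list of persons, each a list of 0/1 responses
    item: column index of the item under scrutiny
    """
    item_scores = [row[item] for row in data]
    rest_scores = [sum(row) - row[item] for row in data]  # score on all other items
    return pearson(item_scores, rest_scores)


# Hypothetical response matrix: four persons, three items.
data = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
print(round(item_rest_correlation(data, 0), 3))  # 0.707
```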
2.5.2. Rasch Model Assumptions Testing
Unidimensionality. We assessed the one-dimensionality of the model using different approaches. First, we performed principal component analysis (PCA) on the matrix of inter-item correlations of the standardised residuals produced by the model. The PCA produces components, referred to as contrasts. If the first contrast’s eigenvalue is small (usually less than 2), it can be considered to be at the noise level, and the hypothesis that the residuals are random noise is not falsified. Otherwise, the residuals’ loadings indicate that there are contrasting patterns in the residuals [37,38]. The second approach we used to assess one-dimensionality was the Martin–Löf test (MLoef) [39], a likelihood ratio test of whether the items measure the same one-dimensional latent construct. Finally, we followed up by conducting a nonparametric test for multidimensionality ($T_{md}$) [40], in which the items are divided into two subscales whose scores must be positively associated. Again, a low correlation indicates low discrimination and/or multidimensionality between items.
Local dependence/independence. Local dependence was assessed first by using the nonparametric $T_1$ method [41], which considers all possible item pairs and counts cases with equal responses to both items. By doing so, this method checks whether extreme response patterns occur more often than expected by the Rasch model [40]. We also used the nonparametric $T_{11}$ method [41], which calculates the sum of absolute deviations between the observed inter-item correlations and the expected correlations, thus checking for local stochastic independence.
DIF and subgroup invariance. To test for measurement invariance, we employed Andersen’s Likelihood Ratio Test (LRT) [42], the nonparametric Ponocny’s $T_{10}$ test [41], as well as the Wald test to detect DIF in single items [43]. For all tests, we split the sample into subgroups, first based on participants’ gender and then based on their ability (i.e., scoring lower and higher than the median). In order to confirm the assumption of subgroup invariance, $\beta$ estimates (i.e., item difficulty) should be similar for both subgroups, for each splitting criterion.
We sampled 500 matrices for all the nonparametric tests described in this section and set the same seed for reproducibility of results.
2.5.3. Model Reliability
In general, reliability is expressed as the quotient of true variance over observed variance and represents the level of reproducibility of measures.
Item and person separation. A good questionnaire developed following the Rasch approach should produce acceptable item and person separation reliability scores. The former is based on the variance of item difficulties ($\beta$), while the latter is based on the variance of individuals’ abilities ($\theta$).
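One common way of writing separation reliability, consistent with the definition of reliability as true over observed variance, is sketched below. The symbols are assumptions introduced here for illustration ($\sigma^2_{\text{obs}}$ for the observed variance of the estimated locations and $\overline{SE^2}$ for the mean of their squared standard errors), and software packages may differ in details such as the variance estimator.

```latex
R_{\text{sep}}
  = \frac{\sigma^2_{\text{true}}}{\sigma^2_{\text{obs}}}
  = \frac{\sigma^2_{\text{obs}} - \overline{SE^2}}{\sigma^2_{\text{obs}}}
```

Applied to the item difficulties this gives an item separation reliability, and applied to the person abilities a person separation reliability; values close to 1 indicate that the differences between estimated locations are large relative to their measurement error.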
Internal consistency. Internal consistency was assessed by calculating the KR-20 [44], which is analogous to Cronbach’s $\alpha$ for dichotomous responses and is based on raw score variance.
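For reference, the KR-20 coefficient can be computed from a matrix of dichotomous responses as k/(k − 1) multiplied by one minus the ratio between the summed item variances p(1 − p) and the variance of the raw scores. The following Python sketch applies this formula to hypothetical data; the use of the population variance and all names are assumptions made for illustration.

```python
def kr20(data):
    """Kuder-Richardson 20 for a persons-by-items matrix of 0/1 responses."""
    n_persons = len(data)
    n_items = len(data[0])
    # Variance of the raw (total) scores, using the population formula.
    totals = [sum(row) for row in data]
    mean_total = sum(totals) / n_persons
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_persons
    # Sum of item variances p * (1 - p), with p the proportion of 1s per item.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in data) / n_persons
        sum_pq += p * (1.0 - p)
    return (n_items / (n_items - 1)) * (1.0 - sum_pq / var_total)


# Hypothetical response matrix: four persons, three items.
data = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(kr20(data), 3))  # 0.75
```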
Concurrent criterion validity. To assess the test’s concurrent validity and predictive power, we compared people’s caring attitudes estimated by our CHARE model with their scores on the arts interest questionnaire. As described in Section 2.3 above, this questionnaire is composed of three sub-domains investigating different aspects of people’s arts interest and art knowledge. Given the characteristics of the three subscales, we hypothesised caring attitude to be positively associated with the two art interest dimensions whose items investigate people’s engagement in cultural and art-related activities (artistic experiences and artistic activities). Conversely, we did not expect caring attitudes to be directly related to people’s level of knowledge about artistic movements and styles (art recognition). To test this, we first used simple Pearson correlations to quantify the relationship between the caring estimate and the three art interest measures. In the second step, we included the three art interest measures in a multiple regression model in order to observe which of these predicted caring attitude, expecting artistic experiences and artistic activities, but not art recognition, to predict caring.
3. Results
After item dichotomisation, we observed that all responses to item E3 [“In the last year, did you volunteer for the cultural sector (e.g., Italian FAI, guide for a local institution, Friends of a museum etc.)?”] were 0, meaning that, in this sample, the item provides no information for estimating the behaviour. It would simply be interpreted as too difficult, given that no one reported engaging in this specific behaviour. As such, we excluded it from the estimation. Nonetheless, future studies with broader and more heterogeneous samples may lead to different answers to this item (see infra): therefore, it may still be included in the future.
Moreover, people scoring zero on all items and people with perfect responses (scoring all 1) were excluded from model estimation and model fit analyses, given that these responses are unproductive for measurement. Conversely, caring attitude was estimated for them as well when assessing concurrent criterion validity.
3.1. Rasch Model Estimation
We proceeded by estimating the Rasch model on the remaining 15 items. To assess item adequacy, we first looked for items with high MS outfit and infit statistics, which would distort the measurement system. Item C3 [“In the last year, during a visit, did you try to use paid informational materials (e.g., audio guides or paid smartphone apps)?”] presented high MSQ values (Outfit MSQ = 2.852; Infit MSQ = 1.558). Moreover, the point-biserial test confirmed that C3 could be problematic: it was the only item with a significant result (p < 0.001), meaning that its correlation with the rest of the items was significantly low and indicating poor discrimination.
Therefore, we chose to exclude this item and refit the model with the remaining 14 items to see whether the exclusion of C3 would recentralise the remaining items. However, all “financial actions” items (E1, E2, and E4) still presented high overfit, expressed as very low MSQ values (E1: Outfit MSQ = 0.258, Infit MSQ = 0.700; E2: Outfit MSQ = 0.311, Infit MSQ = 0.883; E4: Outfit MSQ = 0.097, Infit MSQ = 0.589). On the other hand, C2 presented borderline Outfit MSQ values (=1.407). Point-biserial correlations were not significant for any items, apart from item C2 (p < 0.05). Although the three “financial action” items did not show significant point-biserial correlations, we chose to exclude them from the model, given that maintaining these items could produce inflated reliabilities. Conversely, we chose to maintain C2 to see whether its fit measures would recentralise after excluding the financial items.
We thus re-estimated the model with the remaining 11 items. This reduced item set produced good overall fit measures (see Table 2), confirmed by the z-standardised statistics summarised in Figure 1. After the possibly problematic items were eliminated, item C2 presents improved fit statistics. On the other hand, point-biserial correlations are acceptable for all items except C2, which still shows a significantly low correlation (p < 0.05). This indicates that the item does not discriminate as well as the other items and may be measuring a different dimension. However, given the good fit measures of the item and the theoretical importance of “educational” behaviours for the development of a sense of care towards cultural heritage, we chose to maintain this item [21].
The person fit statistics presented an average mean-square infit of 0.987 (SD = 0.179) and average infit t of 0.094 (SD = 0.635). Overall, no participant exhibited significant misfit (z-value > 1.96), indicating that no one deviated significantly from the expected responses. Therefore, all participants (n = 44, after excluding zero and perfect responses) were included in the analyses.
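The mean-square fit statistics used throughout this subsection can be illustrated with a short computation. The following is a minimal sketch, not the procedure used in this study: it assumes that person parameters (`thetas`) and item difficulties (`betas`) have already been estimated, and all names and example values are hypothetical. Outfit is the unweighted mean of squared standardised residuals, whereas infit weights squared residuals by the model variance.

```python
import math

def rasch_prob(theta, beta):
    """Probability of a positive response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

def item_fit(responses, thetas, betas):
    """Return one (outfit_msq, infit_msq) tuple per item.

    responses[n][i] is the 0/1 answer of person n to item i.
    thetas and betas are assumed to be previously estimated parameters.
    """
    fits = []
    for i, beta in enumerate(betas):
        z2_sum = 0.0      # sum of squared standardised residuals
        sq_res_sum = 0.0  # sum of squared raw residuals
        var_sum = 0.0     # sum of model variances p * (1 - p)
        for n, theta in enumerate(thetas):
            p = rasch_prob(theta, beta)
            var = p * (1.0 - p)
            res = responses[n][i] - p
            z2_sum += res * res / var
            sq_res_sum += res * res
            var_sum += var
        outfit = z2_sum / len(thetas)   # unweighted mean square
        infit = sq_res_sum / var_sum    # information-weighted mean square
        fits.append((outfit, infit))
    return fits

# Hypothetical example: one item answered by four persons.
# The person at theta = -3 gives an unexpected positive response.
example = item_fit([[1], [1], [0], [1]], [-3.0, 0.0, 0.0, 3.0], [0.0])
print(example)
```

In this toy example the single unexpected response from a person far from the item difficulty inflates outfit more than infit, which is why outfit is usually described as the more outlier-sensitive of the two statistics. Values well above 1 indicate underfit and values well below 1 indicate overfit.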
3.2. Rasch Model Testing
Having obtained a set of items with satisfactory fit statistics, we proceeded with testing Rasch model assumptions and reliability.
3.2.1. Unidimensionality
As a first check of the unidimensionality assumption, we conducted a principal component analysis (PCA) of the residuals. As displayed in Figure 2, the first contrast has an eigenvalue of 2.2, exceeding the threshold of 2.0 above which correlations among residuals may follow a non-random pattern, possibly indicating multidimensionality.
Following up on the previous analysis, we ran a Martin–Löf (ML) likelihood ratio test on 500 sampled matrices. Using a median split criterion, the test result was non-significant (p = 0.95), strongly indicating that the assumption of unidimensionality was met. Further confirming this, the test for multidimensionality, also conducted on 500 sampled matrices, showed a non-significant result (p = 0.95).
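The residual PCA described at the start of this subsection can be sketched as follows. This is an illustrative example under stated assumptions, not the software routine used in the study: the input is a hypothetical matrix of standardised residuals (one column per item), and the largest eigenvalue of the inter-item residual correlation matrix is obtained by power iteration, a simple stand-in for the full eigendecomposition that statistical packages perform.

```python
import math

def correlation_matrix(columns):
    """Pearson correlations between columns (lists of equal length).

    Columns must not be constant, otherwise the correlation is undefined.
    """
    def corr(x, y):
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx = sum((a - mx) ** 2 for a in x)
        syy = sum((b - my) ** 2 for b in y)
        return sxy / math.sqrt(sxx * syy)
    return [[corr(ci, cj) for cj in columns] for ci in columns]

def largest_eigenvalue(matrix, iterations=500):
    """Dominant eigenvalue of a symmetric positive semi-definite matrix."""
    size = len(matrix)
    # Slightly uneven start vector, so it is unlikely to be orthogonal
    # to the dominant eigenvector.
    vec = [1.0 + 0.01 * i for i in range(size)]
    value = 0.0
    for _ in range(iterations):
        nxt = [sum(matrix[r][c] * vec[c] for c in range(size))
               for r in range(size)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
        # Rayleigh quotient of the normalised vector.
        value = sum(vec[r] * sum(matrix[r][c] * vec[c] for c in range(size))
                    for r in range(size))
    return value

# Hypothetical standardised residuals for three items and five persons.
residual_columns = [
    [0.5, -1.2, 0.3, 0.9, -0.4],
    [0.4, -1.0, 0.1, 1.1, -0.6],
    [-0.8, 0.7, 1.3, -0.2, -1.0],
]
first_contrast = largest_eigenvalue(correlation_matrix(residual_columns))
print(first_contrast, first_contrast > 2.0)
```

The printed eigenvalue would then be compared with the 2.0 threshold mentioned above. With uncorrelated residuals every eigenvalue equals 1, so larger values signal shared variance that the Rasch dimension has not absorbed.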
3.2.2. Local Independence
We used two tests to assess local stochastic independence, an item-level test and a global test, both conducted on 500 sampled matrices. Specifically, the first test revealed that items B1 and B2 showed significant one-sided p-values (p < 0.01), suggesting that these items violate the independence assumption. To follow up on this analysis, we conducted the global test on the whole set of items. Notably, this result was non-significant (p = 0.606), indicating that the overall set of items does not violate the independence assumption.
3.2.3. Subgroup Invariance and DIF
To evaluate the assumption of subgroup homogeneity, we examined whether items were more or less difficult for males compared to females, as well as for individuals with raw scores below or above the median. Ponocny’s test showed a non-significant result for both splitting criteria (median split p = 0.36; gender split p = 0.078), suggesting the absence of DIF. Andersen’s LR test further confirmed this, returning non-significant values for the two splitting criteria (median split p = 0.338; gender split p = 0.064). Taken together, the tests give no indication that item difficulty differs between subgroups. Nevertheless, we further checked how single items behaved across subgroups. The DIF plot displayed in Figure 3 allows visual inspection of whether item difficulty differed based on the two splitting criteria. Red ellipsoids indicate 95% confidence intervals of the estimates in both dimensions. Items with ellipsoids that do not touch the identity line can be considered to show DIF. Item C1 seems to be more difficult for people with higher raw scores; item C2 appears to be consistently easier for females than for males, while item B3 shows the opposite pattern.
Wald’s test results quantify the presence of DIF for these items. Confirming the graphical results, item C1 is more difficult for people with higher raw scores (z = 2.331, p < 0.05), while item C2 appears to be more difficult for males than females (z = −2.118, p < 0.05). Conversely, item B3 is more difficult for females (z = 2.281, p < 0.05). All other items’ difficulties result in non-significant values, indicating the absence of DIF.
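The item-level Wald statistics reported above compare the difficulty of one item as estimated separately in two subgroups. The sketch below shows the general form of such a comparison under simplifying assumptions: the function names and the example inputs are hypothetical, the sign convention of the study is not known, and the two-sided p-value relies on the standard normal approximation.

```python
import math

def wald_dif_z(beta_group1, se_group1, beta_group2, se_group2):
    """z statistic for the difference between two subgroup difficulty estimates."""
    return (beta_group1 - beta_group2) / math.sqrt(se_group1 ** 2 + se_group2 ** 2)

def two_sided_p(z):
    """Two-sided p-value of a z statistic under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical difficulty estimates and standard errors for one item.
z = wald_dif_z(1.0, 0.3, 0.0, 0.4)
print(z, two_sided_p(z))

# The z values reported in the text all fall below the 0.05 level.
for reported_z in (2.331, -2.118, 2.281):
    print(reported_z, two_sided_p(reported_z) < 0.05)
```

Because the statistic only uses the two estimates and their standard errors, items with few responses in one subgroup will have large standard errors and correspondingly low power to detect DIF.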
3.3. Reliability
After assessing model fit and checking model assumptions, we investigated the reliability and concurrent validity of the scale. The overall internal consistency was acceptable (KR-20 = 0.74), indicating that the CHARE scale yields reliable estimates of caring behaviour.
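For readers unfamiliar with the KR-20 coefficient, a minimal sketch of its computation on dichotomous data is given below. The response matrix is hypothetical, and the use of the population variance of total scores is an assumption of this example (conventions differ between software packages).

```python
def kr20(responses):
    """Kuder-Richardson 20 for a persons x items matrix of 0/1 answers."""
    n_persons = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_persons
    # Population variance of total scores (an assumption of this sketch).
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_persons
    pq_sum = 0.0
    for i in range(k):
        p = sum(row[i] for row in responses) / n_persons  # share of positive answers
        pq_sum += p * (1.0 - p)
    return (k / (k - 1)) * (1.0 - pq_sum / var_total)

# Hypothetical answers of four persons to three items.
answers = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(answers))  # 0.75
```

The coefficient increases when the variance of total scores is large relative to the summed item variances, that is, when items tend to be endorsed together.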
Overall, the model presented a moderate separation reliability of 0.69, meaning that the scale can distinguish between at least two levels of general ability on the latent scale, while further refinement of items and additional items may be needed to improve measurement precision. On the other hand, an item separation reliability of 0.72 also indicates moderate reliability. This means that the test can differentiate items across a reasonable difficulty range but may not cover the entire spectrum of difficulty. Reliability indices are summarised in
Table 3.
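One common way to translate a separation reliability into a number of distinguishable levels is the separation index G = sqrt(R / (1 − R)) together with the strata formula (4G + 1) / 3. The text does not state how the "at least two levels" interpretation was obtained, so the following sketch should be read as an assumption about the convention rather than a description of the analysis.

```python
import math

def separation_index(reliability):
    """Separation index G corresponding to a separation reliability R."""
    return math.sqrt(reliability / (1.0 - reliability))

def strata(reliability):
    """Number of statistically distinct levels, (4G + 1) / 3."""
    return (4.0 * separation_index(reliability) + 1.0) / 3.0

# Reliability values reported in the text.
print(round(strata(0.69), 2))  # 2.32
print(round(strata(0.72), 2))  # 2.47
```

Under this convention both reported reliabilities correspond to slightly more than two strata, which matches the interpretation given above.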
The distribution of item difficulties can be visualised through the Item Characteristic Curves (ICCs, Figure 4) and the Person–Item Map (PI map, Figure 5). ICCs represent the estimated probability of engaging in the behaviour described by an item as a function of an individual’s position on the estimated latent trait (i.e., caring attitude for CH). Overall, the items seem to be well distributed on the latent dimension, although some ICCs (A2–A3 and B3–B1) seem to overlap, indicating that they provide the same information. PI maps, in turn, are useful for comparing the range and position of the item measure distribution in ascending order of difficulty (lower panel of Figure 5) with the range and position of the person measure distribution (upper panel of Figure 5). Items should ideally be located along the whole scale to meaningfully measure the ‘ability’ of all persons [46].
The person–item map reveals that items A2–A3 and B3–B1, although very close, do not completely overlap, so they may all be considered informative for measurement. Overall, the items appear to be well distributed across difficulty levels, although the logit range they cover is narrow, spanning approximately −1 to 1, and therefore does not adequately target participants with lower or higher caring estimates. This could be due to the exclusion of more difficult items, such as those involving financial investment. Moreover, some gaps between item difficulties exist, especially in the range between 0.4 and 1 logits.
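The consequence of a narrow difficulty range can be made concrete with the Rasch item characteristic curve and the information it carries. The sketch below is illustrative only: the eleven item difficulties are hypothetical values spread evenly between −1 and 1 logits, not the estimates obtained in this study.

```python
import math

def icc(theta, beta):
    """Rasch item characteristic curve: P(positive response | theta, beta)."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

def item_information(theta, beta):
    """Fisher information of one dichotomous Rasch item at theta."""
    p = icc(theta, beta)
    return p * (1.0 - p)

def scale_information(theta, betas):
    """Total information of a set of items at theta."""
    return sum(item_information(theta, b) for b in betas)

# Hypothetical difficulties: 11 items evenly spaced from -1 to 1 logits.
hypothetical_betas = [-1.0 + 0.2 * i for i in range(11)]

for theta in (-3.0, 0.0, 3.0):
    print(theta, round(scale_information(theta, hypothetical_betas), 3))
```

Information peaks where person and item locations coincide and falls off towards the extremes, so persons located far outside the interval covered by the items are measured with larger standard errors. Two items with nearly identical difficulties also have nearly identical curves and therefore add precision at the same location rather than extending coverage.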
Lastly, a good model estimated using dichotomous Rasch modelling should yield normally distributed person estimates (upper panels in Figure 5 and Figure 6). Person parameter estimates appear to be normally distributed (Shapiro–Wilk: W = 0.96, p = 0.11), with a mean of 0.03 and a standard deviation of 1.4.
3.4. Concurrent Criterion Reliability
To assess concurrent criterion reliability, we first computed Pearson correlations between the three dimensions of the arts interest scale, namely artistic experiences (Art. Exp), artistic activities (Art. Act), and art recognition (Art. Rec), and the caring attitude score estimated with the CHARE scale on the whole sample, including the zero and perfect scores previously excluded to estimate the model (Table 4). All three dimensions showed moderate positive correlations with the caring attitude score (p < 0.001).
We then included all variables in a multiple linear regression model to estimate whether the three artistic interest sub-dimensions related differently to the caring attitude estimate. As evidence of concurrent criterion validity, we expected caring attitude to be predicted by artistic experiences and activities but not by art recognition. Results are displayed in Table 5: the model was statistically significant (F(3, 48) = 13.03, p < 0.001) and explained 44.9% of the variance in caring attitudes estimated with the proposed CHARE scale (R² = 0.449, adjusted R² = 0.414). As expected, artistic experiences were a significant predictor of caring attitudes; contrary to expectations, artistic activities were not. Art recognition, as predicted, was not significant (p = 0.052), although it showed a marginal relationship with caring attitudes. Lastly, the assumptions of linear regression were evaluated and found to be adequately met. Residuals were approximately normally distributed (Shapiro–Wilk: p = 0.776). Homoscedasticity was confirmed, indicating constant variance of residuals (Breusch–Pagan: p = 0.848). Additionally, there was no significant autocorrelation in the residuals (Durbin–Watson: p = 0.053).
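The reported regression summary statistics are linked by standard identities, which can be checked with a few lines of code. In the sketch below, the sample size n = 52 is inferred from the reported degrees of freedom, F(3, 48), and the function names are illustrative.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n observations and p predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def f_statistic(r2, n, p):
    """Overall F statistic of a linear model expressed through R-squared."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))

# Values taken from the text: R-squared = 0.449, three predictors,
# residual degrees of freedom 48, hence n = 48 + 3 + 1 = 52.
n_obs, n_predictors, r_squared = 52, 3, 0.449
print(round(adjusted_r2(r_squared, n_obs, n_predictors), 3))  # 0.415
print(round(f_statistic(r_squared, n_obs, n_predictors), 2))  # 13.04
```

Both results agree with the reported values (adjusted R² = 0.414; F = 13.03) up to the third significant digit, a difference compatible with R² having been rounded to three decimals before this recomputation.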
4. Discussion
The core hypothesis of this research was that the sense of care towards CH can be considered a latent ability and can therefore be assessed using a Rasch model. The results of this first validation study support this hypothesis. Responses to the first set of items (see Table 1) show that the CHARE scale demonstrates acceptable reliability. In the overall item set, the unidimensionality (as per the Martin–Löf test) and local independence assumptions for Rasch model estimation are met. Furthermore, the global tests indicate that item difficulties do not vary significantly between subgroups; nonetheless, some items show evidence of differential item functioning (DIF), suggesting that individual differences and gender may influence individual tendencies towards cultural heritage caring.
The scale shows moderate correlations with the three artistic dimensions—art experience, art activity, and art recognition—indicating a relationship between caring and these factors. However, only art experience emerges as a significant predictor of caring among these. This finding is consistent with the theoretical framework of the scale, as art experience (i.e., personal involvement and engagement with art) is more likely to reflect an individual’s sense of care towards CH. While art recognition may correlate with caring, it theoretically has little direct impact on caring attitudes, reinforcing the idea that the scale captures the most relevant aspects. Overall, the results provide evidence for the concurrent validity of the caring scale, with the strongest support coming from its relationship with art experience.
Furthermore, the current results show that the theoretical assumption that caring for cultural heritage may be considered a single latent dimension is generally upheld by the converging diagnostics. However, future studies may explore alternative structures and test whether a one-dimensional structure adequately captures the complexity and multi-faceted nature of caring for cultural heritage.
Compared to the initial set of items (see Table 1), five questions had to be excluded: C3 and the entire last group of behaviours, hypothesised as the most difficult one (E1, E2, E3, and E4). These questions correspond to what the literature defines as “civil” and mainly “financial” actions. Their exclusion may be influenced by the sample chosen for the validation of this scale, which is composed of very young adults. This pattern is consistent with major socioeconomic trends in the post-pandemic scenario. On the one hand, the “culture sector experienced significant decline during the COVID-19 pandemic, with […] an estimated revenue loss amounting to 20% to 40%” [47]. On the other hand, the “pandemic disproportionately affected young people in relation to their […] social development, impacting their employment prospects and future prosperity” [48].
Indeed, young adults—especially in post-pandemic socioeconomic conditions—often have limited financial capacity and are less likely to donate monetarily, instead preferring time or experiential forms of engagement [
49]. Moreover, intergenerational research highlights that older cohorts exhibit stronger donation behaviour grounded in generativity, which younger people may not yet display [
50]. Therefore, asking this population about their dedication to CH based on money or time investment may appear inappropriate.
Additionally, the sample involved in this validation study is not segmented into other possibly relevant subgroups. As mentioned in the Introduction, the recognition of an object, site, or practice as cultural heritage; the motivation supporting its care; and, in general, heritage values themselves are greatly dependent on the cultural, social, and economic background of a community. In this pilot study, our analysis investigated a national context (Italy) without gathering further information about the origin of the participants, i.e., whether they came from rural or metropolitan areas or from the North or the South of the country. These factors should therefore be considered as variables in future iterations of this study.
By the same token, the primary focus on young adults also introduces possible biases connected to the shifting baseline syndrome. Drawing on the seminal work by Pauly [51], Spennemann [52] observes that each generation tends to perceive the world they grew up in as the norm, often disregarding historical reference points valued by previous generations. In the cultural heritage domain, this suggests that heritage values, and even what is recognised as significant and thus worth caring for, can be substantially redefined over time. This phenomenon can affect the attitudes assessed by the CHARE scale. Expanding the evaluation sample to include other age groups is therefore necessary to mitigate this risk and ensure broader applicability of the scale.
As a result, the specificity of this sample population appears to be a limitation of this study, particularly concerning its implementation in research contexts involving more diverse audiences. Future studies should be conducted to extend this validation analysis with a heterogeneous and more representative sample in terms of both cultural and socioeconomic background. These iterations with populations holding different heritage values could enhance the generalisability of our conclusions and support their applicability to other forms of CH, particularly those related to intangible heritage. This process may also prompt a redefinition or refinement of certain items to better reflect these specificities.
These limitations suggest that these preliminary findings hold in a specific context, i.e., for a given population, on a national scale, and based on a definition of “cultural heritage” that primarily refers to museum collections. Nonetheless, this work offers a novel contribution to the museological debate by investigating the implications of care theory for how visitors relate to cultural heritage. In particular, our study extends the moral findings by Tronto [9], the legal reflection by Woodhead [17], and Buonincontri et al.’s study on sustainable tourism [21] to the cultural heritage discourse and represents the first attempt to fill an existing literature gap. Furthermore, this study’s scope is not limited to a theoretical contribution; rather, it operationalises its findings in a preliminary evaluative tool at the disposal of museum and heritage practitioners.
5. Conclusions
Although “care theory” is emerging as a novel perspective to understand the relationship between citizens and CH, tools for its assessment currently need to be improved. This study proposes the first attempt to fill this gap, introducing a new scale, CHARE, which relies on the Rasch model. This manuscript, in particular, discusses its validation on a sample of young adults.
In Section 2, an initial set of sixteen items is derived from the existing literature on the topic and a focus group discussion with heritage experts. It is subdivided into five distinct groups of behaviours, arranged in increasing order of difficulty. Then, statistical validation methods are presented, and the full analysis is described in Section 3. The study shows that the CHARE scale demonstrates acceptable reliability, unidimensionality across its items, and local independence, while differences in item difficulty between subgroups are generally not significant. These results support the core assumptions underlying the application of the Rasch model.
Compared to the initial set of items, the validation study excluded five of them (items C3, E1, E2, E3, and E4); in particular, an entire group is excluded. This group pertained to behaviours requiring money and time investment for the CH cause. Their exclusion may be connected to the reference population used in this validation, for whom similar requests might be inappropriate. In fact, in the current state, the specificity of the sample is the main limitation for implementing CHARE on a large scale with more diverse target audiences. For this reason, it is crucial to extend this tool’s validation to a more heterogeneous population and consider extending the set of items to improve the difficulty range and item reliability.
The finalisation of this scale and its validation on different target audiences will be significant in various fields. For instance, in education, it can help identify groups with a low sense of care and guide the development of tailored educational programs (particularly for the secondary and tertiary education cycles, as the validation in this study was performed on young adults). Similarly, in public policy, this tool can support decision-making in cultural heritage management. Through CHARE, communities or demographic segments exhibiting detachment from heritage can be identified. On this basis, policymakers can more effectively allocate resources to initiatives that promote inclusion, participation, and social cohesion [
53]. Additionally, repeated administration of the questionnaire can contribute to monitoring the impact of policy interventions over time.
Lastly, the scale holds significant potential for museum professionals and researchers within the visitor studies domain. The design and curation of (digital) cultural heritage experiences increasingly require robust evaluative tools to assess not only user satisfaction but also deeper attitudinal and cognitive engagement. This growing attention is also reflected at the European level, with multiple Horizon projects addressing this challenge, such as GIFT [
54], SPICE [
55], and PERCEIVE [
56]. As an example, the last of these (https://perceive-horizon.eu/ (accessed on 1 August 2025)) focuses on exploring, also through interactive exhibitions, visitors’ sense of care towards a variety of cultural objects, such as sculptures, paintings, photographs, textiles, and born-digital artworks. In this context, CHARE can serve as a pre-test to understand visitors’ initial caring attitudes towards cultural heritage. If combined with a post-experience questionnaire, the scale can provide valuable insight into the effectiveness and the cognitive–emotional impact of user experiences in cultural contexts [
57].