Virtual Reality Applied to Heritage in Higher Education—Validation of a Questionnaire to Evaluate Usability, Learning, and Emotions

Abstract: Cultural heritage is one of the areas where Extended Reality is having a significant impact nowadays. Although often associated with entertainment, this technology has enormous educational potential when applied to heritage. Therefore, it is essential to implement monitoring tools in educational practice to assess its actual effectiveness. This article presents the process of generating and validating a statistical data collection instrument developed to evaluate a virtual reality experience created using the archaeological heritage of the ancient Roman city of Augusta Emerita (Mérida, Spain). It can be easily adapted to evaluate similar experiences. The aim is to gauge the effectiveness of these experiences as a didactic resource. The questionnaire was subjected to an evaluation of its three dimensions. Content validity was analyzed through expert judgments, while applicability was tested by students. Finally, a series of statistical tests were conducted to verify construct reliability and internal consistency. Based on the results obtained and cross-referenced with the data provided by the literature, the suitability of this tool for collecting data on usability, learning, and emotions in virtual reality experiences is confirmed.


Introduction
The dizzying technological advances of recent years need to be accompanied by practical applications that improve people's quality of life. In the case of digitization, modeling, and 3D visualization (understood as extended reality), the dissemination of cultural heritage is one of the areas where they are having the greatest impact, especially when applied to historic buildings and sites. Beyond this more playful aspect, however, the educational potential of extended reality experiences applied to heritage is enormous.
Heritage-related educational programs have experienced a rapid increase worldwide since the 1990s [1,2], although it was not until the beginning of this century that their didactic potential was transferred to the classroom [3]. Thus, museum pieces, historical buildings, traditions, customs, archaeological remains, or popular stories become highly effective instruments in the teaching-learning process [4]. European policies have also helped to expand this concept, especially following the framework agreement signed in Faro (Portugal) in 2005 [5] and ratified in 2021. In harmony with these lines of action, heritage education is gaining importance within the Spanish educational system [6] in such a way that knowledge, respect, valuation, and conservation of heritage are part of the exit profiles of each stage under the new educational law (LOMLOE).
Therefore, it is essential to implement the use of heritage elements, in general, and archaeological heritage, in particular, in subjects like social sciences and history in the degrees of Early Childhood and Primary Education, due to its crucial role as one of the most significant pedagogical resources within these disciplines [7] and its contribution to the construction of social and cultural identities [8]. However, following Gómez-Carrasco et al. [9], even today, lectures, textbooks, and rote exams predominate in the training of future teachers at university. In the constant search for improving the teaching-learning process, various alternatives have emerged over the past few decades to complement traditional, merely expository classes. The aim is to adopt an active didactic methodology in which students participate in their own learning [10]. In this context, information and communication technologies (ICT) become relevant. Their full incorporation into the classroom is one of the most urgent challenges facing the education system [11] if students are to acquire the digital skills demanded by 21st-century society [12]. Recently, virtual and augmented reality (VR and AR, respectively) have also begun to be deployed for their possibilities in education [13-16].
Their use in the field of historical and archaeological heritage, as well as in museums, has become increasingly widespread in recent years due to their well-known capabilities for dissemination. However, the application of these VR experiences in formal educational environments remains limited [17], either because of the cost of the necessary devices or because of the lack of teacher training when implementing these kinds of experiences [18]. Even so, we can find some examples, such as Yildirim et al. [19], who facilitated a VR experience with Primary Education degree students for learning about Islamic culture, or Arias, Egea, and García [20], who presented an immersive virtual reality experience allowing secondary school students to visit the Roman theatre of Cartagena, demonstrating VR's utility in two distinct educational areas.
Nevertheless, in the design of activities of this type, the didactic approach is sometimes lost, turning these active and innovative methodologies into a mere diversion. So much so that some studies question the suitability of certain experiences when it comes to producing coherent and structured learning [21]. Similarly, although our literature review on the subject shows that virtual reality experiences motivate students [16,22], it is necessary to know the type of emotions generated by these immersive experiences [23] and their impact on the teaching-learning process [24].
Therefore, it is essential to implement monitoring tools in our educational practice that allow us to assess from different points of view (physical/psychological effects, usability, learning, emotions...) the real impact that the proposed activities have on students, as we see in Baxter and Hainey [25], Lin and Mawela [26], Liritzis et al. [17], Bazargani et al. [27], Paolanti et al. [28], or Yildirim et al. [19].
Based on these premises, the questionnaire, the validation of which is presented in this article, was developed. Its objective is to evaluate the effectiveness of the VR experience that was created specifically by this research group to use the archaeological heritage of the ancient city of Augusta Emerita (Mérida, Spain) as a didactic resource for higher education students. For this purpose, a 3D reconstruction of the Aeneas group, a sculptural group from Roman times whose remains are currently in the National Museum of Roman Art, was used. The 3D Co-ViM research group of the University of Extremadura developed a virtual reality experience based on the digitization of its pieces [29] and the space in which this sculptural group was originally located. It consists of a visit to a virtual museum in which "visitors" can move through its rooms and interact. The first room contains the fragments still preserved of the various statues that made up the group (Aeneas, the main character, a mythical hero who flees Troy accompanied by his father Anchises, who is carried on his shoulders, and his son Ascanius, who is held by his hand). In the second room, it is possible to see its full-size reconstruction and to perceive the sensation of the real height of the group (more than 5 m) when standing next to it. The visit ends with a tour around the place where the three statues would originally have been located, a Roman space in the city of Mérida itself, the so-called "marble forum" or forum adiectum, currently not accessible to the public, to contemplate the Aeneas group displayed on what would have been its pedestal [30].
To present the work done, the rest of the article is structured as follows: In Section 2, both the instrument created for data collection, in the form of a questionnaire, and the methodology to be followed to evaluate its reliability and validity through various qualitative and quantitative tests are presented. Subsequently, the results obtained are analyzed to establish the relevance of the questionnaire. The following section will present a discussion of the existing literature on the subject, and some conclusions will be drawn.

Materials and Methods
Taking into account the context described above, there is a clear need to develop a series of instruments to measure the effects of the use of virtual reality technology applied in teaching-learning processes of competencies related to history and artistic heritage. In response to this need, a tool has been designed and subjected to a validation process consisting of a questionnaire prior to and a questionnaire after the virtual reality experience briefly described above. The parameters for the design and validation of the questionnaire are described below, following the lines of work set by Flores-Camacho et al. [31] or Pérez-Escoda et al. [32].

Design
The questionnaire has been designed to collect information on three fundamental aspects of the didactic use of virtual reality for the teaching of history: usability of the VR application used, learning of content related to Roman art acquired through the use of VR, and emotions experienced by the participants. The sociodemographic information necessary for the study was collected beforehand. The design of the tool followed the steps described in Figure 1.
The tool was developed ad hoc, based on some previous work that has been used for the analysis of virtual reality experiences applied to the teaching-learning processes of history and historical heritage [33-35]. Looking at each of the three main blocks of information collected through this tool, the following references have been particularly considered:
In relation to the usability and the degree of acceptance of the developed VR experience, this research builds upon a series of previous studies dedicated to the analysis of applications developed to support the learning process in different educational areas [19,36-40].
Regarding the learning of history and art skills through the use of VR, key studies include those of Arias Ferrer and collaborators [41], Cózar Gutiérrez and collaborators [42], Velasteguí López [14], and Zapatero [15].These studies delve into the importance of the use of VR technologies for the teaching of historical and heritage skills.
Taking all these previous studies as references, we have developed the data collection instrument presented here, which aims to measure the usefulness of a didactic experience that includes virtual reality for the teaching of historical heritage in the three areas mentioned above: usability of the VR tool, learning performance, and emotional performance.
This process has produced a questionnaire composed of four main blocks that are presented below.
Table 1 shows the sociodemographic information based on the description of questions and response variables. Below are the items linked to each of the blocks of information that are collected through the tool designed. The responses to these items have been set on a Likert scale of 1-5, with 1 being "I strongly disagree" and 5 "I strongly agree". Table 2 presents the items related to the usability of the virtual reality tool used throughout the experience. Table 3 shows the items through which the learning of historical competencies related to the virtual reality experience has been assessed.

Validation Process
Throughout the process, the tool underwent three main phases of validation:
• Analysis of the validity of the content through expert judgments. This phase validates the content on which information is to be obtained through the tool [48-50].
• Analysis of the applicability of the tool by students [51,52].
• Statistical tests to verify the reliability and internal consistency of the construct, such as Cronbach's alpha [53,54], KMO, and factor analysis [55,56].

Sample for the Three Validation Phases
In each of the phases described above, a sample that allows the objectives to be met was used.
Regarding the analysis of the validity of the content through expert judgments, a group of 8 expert judges was selected, in accordance with the needs of the tool designed, and seeking compliance with at least 3 of the following criteria:
• They teach in the educational stage in which the tool is going to be applied, i.e., at the university stage or at the stage immediately prior to that.
• They teach in the areas of "Didactics of Social Sciences, Language and Literature", or "Systems Engineering and Automation".
• They have collaborated in scientific publications in the educational field related to the teaching of the areas described above.
• They have carried out didactic experiences linked to the use of new technologies applied to teaching and learning.
In relation to these parameters, the group of expert judges who analyzed the tool was configured as presented in Table 5.
In relation to the testing phase by students, the selected sample was composed of four students belonging to the group with which the tool was tested, fulfilling the following criteria:
• They are Early Childhood Education degree students.
• They have used the designed tool.
• They have been selected by a non-probabilistic procedure.
Finally, in relation to the application of statistical tests to check the reliability and internal consistency of the construct, the tool was applied to a sample of 136 students, selected for convenience in a non-probabilistic way, following these criteria:
• They are in the third year of the degree in Early Childhood Education.
• They are studying the subject of Didactics in Social Sciences.
The sample selection and data collection procedure followed the channels prescribed by the ethics committee of the University of Extremadura: participants were informed of the purposes of the research, informed consent was used, and the information collected was treated anonymously. This process was approved by the ethics committee of this university, with registration number 56//2023. In addition, a pilot test experience was carried out with 51 teachers in training. The results of this test can be found in Corrales et al. [57].

Reliability
To verify the reliability of the instrument, Cronbach's alpha coefficient has been used, which measures the internal consistency of the items.
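As an illustration of what this coefficient computes, the following is a minimal sketch in Python using only the standard library. It is not the SPSS procedure used in the study; the function name `cronbach_alpha` and the small response matrix are hypothetical and serve only to show the formula α = k/(k−1) · (1 − Σ item variances / variance of total scores).

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances (n - 1 denominator) for both the items and the totals.
    """
    k = len(responses[0])  # number of items
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    items = list(zip(*responses))                      # one tuple per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical Likert (1-5) answers: 5 respondents x 3 items.
sample = [
    [4, 5, 4],
    [3, 4, 3],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]
print(round(cronbach_alpha(sample), 3))
```

Values closer to 1 indicate that the items vary together, which is what the coefficient is meant to capture.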

Construct Validity
To verify the validity of the construct, once the tool was applied to the experimental group, the data were analyzed with the Statistical Package for the Social Sciences (SPSS), version 27, applying the following tests:
• The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy, which examines the relationship between variables and makes it possible to verify whether a factor analysis is feasible.
• Bartlett's test of sphericity, to verify the correlation between variables.
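The two tests above can be sketched from their usual textbook definitions. The code below is a standard-library Python illustration, not the SPSS implementation: the function names are hypothetical, the KMO index is computed as the ratio of squared correlations to squared correlations plus squared partial correlations, and Bartlett's statistic as χ² = −(n − 1 − (2p + 5)/6) · ln|R| with p(p − 1)/2 degrees of freedom. The p-value is left out because it needs a chi-square survival function (for instance `scipy.stats.chi2.sf`, if SciPy is available).

```python
import math


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows` (respondents x items)."""
    n = len(rows)
    cols = list(zip(*rows))
    devs = [[x - sum(c) / n for x in c] for c in cols]
    norms = [math.sqrt(sum(d * d for d in dv)) for dv in devs]
    p = len(cols)
    return [[sum(a * b for a, b in zip(devs[i], devs[j])) / (norms[i] * norms[j])
             for j in range(p)] for i in range(p)]


def inverse_and_det(matrix):
    """Gauss-Jordan elimination with partial pivoting: (inverse, determinant)."""
    p = len(matrix)
    a = [list(row) + [1.0 if i == j else 0.0 for j in range(p)]
         for i, row in enumerate(matrix)]
    det = 1.0
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det                      # a row swap flips the sign
        det *= a[c][c]
        pv = a[c][c]
        a[c] = [x / pv for x in a[c]]
        for r in range(p):
            if r != c:
                f = a[r][c]
                a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return [row[p:] for row in a], det


def kmo(corr):
    """Overall Kaiser-Meyer-Olkin index from a correlation matrix."""
    inv, _ = inverse_and_det(corr)
    r2 = q2 = 0.0
    for i in range(len(corr)):
        for j in range(len(corr)):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                q2 += partial ** 2
    return r2 / (r2 + q2)


def bartlett_sphericity(corr, n):
    """Bartlett's chi-square statistic and degrees of freedom for n respondents."""
    p = len(corr)
    _, det = inverse_and_det(corr)
    chi2 = -(n - 1 - (2 * p + 5) / 6) * math.log(det)
    return chi2, p * (p - 1) // 2


# Hypothetical correlation matrix: three items, all pairwise correlations 0.5.
R = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
print(round(kmo(R), 3), bartlett_sphericity(R, n=100))
```

In practice the correlation matrix would come from `correlation_matrix` applied to the full response table; the fixed matrix here only keeps the example short.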

Expert Judgment
The information collected in the reports of the expert judges, which assesses the content of the items of the questionnaire, is shown in Table 6.In it, the comments of the expert judges in relation to the blocks and items of the questionnaire are ordered, as well as the decisions taken in relation to possible changes in the questionnaire.
To complete the information collected from the expert judges, Aiken's V statistical test has been performed, and the results are presented in Table 7.
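For readers unfamiliar with Aiken's V, the coefficient for one item can be sketched as V = S / (n · (c − 1)), where S is the sum of each judge's rating minus the lowest possible rating, n is the number of judges, and c is the number of rating categories. The Python sketch below is illustrative only: the function name `aiken_v`, the 1-5 rating scale, and the eight ratings are assumptions, since the rating scale used by the judges is not detailed here.

```python
def aiken_v(ratings, lowest=1, highest=5):
    """Aiken's V for one item rated by several judges.

    `lowest` and `highest` bound the rating scale (assumed 1-5 here).
    Returns a value between 0 (no agreement on relevance) and 1 (full agreement).
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < lowest or r > highest for r in ratings):
        raise ValueError("rating outside the scale")
    s = sum(r - lowest for r in ratings)          # distance above the minimum
    return s / (len(ratings) * (highest - lowest))


# Hypothetical ratings from 8 judges for a single item.
example_ratings = [5, 4, 4, 5, 4, 5, 4, 4]
print(aiken_v(example_ratings))
```

An overall index for the questionnaire would then typically be obtained by averaging the per-item values.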

User Testing
In a second phase of the validation process, a test was carried out by four students to assess to what extent the format and language of the tool can be understood by the students for whom it is intended. This sample was selected by a non-probabilistic convenience procedure to meet the characteristics necessary to test the questionnaire: Early Childhood Education degree students of both genders (two men and two women), of the age corresponding to the academic year (21 years), who had no prior knowledge of the experience.
To carry out this process, the selected students were asked to answer the questionnaire and then evaluate it according to the parameters of comprehensibility, adequacy, and approximate response time. The results of this process are shown in Table 8.
Tables 9 and 10 show a very high reliability index after the application of Cronbach's alpha test (α = 0.868) for the total of the questionnaire items (34 items). In order to assess to what extent each of the items affects this reliability index, the possibility of excluding items that reduce reliability was analyzed. The deletion of items QA10 and QE9 raises the reliability index of the construct to α = 0.88.
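The item-exclusion analysis described above corresponds to what is usually called "alpha if item deleted": the coefficient is recomputed once per item with that item left out, and items whose removal raises α are candidates for deletion. The following standard-library Python sketch illustrates the idea; the function names and the small data set, in which the fourth item is deliberately inconsistent with the others, are hypothetical and unrelated to the study's data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for respondents x items scores (sample variances)."""
    k = len(responses[0])
    item_variance_sum = sum(variance(col) for col in zip(*responses))
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


def alpha_if_item_deleted(responses):
    """Return a list with alpha recomputed after dropping each item in turn."""
    k = len(responses[0])
    results = []
    for drop in range(k):
        reduced = [[v for j, v in enumerate(row) if j != drop] for row in responses]
        results.append(cronbach_alpha(reduced))
    return results


# Hypothetical answers: items 0-2 agree with each other, item 3 does not.
answers = [
    [4, 5, 4, 1],
    [3, 4, 3, 5],
    [5, 5, 5, 2],
    [2, 3, 2, 4],
    [4, 4, 5, 3],
]
full_alpha = cronbach_alpha(answers)
per_item = alpha_if_item_deleted(answers)
# Items whose deletion raises alpha above the full-scale value.
candidates = [i for i, a in enumerate(per_item) if a > full_alpha]
print(round(full_alpha, 3), [round(a, 3) for a in per_item], candidates)
```

Whether a flagged item is actually removed remains a judgment call that also depends on its content, as in the decision reported above.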

Factor Analysis
The KMO test yields a result of KMO = 0.869, as shown in Table 11. The KMO index takes values between 0 and 1, and the following scale serves as an interpretation guide: <0.5 is unacceptable; 0.5-0.6, bad; 0.6-0.7, moderate; 0.7-0.8, good; and >0.8, excellent [52,53]. The index obtained, well above the 0.5 threshold, indicates that the sample is adequate for performing factor analysis.
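The interpretation scale can be written as a small lookup, shown here as a hedged Python sketch. The function name `kmo_label` is hypothetical, and treating each lower bound as inclusive (for example, exactly 0.8 counted as "excellent") is an assumption, since the scale as quoted does not say how boundary values are assigned.

```python
def kmo_label(value):
    """Map a KMO index (0-1) to the qualitative scale quoted in the text.

    Assumption: each band includes its lower bound.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("KMO must lie between 0 and 1")
    if value < 0.5:
        return "unacceptable"
    if value < 0.6:
        return "bad"
    if value < 0.7:
        return "moderate"
    if value < 0.8:
        return "good"
    return "excellent"


print(kmo_label(0.869))
```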
On the other hand, the significance of Bartlett's test of sphericity (0.00) allows us to reject the null hypothesis of this test (that the variables analyzed are not correlated in the sample), and therefore it can be affirmed that the different variables are sufficiently related.
The results of the factor analysis are shown in Table 12.

Discussion
The whole process carried out has allowed us to obtain a series of conclusions about the validity of the content of the tool designed, as well as the reliability and validity of the construct.
The process applied for the validation of the questionnaire coincides with similar processes, such as in the study by Roblero [58] to validate a questionnaire on time management among Mexican students, or the study by Flores-Camacho and collaborators [31] to validate a questionnaire on representations in physics teaching.Pérez-Escoda et al. [32] apply in the validation process a review by 11 experts and the measurement of reliability through Cronbach's alpha, in addition to other tests, such as factor analysis.The same validation procedures of instruments have been implemented in the Primary Education stage [59], as well as in secondary education [60].
With regard to the validity of the content, the analysis conducted by the expert judges makes it possible to affirm that the content responds to the design intention of the tool.Bearing in mind that the overall assessment made by the expert judges is positive, and once the proposed corrections have been incorporated, it can be affirmed that the content responds to the research objectives.In this sense, several studies support the method of review by expert judges as a method of validation of tools similar to the one presented [58,61,62].To complete this block, the result of Aiken's V test (0.81) confirms the high degree of agreement of the judges on the relevance of the content of the questionnaire.This content validity index is considered a useful tool for completing the information of the qualitative analysis of the judges' ratings, as some studies state [58,63,64].
On the other hand, the tests of validity and reliability of the construct report a high degree of reliability, as indicated by the index α = 0.88.In this sense, some studies, such as that by Barrios and Cosculluela [65] or Herrán Gascón et al. [66], indicate a range between 0.7 and 0.9-0.95 as the optimal level of reliability.
Regarding construct validity, the result of KMO = 0.869 should be interpreted similarly: it is greater than 0.5, which is the standard commonly used as a reference [67]. Together with the application of Bartlett's sphericity test, this allows us to conclude that the designed questionnaire presents a high degree of validity for the construct.

Conclusions
The main objective of this research was to design and validate an instrument for evaluating the effectiveness of the VR experience that was created expressly by this research group to use the archaeological heritage of Mérida as a didactic resource for higher education students. The design process has been satisfactory, yielding a tool that measures the usability, learning, and emotions of the students participating in the VR experience. The validation process, in turn, has verified the adequacy of the content, and the statistical tests carried out support the reliability and validity of the construct. Among the limitations of this work, it should be mentioned that it was restricted to a pilot test experience. Replication of the experience will be necessary to consolidate the results obtained.

Table 1. Sociodemographic information from the questionnaire. Own elaboration.

Table 2. QU (usability questions) items related to usability. Own elaboration.

Table 3. QL items (learning questions) on the learning achieved in the intervention. Own elaboration.

Table 4 assesses the incidence of emotions during the development of the virtual reality didactic experience, as well as the possible causes that provoked students to experience these emotions.

Table 4. QE items (emotion questions) about the emotions experienced in the didactic intervention. Own elaboration.

Table 5. Compliance with selection criteria of expert judges for content validation. Own elaboration.

Table 6. Synthesis of proposals by expert judges and action taken. Own elaboration.

Table 7. Aiken's V test results data. Own elaboration.

Table 8. Information collected through student testing. Own elaboration.

Table 9. Application of Cronbach's alpha test on the sample. Own elaboration.

Table 10. Results of α with all the items of the questionnaire, and after the elimination of QA10 and QE9. Own elaboration.

Table 11. Result of KMO and Bartlett. Own elaboration.

Table 12. Results of factor analysis. Own elaboration.