Virtual Environments and Augmented Reality Applied to Heritage Education. An Evaluative Study

Abstract: Technological advancements have provided heritage with new learning environments through the use of virtual and augmented reality, which can foster the accessibility and understanding of culture and propose new ways of interacting with heritage. In this study, a systematic evaluation is carried out of n = 197 heritage education programs listed in the database of the Observatorio de Educación Patrimonial en España (OEPE) (the Spanish Heritage Education Observatory, SHEO) which, in their descriptions, integrate the use of virtual environments and/or augmented reality to promote learning on the part of the user. The objectives of this study are: (1) to analyse the state of the art, (2) to evaluate the quality of their educational designs via the "analysis and assessment sequential method for heritage education programs" (SAEPEP-OEPE) and (3) to identify variables which can be improved or which have a significant influence on the quality of the programs. Highlights of the results include: (a) the increasing implementation of these technologies in heritage education programs, with a greater presence of virtual resources than of learning environments, (b) the low level of educational quality attained by their designs, particularly in their assessment, and (c) the finding that the inclusion of advanced technologies slightly decreases the specificity of the educational design.
Author Contributions: Conceptualization, C.J.G.-C.; Funding acquisition, A.I.-E. and C.J.G.-C.; Investigation, O.F.; Methodology, S.G.-C.; Project O.F.; O.F.; Supervision, A.I.-E.;


This Study
In order to identify what work is being carried out on these new relationships between heritage sites, digital applications (particularly AR and VR and virtual environments) and education processes, we evaluated a selection of cultural heritage education programs which specify the use of virtual environments or virtual or augmented reality in their educational design. The main aim of this study was to analyse the educational components of heritage education programs which make use of these emerging technologies.
The relationship between heritage education and technology is gaining presence in research in Spain [30]. Led by the Observatorio de Educación Patrimonial en España (the Spanish Heritage Education Observatory, SHEO), a large number of educational research projects have been carried out in recent years in heritage contexts with Information and Communication Technology (ICT) resources. Particularly worthy of note are the analyses of the use of digital technologies in the areas of archaeology and heritage education [31], the use of ICT and contemporary heritagization processes [32], and the most recent projects regarding the study of apps [33][34][35], which follow in the footsteps of the initial projects carried out in Great Britain [23]. Over the past few years, studies on virtual heritage [36], Big Data [37] and heritage, ICT, and inclusion [38] have been carried out in other parts of the world.
Although previous studies have evaluated heritage education programs, none has examined whether introducing these types of devices into the educational design alters its planning, or whether the devices are already fully integrated or still in the process of being implemented. The initial hypothesis of this research is that we are facing an emerging process in which a growing number of heritage education programs integrate virtual environments, AR and VR without having a quality didactic or evaluative structure.
Therefore, the specific objectives of this study are:
SO1: To analyse the typology of heritage education programs which use VR, AR, or virtual environments.
SO2: To evaluate the quality of the educational design of heritage education programs which specify the use of virtual environments, AR and VR in their planning.
SO3: To analyse the correlations between the quality standards of these heritage education programs.
SO4: To differentiate the quality of the programs according to whether they are publicly or privately owned, the specificity of the program and the category of heritage in which these digital resources are implemented.

Materials and Methods
The approach of this study is based on the Analysis and Assessment Sequential Method for Heritage Education Programs (SAEPEP-OEPE) [39] in its most advanced phase of implementation, corresponding to the evaluation of programs based on basic standards via the application of the Q-Edutage [40] scale. This scale evaluates the level of educational quality of heritage education programs with proof of validity and reliability for the implementation of educational practices based on the design of effective actions.

Sample
The study sample was extracted from the database of the Spanish Heritage Education Observatory (SHEO). This institution currently holds n = 2156 educational programs relating to heritage. A search of these programs was carried out based on the descriptors "virtual environment", "VR", and "AR", which resulted in a valid sample of n = 197 programs for this study.
This sample was, in turn, analysed in two successive phases: first as a whole, and then through a comparative analysis of two subsamples. The subsamples were obtained by discriminating the programs according to how the descriptors of interest appear in them: in a partial way (the technologies are only named) or in a specific way (their use, objective or manner of implementation is described). These subsamples comprise n = 113 and n = 84 programs, respectively.
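As a quick arithmetic check of the sample figures quoted above, the following Python sketch confirms that the two subsamples partition the sample and expresses each group as a percentage. The constant and function names are illustrative only; the percentages are derived here and are not reported in the study.

```python
# Sample sizes quoted in the text.
TOTAL_PROGRAMS = 2156   # programs held in the SHEO database
SAMPLE = 197            # programs matching the search descriptors
PARTIAL = 113           # programs in which the technologies are only named
SPECIFIC = 84           # programs describing use, objective or implementation


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# The two subsamples should add up to the whole sample.
subsamples_partition_sample = (PARTIAL + SPECIFIC == SAMPLE)

sample_share = share(SAMPLE, TOTAL_PROGRAMS)   # sample as a share of the database
partial_share = share(PARTIAL, SAMPLE)         # partial subsample within the sample
specific_share = share(SPECIFIC, SAMPLE)       # specific subsample within the sample
print(subsamples_partition_sample, sample_share, partial_share, specific_share)
```

In other words, under these figures the sample represents roughly one in eleven of the programs in the database, and somewhat more than half of the sample only names the technologies.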

Data Collection Tool
The Q-Edutage scale is an assessment tool which forms part of the previously mentioned SAEPEP-OEPE method. It was designed through a review of the literature on quality indicators of heritage education programs and on the methodological criteria derived from the Plan Nacional de Educación y Patrimonio (National Education and Heritage Plan) [41], following the assessment model proposed by Stake [42]. The scale has previously been validated via expert judgement [43] by a panel of 17 members from 9 knowledge areas, who analysed its empirical and content validity [44] in terms of its coherence, relevance and congruence with regard to the object of evaluation on a 4-point scale, and it has been calibrated via item response theory (IRT) on an assessment of 330 programs [40].
The scale consists of 14 items resulting from this procedure, which respond to various facets and aspects of the quality of the programs. This tool possesses a high degree of reliability and discrimination between the 4 levels of the variables [45], which makes it possible to adequately identify the quality of the programs included here. It is a brief and precise tool which does not present bias, regardless of the assessor applying it.
The 14 items which make up the scale [45] make it possible to accurately discriminate four levels of quality via a Likert scale ("is not achieved", "is achieved with conditions", "is achieved", and "is achieved with quality"), which contributes towards detecting outstanding programs and optimising the rigour of the assessment and the planning of programs. For its correct use, an assessment rubric is used which helps to determine the scoring of the items (this can be requested from the first author). The indicators (see Table 1) respond, on the one hand, to aspects of identification of the program: the contact and the descriptors which define it; the holistic conception of the heritage (whether the heritage is presented in a complete and multiple way in all of its categories); the specification of the typology of the project carried out; the description of its bases or principles; the audience to which it is directed; and whether it has annexes. On the other hand, they respond to the key elements of an educational plan: justification, objectives, contents, methodology, resources, assessment and repercussion. The Cronbach's alpha (Table 2) and Guttman (Table 3) statistics show that the questionnaire is suitably reliable for analysis. The criterion agreed upon by different authors is that a Cronbach's alpha value of between 0.70 and 0.90 indicates good internal consistency for a one-dimensional scale [46,47].
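The reliability statistic mentioned above can be illustrated with a minimal Python sketch of the standard Cronbach's alpha formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores). The function name and the toy scores are invented for illustration; they are not the study's data, and the study itself used XLSTAT rather than this code.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of rows, one row of item scores per program.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(rows[0])                      # number of items
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two rows")
    items = list(zip(*rows))              # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Invented toy data: 5 programs scored 1-4 on 3 items (NOT from the study).
toy_scores = [
    [3, 3, 2],
    [4, 3, 3],
    [2, 2, 1],
    [1, 2, 1],
    [3, 4, 3],
]
alpha = cronbach_alpha(toy_scores)
print(round(alpha, 3))
```

The sketch uses sample variances throughout; since the same denominator appears in both the item and total variances, using population variances instead would give the same alpha.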

Data Analysis
As far as the procedure for data analysis is concerned, the sample and resulting subsamples were assigned to three expert assessors familiar with the assessment rubric of the items. In a previous stage, the programs had gone through a registration sequence consisting of searching for and locating programs, a discrimination phase for their inclusion in the database, and a later inventory via a tool with 42 fields to be completed. The data were analysed with XLSTAT v.2019.3.2, beginning with descriptive statistics of the items which make up the scale. After this, Pearson correlation tests were performed, along with non-parametric tests (Mann-Whitney U and Kruskal-Wallis) on the items according to the variables "type of entity" and "type of project" and to the specificity of the project. These tests enabled us to check whether there are statistically significant differences in the result variables according to the identification variables of the project.

Typology of Programs
This is a heterogeneous sample originating from public (n = 144; 73.10%) and private entities (n = 53; 26.90%), implemented internationally, which tackles different categories of heritage (see Table 4) and encompasses 17 typologies of educational design (see Table 5). All of the programs which make up the sample deal, in part or totally, with different categories of heritage education and also reflect the use of virtual environments, VR and AR.
Table 4 (partial).
No. | Category of heritage | n
 | Natural and cultural heritage | 14
6 | Monuments: Architectonic works | 10
7 | Monuments: Archaeology | 9
5 | Archaeological sites | 7
3 | Places: created by man | 6
9 | Digital heritage | 5
8 | Monuments: Pictorial works | 2
10 | Monuments: Cave art | 1
11 | Ensembles: Isolated constructions | 1
12 | Monuments: Group of elements of importance | 1
13 | Cultural and natural | 1
In relation to the typology of the educational designs which make up the sample, a greater frequency is detected for those which are defined as educational projects (21.31%), programs (18.27%), and educational designs (12.18%). This is a common denominator in all of the studies extracted from the observatory's database, independently of the descriptors which define the search, as these typologies make up a high percentage of the total sample. However, the appearance of educational resources (12.18%) among the three most frequent typologies is worthy of note. This detail is a differential indicator, because this typology does not appear in all cases. Rather, it depends on the descriptors of the search which define the sample. With the aim of contrasting these results with previous studies [48,49], a statistical graph was created (see Figure 1) which allows us to obtain the trends for each typology of program.
In this case, a slight increase in frequency can be observed in comparison with other studies in the cases of educational resource and educational tool, both of which are teaching aids to facilitate the teaching and learning processes in educational activities and, furthermore, should ensure interaction on the part of the learner [10]. Therefore, it seems clear that using VR and AR in an attempt to bring heritage environments closer to the classroom in order to achieve existential experiences is associated with an increase in these typologies. Also worthy of note is the increase in research projects which elaborate on issues of information and communication technology (ICT) and technology for learning and knowledge (TLK) in response to the inclusion of VR and AR in learning processes. Finally, the absence of improvement projects must be highlighted: as with research projects, VR and AR should be one of their main objectives for advancing and improving the quality of education, as well as for going more deeply into programs which respond to the descriptors of inclusion and attention to diversity.

Assessment of the Quality of the Programs
In general, the item-by-item assessment carried out via the Q-Edutage scale yields very low scores for heritage education programs which use virtual environments, VR, and AR (see Table 6). Only two items scored above an average of three points, and only one item achieved an average score of between 2.5 and three points. Eleven of the fourteen items received average scores below 2.5 and a median of two or one (in other words, they did not achieve a sufficient level of quality, or only did so with conditions).
Table 6. Statistics of the items which make up the Q-Edutage scale. Number of observations per item: 197.
The three items with the highest scores are 1, 2, and 4 (see Figure 2). These items are mainly related to the identification of the program and the contact information. As can be seen in panels a (i01 contact), b (i02 descriptors) and c (i04 typology), the interquartile range (Q1-Q3) is situated between three and four points in Items 1 and 4 (a, c), and between two and three points in Item 2 (b). Therefore, the majority of the programs reached the level of quality required for these items of identification.
Figure 2. The most notable box plots: contact, descriptors, and typology.
The three items with the lowest scores were 3, 7, and 13: holistic conception of heritage, annexed documents, and determination of assessment systems (see Figure 3). As can be seen in panels a (i03 heritage), b (i07 annexes), and c (i13 assessment), the interquartile ranges of the scores for these programs are situated between one and two points in the case of the items on annexed documentation and assessment systems, and between one and three points in the case of the holistic conception of heritage. In other words, a large proportion of these programs did not fulfil the quality criteria, or only did so with conditions.
It is also important to point out that all of the items on the educational design of the program (objectives, contents, methodology, materials and resources, and evaluation) have an average score of between two and 2.5 and a median of two or one. In other words, all of these items are reached with conditions, which implies a very superficial design of the programs. In the assessment carried out of the programs which use virtual environments, VR, and AR, very few fulfilled the criteria for quality required in the educational design.
Figure 3. The most insufficient box plots: heritage, annexes, and assessment.
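The descriptive statistics discussed in this section (average, median and interquartile range per item) can be sketched in Python as follows. The level labels are those listed for the Q-Edutage scale in the text; the function name, the choice of the "inclusive" quartile method and the toy scores are assumptions made for illustration and do not reproduce the study's data or its XLSTAT settings.

```python
from statistics import mean, median, quantiles

# Level labels of the Q-Edutage scale as listed in the text (scores 1 to 4).
LEVELS = {
    1: "is not achieved",
    2: "is achieved with conditions",
    3: "is achieved",
    4: "is achieved with quality",
}


def describe_item(scores):
    """Return the mean, median, Q1 and Q3 of one item's scores."""
    # "inclusive" treats the data as the whole population of interest;
    # this choice is an assumption, other quartile conventions exist.
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return {"mean": mean(scores), "median": median(scores), "q1": q1, "q3": q3}


# Invented scores for one item across eight programs (NOT study data).
toy_item = [1, 1, 2, 2, 2, 3, 3, 4]
summary = describe_item(toy_item)

# The toy median is a whole number, so it maps directly onto a level label.
median_level = LEVELS[int(summary["median"])]
print(summary, median_level)
```

Read this way, a median of two corresponds to the level "is achieved with conditions", which is how the low medians reported above should be interpreted.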

Correlations in the Assessment of the Quality of the Programs
The table of Pearson correlations (see Tables 7 and 8) shows moderate to high correlation coefficients among all of the items, with the exception of Item 1, which has low correlations (between 0.17 and 0.25) with the other items. The rest maintain statistically significant coefficients, as can be seen in the table of p-values. Thus, 87 of the 89 correlations have values of p < 0.05.
Table 7. Pearson correlation coefficients.
The highest correlations are among items related to the implementation of the educational design of heritage education programs (Items 9-13): the implementation of educational objectives, contents, methodology, resources, and evaluation and impact of the proposal. The correlations among these items lie between 0.50 and 0.67 and all of them have p-values < 0.01. As can be seen in the image of the correlation matrix (see Figure 4), the greatest correlations arise among these items (light green). It can also be observed that Items 1 and 4 are those which present the lowest correlation coefficients with the rest (these are items of identification). These data show that the items of the educational structure (Items 9-13) are strongly and positively correlated, which means that, regardless of the degree to which they are made explicit, they are related to one another and the parts of the design form a coherent unit.
Appl. Sci. 2019, 9, x FOR PEER REVIEW
Figure 4. Image of the correlation matrix.
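To illustrate how the significance of a Pearson coefficient depends on the sample size, the following Python sketch computes the coefficient and an approximate two-sided p-value. It assumes the usual t transformation with n - 2 degrees of freedom and, as a simplification that is not the authors' procedure, replaces the t distribution with a standard normal distribution, which is a reasonable approximation for n = 197. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def approx_p_value(r, n):
    """Return (t, p) for H0: no correlation.

    t = r * sqrt((n - 2) / (1 - r^2)); the two-sided p-value is taken from
    a normal approximation to the t distribution (an assumption that is
    only adequate for large n).
    """
    t = r * sqrt((n - 2) / (1 - r * r))
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p


# Coefficients at the ends of the ranges quoted in the text, with n = 197.
t_low, p_low = approx_p_value(0.17, 197)
t_high, p_high = approx_p_value(0.50, 197)
print(round(t_low, 2), round(t_high, 2))
```

The sketch shows that, in a sample of this size, even a modest coefficient yields a sizeable t statistic, so low correlations can still be statistically significant while remaining weak in practical terms.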

Hypothesis Testing
The following sections detail the hypothesis tests used to determine whether the quality of the programs differs according to the type of institution and according to the specificity of the program, that is, whether virtual environments, VR, and AR are integrated into its conception in a general or a specific way.

Public/Private
For hypothesis testing, items on the educational design of the program were selected as, on the whole, they present a similar average (under 2.5) and high correlation coefficients. This demonstrates a high consistency between the elements which define the educational structure of the program as units of information which are designed and systematically recorded and organised [50].
In order to determine whether the publicly (73%) and privately (27%) funded programs differ, the items relating to the educational design were compared. As can be seen in Table 9, no large differences were found. The items with the largest differences concern the description of the objectives and the systems or tools for evaluation, the quality of which is slightly lower in the privately funded programs. One reason for this may be the need for transparency in projects which belong to public institutions, as a consequence of which the publicly funded programs include a larger quantity of information. However, as can be observed, the average differences are not high. The Mann-Whitney U test (see Figure 5) shows that this difference is not statistically significant in the items which define the educational design of the program (p > 0.05 in all of the items).
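The Mann-Whitney U comparison described above can be sketched in pure Python as follows. This is an illustrative implementation using average ranks for ties and a tie-corrected normal approximation without continuity correction; it is an assumption that this matches the options used in XLSTAT, and the function names and toy scores are invented rather than taken from the study.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1              # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U of sample a, two-sided p) via a normal approximation."""
    n1, n2 = len(a), len(b)
    n = n1 + n2
    pooled = list(a) + list(b)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Tie correction: sum of t^3 - t over each group of tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    ties = sum(t ** 3 - t for t in counts.values())
    sd = sqrt(n1 * n2 / 12 * ((n + 1) - ties / (n * (n - 1))))
    if sd == 0:                            # every value identical
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / sd
    return u1, 2 * (1 - NormalDist().cdf(abs(z)))


# Invented 1-4 scores for one item (NOT study data).
toy_public = [2, 3, 2, 1, 3, 2, 4, 2]
toy_private = [2, 2, 1, 3, 2, 1]
u_stat, p_value = mann_whitney_u(toy_public, toy_private)
print(u_stat, round(p_value, 3))
```

With four-point Likert scores almost every value is tied, which is why the tie correction matters here; for very small groups an exact test would be preferable to the normal approximation.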

Specificity of the Program
With regard to the specificity of the programs, the differences in averages between the subsample of programs which only partially reflect the inclusion of these technologies (n = 113) and the subsample of those which do so specifically (n = 84) are notable only in the presentation of contents (i10) and in the methodological orientation and strategies of teaching and learning (i11), as seen in Table 10. In these items, the programs of a general nature are assessed better than those which are specific. We believe that this is because the programs which more clearly or more specifically show the use of virtual environments, virtual reality, or augmented reality place the focus of their attention on the product itself, which is generally an application included in the typologies of resource or educational tool, and pay less attention to methodological description or to content. It seems that, in these programs, the focus on technology comes to the detriment of educational specificity. This specificity is achieved more satisfactorily when the use of these technologies is only named as part of the development of the program, not as the main element. The Mann-Whitney U test (see Figure 6) shows that the average differences of Items 10 and 11 regarding the specificity of the program are statistically significant (p < 0.01 in the case of Item 10 and p = 0.01 in the case of Item 11).
As can be observed in Figures 7 and 8, the interquartile range of Items 10 and 11 in the general programs is located between two and three points; in other words, the degree of quality expected "is achieved" or "is achieved with conditions" in the case of programs that superficially integrate the tools. However, this range in the specific programs, those that deal only with the tool as the main element with its more technical characteristics, is situated between one and three points, which implies that a large number of programs in the latter case do not achieve the quality standards. Therefore, there is a higher level of educational specificity in general programs that give priority to educational design, in which the tool is only a resource of the program and appears on a secondary level.
It should be mentioned that educational quality was not achieved in any of the cases. The averages show that all of the items were achieved with conditions, with the highest scores being achieved by the description of the objectives (i09), the contents (i10), and the methodological orientation (i11). A level of quality was not achieved for the determination of systems or assessment tools (i13). The definition of a quality design should imply a greater degree of specificity in all aspects.

Type of Project
As in the previous case, the descriptive data of the averages only show notable differences in Items 10 and 11 according to the variable "type of project". Although the sample is not distributed homogeneously across typologies, given that a significant part of it consists of programs (1), projects (2), educational designs (3), or resources (4), there were notable differences in these items according to typology (see Table 11). The Kruskal-Wallis test (see Figure 9) confirms that the differences in Items 10 and 11 according to the type of project are statistically significant (p = 0.01 in the case of Item 10 and p < 0.01 in the case of Item 11).
First of all, we analysed Item 10, referring to the presentation of contents, which has its highest value in Typology 3 (educational design). This result confirms one of the initial hypotheses, given that the educational design is defined as a document which provides structure to an educational intervention, specifying objectives, procedures, and contents. Therefore, this typology should presumably have the highest score. However, this level is not reached by the other typologies which organise an educational process, bring together a set of systematically planned activities or are aimed at achieving defined objectives, namely the educational program and the educational project. With the exception of the educational design, an average of three was only reached for Typology 15, which corresponds to scientific events. In this type of program, the contents addressed are necessary to transmit the lines of interest to attendees and participants. It must also be pointed out that the most pronounced standard deviations arise in Typology 5 (educational tool) and Typology 14 (networks).
As far as the educational tool is concerned, the specificity is extremely variable, given that it is a tool which facilitates or makes possible an item of learning or its teaching and that, on occasions, it is only defined because it can be applied to different contents according to teachers' needs. Networks, on the other hand, are proposed from different areas (Networks of Schools, Networks of Museums, Virtual Networks, etc.), each of which establishes its bases in very different ways. Therefore, the information provided may vary to a greater or lesser degree depending on each case.
Secondly, we analysed Item 11, referring to methodological orientation. Again, the typology of educational design stands out as far as the average score is concerned, for the same reasons described above. Apart from the educational design, no other typology achieved an average score of three. Indeed, only the typology of educational program achieved an average score of more than 2.5, in keeping with its definition as a document which makes it possible to detail and organise an educational process. This type of program generally arises in informal contexts. Therefore, it should integrate the methodology and strategies of teaching and learning and also foresee and plan its continuity. In this case, we were surprised to find that the largest standard deviation is found in the typology of educational project. The most evident explanation for this is that this type of planning responds to short- to medium-term programs which are characterised by a projective orientation towards the achievement of research and development objectives. The most pronounced aspect of their description is therefore the objectives, to which the description of contents or of methodological orientation is in some cases subordinated, leading to a great dissimilarity in quality.
As can be observed in Figures 10 and 11, the highest interquartile ranges in the presentation of contents are attained by educational designs, courses and scientific events. On the other hand, the lowest interquartile ranges are found in educational project, educational resource, educational tool and networks. In the case of the assessment of the methodology presented by the program, the highest interquartile range is reached in educational design, whereas the lowest ranges are found in courses, networks and isolated activities.
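The interquartile ranges compared above can be obtained per typology as the difference between the third and first quartiles. The sketch below uses Python's standard `statistics.quantiles` function; the helper name `interquartile_range` and the scores are hypothetical and do not reproduce the data behind Figures 10 and 11. Note that different quartile conventions (here the "inclusive" method) can give slightly different values for small samples.

```python
from statistics import quantiles


def interquartile_range(scores):
    """Return Q3 - Q1 using quartiles interpolated within the observed data."""
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    return q3 - q1


# Hypothetical Item 10 scores per typology (not taken from the study).
scores_by_typology = {
    "educational design": [1, 2, 3, 4, 4, 2],
    "networks": [2, 2, 2, 3, 2, 2],
}
iqr_by_typology = {
    name: interquartile_range(scores) for name, scores in scores_by_typology.items()
}

# List typologies from the widest to the narrowest spread of scores.
for name, iqr in sorted(iqr_by_typology.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: IQR = {iqr:.2f}")
```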

Discussion and Conclusions
After analysing all of the variables of the study and applying the Q-Edutage scale, the results underline the fact that there is coherence between the items which make up the educational structure of the programs [50,51], although their quality is average or low. A large proportion of the programs assessed do not achieve the level of quality required in terms of their educational design, or only do so with conditions. The results are an improvement in comparison with previous studies [48,49,52] and, given that the programs which make up the sample correspond to the most up-to-date designs, this can be interpreted as a gradual improvement in the conceptualisation of educational proposals. However, the overall focus of the heritage [53–55] and the internal assessment of the programs continue to be in need of improvement, as has already been shown by previous studies [48,49,52,56–58].
In the preparation stage of this study, one of the initial hypotheses suggested that all of the programs containing the search descriptors would take place within a virtual learning environment (VLE), meaning that all of them would allow for educational interactions on the internet. However, after analysing the descriptions of the programs which specifically reflect the use of virtual environments, virtual resources, and/or AR, it can be concluded that only 44% of them contained virtual contexts, platforms, environments, spaces, or communities in their description. Of these, the majority describe only the learning of the user, and only some point towards exchange, digital twinning, and the creation of networks as learning strategies. The remaining 56% do not correspond to educational proposals developed in virtual environments. Rather, they allude to the use of digital educational resources and tools integrated within them.
In this second group, the incorporation of virtual museums, exhibitions, and rooms (24%) stands out, along with the development of virtual itineraries (virtual route/walk/tour/trip are the terms recorded) (13%), the use of virtual reality (9.5%), virtual visits to digitalised environments (7.14%) and, to a lesser extent, the use of virtual guides, replicas, recreations, reconstructions, the digitalisation of monuments and 3D objects, virtual educational material, apps and virtual games, and interactive virtual tools.
As far as the term AR is concerned, its use is only explicit in a minority of the programs (3.57%). However, due to their similarities, in many cases the terms AR and VR are used as synonyms. This implies that these concepts are not clear to those who design the proposals, given that "augmented reality" (AR) enables users to visualise elements or spaces in the real world via a digital device [17], whereas "virtual reality" (VR) transports the user to a projected context which may be immersive [14] via physical accessories such as glasses or headphones. In turn, we consider that this conflation of terms may extend to society at large, as this is an emerging technology [59,60].
After analysing the possibilities that the projects show with the implementation of virtual technologies, certain aspects stand out: the elimination of barriers of space and time, easier exchange of experiences among users, cooperation and collaboration among agents, the opening of new paths for transmission and communication, an increase in motivation, the development of attitudes of awareness and respect towards the role of heritage, the appearance of new ways of approaching heritage and educational experiences, and new forms of interaction. Last of all, one aspect which is repeated in many cases must be highlighted: the inclusion that new technologies provide by promoting the understanding of cultural assets and global accessibility to heritage environments. In contrast, this is barely reflected in the "accessibility" variable, which is only explicit in 7.6% of cases. The presence of adapted material and human resources in the programs is almost non-existent. This is an aspect in urgent need of improvement, both in the development of the programs and in their educational design, and it is an indicator which had already been detected and which does not show any progress.
From this analysis, certain unavoidable necessities can also be extracted which authorities and institutions must bear in mind if they are to play an active role in the management, support, and financing of these applications and to promote their implementation for the benefit of teaching. They should: (a) develop policies for the incorporation, application, and development of virtual environments in cultural heritage; (b) promote research initiatives and projects on the effects of the use of virtual environments, VR, and AR in the dissemination of cultural heritage; and (c) boost research on models, interactive designs, and virtual prototypes which make knowledge of heritage accessible to a wide spectrum of the population.
Finally, it should be pointed out that some of the conclusions included here were drawn from variables which were not dealt with in detail during this study, but which were observed during the prior analysis of the programs and were deemed relevant. Among them, we can highlight the audiences at which the majority of the programs are aimed, given that the implementation of these technologies is closely related to digital natives. The largest group of target users corresponds to the stages of compulsory education, particularly young people of between 14 and 20 years of age: compulsory secondary education and baccalaureate (25.9%), primary education (19.8%), and "all audiences", including those mentioned previously (25.9%). This high percentage is a clear indicator of the educational context in which these virtual learning environments and technological resources are developed. Programs which do not provide data on the community, population, or audience at which the project is aimed (16.7%) were excluded from the sample, and the remaining percentage refers to specific audiences of different kinds (11.7%).
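As a quick consistency check on the audience figures reported above, the following sketch tallies the percentages given in the text. The category labels are shorthand introduced here for illustration; only the numbers come from the text.

```python
# Audience shares as reported in the text (percentages of programs).
audience_shares = {
    "compulsory secondary education and baccalaureate": 25.9,
    "primary education": 19.8,
    "all audiences": 25.9,
    "audience not specified": 16.7,
    "other specific audiences": 11.7,
}

# Round to one decimal place to absorb floating-point noise in the sum.
total = round(sum(audience_shares.values()), 1)
school_stages = round(
    audience_shares["compulsory secondary education and baccalaureate"]
    + audience_shares["primary education"],
    1,
)

print(f"Sum of all reported shares: {total}%")
print(f"Programs aimed explicitly at school stages: {school_stages}%")
```

The five reported categories account for the whole set of programs, and the two school-stage categories together cover 45.7% of them.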
The final conclusion to be drawn concerns the category of heritage to which the programs refer (see Table 4). In this regard, according to their frequencies, 56.3% of the programs are aimed at cultural heritage. The majority of the programs deal with historical, cultural, and artistic heritage. This is a common denominator across the different assessment studies, and its main cause can be considered to be that a large number of institutions use the term cultural heritage generically as a container for different categories and kinds of heritage.
To conclude, following the analysis and the review of the literature, it can be stated that these cutting-edge technologies are extremely costly to perfect and implement. This is currently an embryonic field whose funds are mainly assigned to the documentation, reconstruction, restoration, and dissemination of heritage [61]. In the field of education, there are many more steps to take, but this research has enabled us to define the state of the art, to identify variables, to assess the quality of the educational proposals which currently exist, to extract the main advantages of the use of virtual environments (VR and AR), and to identify key needs with a view to educational improvement.