The Influence of Collaborative and Multi-Modal Mixed Reality: Cultural Learning in Virtual Heritage

Abstract: Studies in the virtual heritage (VH) domain identify collaboration (social interaction), engagement, and a contextual relationship as key elements of interaction design that influence users' experience and cultural learning in VH applications. The purpose of this study is to validate whether collaboration (social interaction), engaging experience, and a contextual relationship enhance cultural learning in a collaborative and multi-modal mixed reality (MR) heritage environment. To this end, we have designed and implemented a cloud-based collaborative and multi-modal MR application aiming at enhancing user experience and cultural learning in museums. A conceptual model was proposed based on collaboration, engagement, and relationship in the context of MR experience. The MR application was then evaluated at the Western Australian Shipwrecks Museum by experts, archaeologists, and curators from the gallery and the Western Australian Museum. A questionnaire, semi-structured interviews, and observation were used to collect data. The results suggest that integrating collaborative and multi-modal interaction methods with MR technology facilitates enhanced cultural learning in VH.


Introduction
The adoption of immersive reality technologies across different domains and application themes, such as architecture, medical practice, engineering, and tourism, has increased recently [1][2][3][4][5][6][7][8]. For instance, Alizadehsalehi, Hadavi [1] review the existing literature, case studies, and applications of immersive reality technology in the Architecture, Engineering, and Construction (AEC) industry and outline a roadmap that promotes the integration of immersive reality technology, cloud computing, digital twins, emerging technologies in IoT, and cognitive computing to solve a variety of construction and management issues in the industry. Similarly, Alizadehsalehi and Yitmen [2] present a framework that integrates digital twins, building information modelling (BIM), and immersive reality technology, aiming at monitoring construction progress.
The tourism industry, digital cultural heritage, and architectural heritage have also benefited from immersive reality. In recent years, studies applied to these domains have demonstrated how the integration of cultural computing, 3D modelling, and immersive reality improves awareness of cultural heritage [4]. Furthermore, studies also show how immersive reality plays a role in reviving the tourism industry from its COVID-19 pandemic-induced economic challenges [5].
Recent studies in the virtual heritage (VH) domain have recognised the importance of collaboration, social interaction, and engagement in exhibiting technologies that museums provide to visitors [9,10]. In this regard, immersive reality technologies are becoming a popular choice to enhance visitors' experience.
Museums are shared spaces, and it is crucial that immersive reality technologies embrace this characteristic. However, not all forms of immersive reality technologies can naturally enable collaboration between visitors. For instance, virtual reality (VR) creates an artificial barrier between visitors and between the real and virtual worlds. In contrast, Augmented Reality (AR) and Mixed Reality (MR) do not create an artificial barrier between visitors because virtual objects are overlaid on top of visitors' views of the real world. Hence, social interaction between visitors and a contextual relationship between visitors and the real world can be maintained.
In this paper, we evaluate a mixed reality application designed and implemented to enhance cultural learning in museums. The mixed reality application (Clouds-based Collaborative and Multi-modal Mixed Reality) attempts to enable collaboration, engagement, and a contextual relationship in mixed reality applications that specifically aim at virtual heritage themes in the context of enhancing cultural learning.
This paper is a continuation of previously published works produced as part of the first author's PhD research project. The publications are summarised and presented in Table 1 to make the reading smoother and establish a connection between the papers. The published works are categorised into four research phases.
Phase one: Exploring the state-of-the-art
• A Survey of Augmented, Virtual, and Mixed Reality for Cultural Heritage [11].
• A Comparison of Immersive Realities and Interaction Methods: Cultural Learning in Virtual Heritage [12].
Phase two: Establishing the conceptual base
• Redefining Mixed Reality: User-Reality-Virtuality and Virtual Heritage Perspectives [13].
• Mixed Reality: A Bridge or a Fusion Between Two Worlds? [14].
This paper:
• Evaluates the clouds-based collaborative and multi-modal mixed reality in the context of cultural learning in virtual heritage.
• Based on the outcome of the evaluation, provides some suggestions to the wider virtual heritage community on the topics of mixed reality, interaction methods, and cultural learning.
The remainder of this paper is structured as follows. Section 2 discusses existing studies to provide the theoretical background for the study. Section 3 provides a detailed discussion of the research model and explores various assumptions. Section 4 explains the research methodology adopted. Section 5 presents a detailed discussion of the data analysis and results. Finally, Sections 6 and 7 offer discussions and conclusions, including the theoretical contribution, practical benefits to the virtual heritage domain, and future works.

Theoretical Background
This section provides detailed discussion on different domains and exemplar cases that contributed to forming the primary research objective.

Mixed Reality and Virtual Heritage
Virtual heritage is an emerging field that applies immersive reality technologies and digital tools to cultural heritage (CH) to simulate, preserve, and disseminate tangible and intangible cultural heritage assets in the form of diverse multimedia approaches. Immersive reality is one of the approaches utilised for various virtual heritage application themes ranging from virtual reconstruction to virtual museum [11,12]. For instance, mixed reality enables user-centred and personalised presentation while allowing cultural heritage assets to be digitally accessible in the form of virtual reconstruction or virtual museums or exhibitions.
Mixed reality applications are emerging in many domains, following recent advances in immersive reality technology, such as the Microsoft HoloLens device. For instance, Pollalis, Minor [18] present a mixed reality application that utilises this device to allow object-based learning through mid-air gestural interaction and virtual representations of museum artefacts. Other examples of HoloLens-based applications in the domain include [15][16][17][19][20][21][22]. Mixed reality applications in virtual heritage that utilise similar technology tend to focus on virtual reconstruction, virtual representation, and virtual exhibition.
Virtual reconstruction and representation aim at enabling users to visualise and interact with digitally reconstructed tangible and intangible heritages. Such applications allow blending historical views from the past with their current appearance. For instance, damaged architectural assets can be virtually reconstructed at their historical location. Additional information beyond the virtual reconstruction itself can also be overlaid along with the virtual elements. MR can play an important role in the restoration of lost heritages, starting from interacting with the virtual reconstruction of statues and extending to reviving cultural practices in their original forms.
Virtual museums and virtual exhibitions intend to improve visitors' experience at museums and heritage sites, typically through personalised and immersive virtual tour guidance. In general, such applications simulate and enhance museums and heritage sites, including their tangible and intangible assets.

Collaborative and Multi-Modal Interaction Methods
Collaborative interaction methods in immersive reality applications include collaboration as a default characteristic. To this effect, the interaction method integrates and synchronises various input and audio-visual display devices [23]. The objective of the collaborative interaction method is to facilitate interaction with a virtual environment that enables a shared and multiuser experience. Similarly, multi-modal interaction methods consist of multiple modes of interaction, such as speech, gaze, gesture, touch, and movement. The integration of collaborative and multi-modal interaction methods facilitates collaboration between users and provides natural interaction. Furthermore, combining this interaction method with mixed reality adds face-to-face collaboration to the experience and facilitates interaction among users. Bekele and Champion [12] (pp. 8-12) discuss collaborative and multi-modal interaction methods and how their integration with mixed reality enhances cultural learning in virtual heritage.

Collaboration, Engagement, Contextual Relationship, and Cultural Learning
Virtual environments can facilitate an enhanced cultural learning experience (Ibrahim & Ali, 2018). Interaction methods, contextual relationship, and cultural context in virtual heritage also play an important role in enhancing cultural learning [24][25][26][27][28][29][30][31]. Enhancing cultural learning in VH applications, therefore, relies on immersive reality and interaction methods to enable a contextual relationship, collaboration, and engagement between users and the virtual environment [24,32,33,34]. Existing virtual heritage applications that utilise immersive reality technologies for cultural knowledge dissemination focus on users' interaction with the applications [35][36][37][38]. For instance, mixed reality allows interaction between users and the real-virtual world. This allows virtual heritage applications to establish a contextual relationship between users and the real world. Bekele and Champion [12] (pp. 5-8) and Bekele and Champion [13] (pp. 5-7) discuss how collaborative and multi-modal mixed reality can enhance cultural learning through collaboration, engagement, and a contextual relationship in a mixed reality virtual heritage environment.

Conceptual Model
In this section, we discuss our conceptual model. Based on the theoretical background presented in the previous section, we discuss the research model (framework) that led to the design, implementation, and evaluation of clouds-based collaborative and multi-modal mixed reality [17]. Figure 1 shows our conceptual model, which presents the characteristics of collaborative and multi-modal mixed reality affecting users' cultural learning experience via collaboration, engagement, a contextual relationship, and their associated enablers. Establishing enhanced cultural learning as our objective, we also outline how the major characteristics of collaborative and multi-modal mixed reality influence cultural learning in virtual heritage applications at museums and heritage sites. We further explore how these characteristics are connected to and influence each other to attain the primary objective.

Collaborative Interaction, Collaboration (Social Interaction), Contextual Relationship, and Engagement
Collaborative interaction refers to the ability of interaction methods to enable effective and meaningful collaboration between users. As discussed in the previous section, collaboration, engagement, and contextual relationships influence cultural learning in virtual heritage. When viewed as characteristics of cultural learning in virtual heritage, collaborative interaction, therefore, can enable social interaction, a contextual relationship, and engagement. Hence, we hypothesise that collaborative interaction in mixed reality will have a positive effect on engagement, collaboration (social interaction), and a contextual relationship between users and the virtual environment.

Multi-Modal Interaction, Collaboration (Social Interaction), Contextual Relationship, and Engagement
Multi-modal interaction methods in mixed reality enable users to manipulate the virtual environment and interact with the application via multiple modes, such as gesture, speech, movement, and gaze. These characteristics lead to a more natural way of interaction that requires less effort from users. As a result, users will not be distracted by the complexity of the interaction methods, which in turn results in enhanced engagement and allows the real-virtual environment to establish a contextual relationship between users and the environment itself. Hence, we hypothesise that multi-modal interaction in mixed reality will have a positive effect on engagement and the contextual relationship between users and the virtual environment.

Collaboration (Social Interaction) and Cultural Learning
Collaboration (social interaction) in virtual environments, as discussed in the previous section, is one of the characteristics of collaborative and multi-modal mixed reality. We have discussed in the introductory section that museums are shared spaces. As such, social interaction is often implicit in the visiting experience. However, contextual cultural interaction with artefacts, displays, and related media is seldom effectively leveraged. Interaction methods in virtual heritage applications need to embrace this potential. We hypothesise that collaboration (social interaction) in mixed reality will have a positive effect on cultural learning.

Contextual Relationship and Cultural Learning
Contextual relationship is a three-way relationship between users, the real world, and the virtual environment [13]. The relationship between the virtual environment and the real world is as crucial as the social interaction between users. We hypothesise that a contextual relationship in mixed reality will have a positive effect on cultural learning.

Engagement and Cultural Learning
Engagement in virtual environments, as discussed in the previous section, is one of the characteristics of collaborative and multi-modal mixed reality. We hypothesise that engagement in mixed reality will have a positive effect on cultural learning.

Study Context
Figure 2 shows SS Xantho, launched in 1848, which is one of the world's first iron ships and Western Australia's first coastal steamer. Xantho was selected as the cultural context for the collaborative and multi-modal mixed reality application we evaluate in this paper [17,39]. It was selected because of its significance to the maritime archaeology of Western Australia (it has also been depicted in Aboriginal rock art). It was used as a "tramp steamer", pearler, and convict ship before sinking in 1872. In addition to a permanent section in the Western Australian Shipwrecks Museum, featuring the ship and related artefacts, the museum has made available 3D models of the ship and its engine, " . . . the only known example of the first high pressure, high revolution engines ever made." As part of this study, two mixed reality applications, Walkable Mixed Reality Map [14] and Clouds-based Collaborative and Multi-modal Mixed Reality [17], were designed and implemented. Both applications use the story of Xantho as their cultural context.
The evaluation in this study focuses on the Clouds-Based Collaborative and Multi-Modal Mixed Reality. By using this mixed reality application at the Western Australian Shipwrecks Museum, visitors can collaboratively interact with 3D models, videos, audio, and textual information related to Xantho. The experience is delivered to users via Microsoft HoloLens devices. Two users can collaborate and interact with the mixed reality experience at the same time. Users can choose between speech, gaze, gesture, and movement to interact with the mixed reality environment. They can interact with 3D models, read text, and play the audio and video content presented. The two users experiencing the mixed reality environment can collaborate and communicate while navigating through the story of Xantho.
The experience begins with the application asking users to provide a stage ID, which identifies the shared location onto which the mixed reality environment will be loaded. Figure 3 shows users interacting with the application (see the Supplemental Materials to view a video of the mixed reality experience). The stage ID can be set and passed to users either by a curator or by one of the participants who plays the role of a guide [17]; we invite readers to refer to that article. Once users supply the stage ID, the HoloLens devices load the mixed reality environment at the shared location and users start interacting with the environment. The experience takes approximately 15-20 min and has four segments. The first segment introduces users to Microsoft HoloLens and the interaction methods they can utilise. The introduction is delivered by a male virtual guide. After this segment, users choose to begin the story of Xantho, and the Walkable Mixed Reality Map (second segment) is then projected on the floor. At this stage, users start to explore the content collaboratively. This segment focuses on the early life of Xantho. After this stage, users can freely navigate through segment three (which focuses on the wreck of Xantho) and segment four (which focuses on the discovery of the wreck of Xantho). Interaction with content and the environment is achieved via a multi-modal interaction method that combines speech, gaze, gesture, and movement. This provided users with the flexibility of switching between different modes.
Multimodal Technol. Interact. 2021, 5, x FOR PEER REVIEW
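The stage-ID handshake described in this section can be illustrated with a minimal sketch. It is not the authors' implementation, which is documented in [17]; the names `Stage`, `StageRegistry`, `create_stage`, and `join`, the in-memory dictionary standing in for the cloud service, and the coordinate tuple used as a shared location are all assumptions made for illustration. The only details taken from the text are that a stage ID identifies a shared location and that two users share the experience at a time.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# The paper reports that two users share the experience at a time.
MAX_USERS_PER_STAGE = 2


@dataclass
class Stage:
    """A shared location at which the mixed reality environment is loaded."""
    stage_id: str
    anchor: Tuple[float, float, float]  # hypothetical shared-location coordinates
    users: List[str] = field(default_factory=list)


class StageRegistry:
    """Hypothetical in-memory stand-in for the cloud service that stores stages."""

    def __init__(self) -> None:
        self._stages: Dict[str, Stage] = {}

    def create_stage(self, stage_id: str, anchor: Tuple[float, float, float]) -> Stage:
        """Called by the curator or guide who sets the stage ID."""
        if stage_id in self._stages:
            raise ValueError(f"stage {stage_id!r} already exists")
        stage = Stage(stage_id, anchor)
        self._stages[stage_id] = stage
        return stage

    def join(self, stage_id: str, user: str) -> Stage:
        """Called when a user supplies the stage ID on their device."""
        stage = self._stages.get(stage_id)
        if stage is None:
            raise KeyError(f"unknown stage {stage_id!r}")
        if user in stage.users:
            return stage  # joining twice has no further effect
        if len(stage.users) >= MAX_USERS_PER_STAGE:
            raise RuntimeError(f"stage {stage_id!r} is full")
        stage.users.append(user)
        return stage


if __name__ == "__main__":
    registry = StageRegistry()
    registry.create_stage("xantho-demo", (0.0, 0.0, 0.0))
    for name in ("visitor-a", "visitor-b"):
        shared = registry.join("xantho-demo", name)
        print(name, "loads the environment at", shared.anchor)
```

Both users receive the same `Stage` object, so in this sketch they load the environment at the same location; a real deployment would replace the dictionary with a networked store and a spatial anchoring service.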

Measures
The instruments used for the evaluation were questionnaires and semi-structured interviews. The questionnaire used for this study had a total of nine measurement items scored on a 5-point Likert scale, one open question, and six demographic questions (see Tables 2 and 3). The measurement methods were adopted from the Technology Acceptance Model (TAM) and Bae, Jung [40]. The semi-structured interview had five predetermined questions.

Data Collection
The survey and interviews were conducted over two evaluation sessions that took place at the Western Australian Shipwrecks Museum on 7 and 14 October 2021. The evaluation was conducted by the primary author as part of his PhD research. Experts, archaeologists, curators, and researchers from the museum participated in the evaluation.
After completing the mixed reality experience, participants were given a tablet computer to respond to the questionnaire. Once responses were gathered, participants were asked five semi-structured questions. The interview was recorded on a recording device (smartphone) and transcribed for further analysis.
A total of 11 experts from different departments of the Western Australian Shipwrecks Museum participated in the evaluation. Table 2 shows demographic details of the participants. According to the data gathered from the two evaluation sessions, the majority of participants were female (6 female, 4 male, and 1 preferred not to identify gender). The majority of participants were aged between 40 and 49. With regard to participants' previous experience with immersive reality technology in general, the responses show that 7 participants were novice users, and 3 participants had never used the technology (one participant did not respond to this survey item). However, participants' responses to a survey item that asked about their previous experience with Microsoft HoloLens showed that the majority of participants were new to the technology (8 had never used the technology, and 3 were novice users).
Survey item
• I think visitors will find the system easy to use and follow.
Open and Interview Questions
• Do you think this technology can be used to enhance visitors' interest in the museum's collections?
• Were the two of you able to communicate while exploring the shared mixed reality experience?
• Were you able to interact with the system using all modes of interaction, such as gaze, speech, and gesture?
• Do you have any other thoughts or comments about your experience?

Results
In this section, we present the results obtained from analysing the data gathered from the survey items, the open question, and the semi-structured interviews. Table 3 and Figure 4 summarise the questionnaire items scored on a 5-point Likert scale. The results are grouped into three categories based on the three characteristics of collaborative and multi-modal mixed reality we identified in Sections 2 and 3.
Figure 4. This bar chart shows participants' (n = 11) responses to the questionnaire items (see Table 3) scored on a 5-point Likert scale (strongly disagree = 1, somewhat disagree = 2, neither agree nor disagree = 3, somewhat agree = 4, and strongly agree = 5).
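The Likert coding given for Figure 4 lends itself to a short worked example. The sketch below, written in Python, maps response labels to the 1-5 scores stated in the caption and summarises one questionnaire item. The function name `summarise_item`, the choice of summary statistics, and the sample responses are assumptions for illustration only; the sample responses are placeholders and are not the study's data.

```python
from statistics import mean, median
from typing import Dict, List

# Score coding as stated in the Figure 4 caption.
LIKERT_SCORES: Dict[str, int] = {
    "strongly disagree": 1,
    "somewhat disagree": 2,
    "neither agree nor disagree": 3,
    "somewhat agree": 4,
    "strongly agree": 5,
}


def summarise_item(responses: List[str]) -> dict:
    """Summarise the responses to a single Likert item.

    Labels are matched case-insensitively; an unknown label raises KeyError.
    """
    if not responses:
        raise ValueError("at least one response is required")
    labels = [r.strip().lower() for r in responses]
    scores = [LIKERT_SCORES[label] for label in labels]
    counts = {label: labels.count(label) for label in LIKERT_SCORES}
    agreeing = sum(1 for s in scores if s >= 4)  # somewhat or strongly agree
    return {
        "n": len(scores),
        "mean": mean(scores),
        "median": median(scores),
        "agree_share": agreeing / len(scores),
        "counts": counts,
    }


if __name__ == "__main__":
    # Placeholder responses, NOT taken from the study.
    sample = [
        "Strongly agree",
        "Somewhat agree",
        "Neither agree nor disagree",
        "Strongly agree",
    ]
    print(summarise_item(sample))
```

With only 11 participants, the median and the per-label counts are usually more informative than the mean, which is why the sketch returns all three.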

Collaboration (Social Interaction)
In Section 3.1, we hypothesised that collaborative interaction in mixed reality will have a positive effect on engagement, collaboration (social interaction), and the contextual relationship between users and the virtual environment. Participants' responses to the survey items "It was easy for me to collaborate with the person I shared the mixed reality experience with", "It was easy for me to share and explain what I was seeing", "It was easy for me to use speech command to interact with the system", and "It was easy for me to use gesture command to interact with the system" were used to validate whether collaborative and multi-modal interaction methods enable collaboration (social interaction) in mixed reality. The results (see Table 3 and Figure 4) indicate that the collaborative and multi-modal aspects of the mixed reality experience enable collaboration (social interaction) between users. Furthermore, participants' responses to the question "were the two of you able to communicate while exploring the shared mixed reality experience?" validate the importance of collaboration (social interaction).
For instance, one participant responded to the question saying " . . . I think it's good when people get along. They communicate in all forms from experience, communication is an important thing for the experience . . . " Similarly, the following responses from the participants underline the role collaboration plays in terms of enhancing visiting experience and cultural learning.
"And that was also because we work together? If it was two strangers that were working together on it, it might not be quite as, as easily as intuitive . . . "
Participant 6. " . . . always . . . "
Participant 7. " . . . we didn't have any collaborative experiences . . . that's because the application or the experience was already loaded . . . "
Participant 8. " . . . Yeah, look, it was because you, you know that you can see them there. And you can ask, well, how did you get there?"
Participant 9. " . . . I tried to communicate with (name omitted) and he was in his own little world . . . "
Participant 10. " . . . I think for me, because it was kind of challenging anyway, because I tried to make it work. I was focused more on what I was experiencing. I noticed that the first two ladies seem to interact quite well . . . "
Hence, based on the results obtained from the survey items and interviews, we can validate that collaborative and multi-modal interaction methods in mixed reality have positive effects on social interaction and engagement.

Engagement
Participants' responses to the survey items "It was easy for me to collaborate with the person I shared the mixed reality experience with", "It was easy for me to share and explain what I was seeing", "I enjoyed this shared mixed reality experience", "It was easy for me to relate the virtual experience with physical items in the gallery", "It was easy for me to use speech command to interact with the system", and "It was easy for me to use gesture command to interact with the system" were used to validate whether collaborative and multi-modal interaction methods enable engagement in mixed reality. The results (see Table 3 and Figure 4) indicate that collaborative and multi-modal interaction methods in mixed reality enhance users' engagement. In addition, participants' responses to the questions "were the two of you able to communicate while exploring the shared mixed reality experience?" and "were you able to interact with the system using all modes of interaction, such as gaze, speech, and gesture?" indicate that collaborative and multi-modal interaction methods enhance users' engagement in a mixed reality environment.
For instance, one participant stated that "I think having that combination (gesture and speech) is good, especially for people with disabilities". This statement shows the role that the multi-modal interaction method plays in terms of disseminating a mixed reality experience to people with different abilities and backgrounds. The following responses from participants support our assumption that a multi-modal interaction method enhances users' engagement with a mixed reality environment. Based on the results obtained from the survey items and interviews, we can validate that collaborative and multi-modal interaction methods in mixed reality have positive effects on engagement.

Contextual Relationship
Contextual relationship refers to establishing a specific relationship between users, cultural context, and the immersive reality systems. In Section 3, we hypothesised that collaborative interaction in mixed reality will have a positive effect on contextual relationship. Participants' responses to the survey items "It was easy for me to collaborate with the person I shared the mixed reality experience with" and "It was easy for me to relate the virtual experience with physical items in the gallery" were used to validate whether the collaborative interaction method enables a contextual relationship in mixed reality. The results (see Table 3 and Figure 4) indicate that collaborative interaction in mixed reality enables a contextual relationship. The results from Sections 5.2 and 5.3 can support this view because a contextual relationship is the result of collaborative and multi-modal interaction.

Enhanced Cultural Learning
In this paper, we have argued that collaboration (social interaction), engagement, and a contextual relationship in mixed reality enhance cultural learning in virtual heritage.
The results presented above show that collaborative and multi-modal interaction methods enable these characteristics. Therefore, we can conclude that collaborative and multi-modal mixed reality has a positive effect on cultural learning in virtual heritage. Furthermore, this conclusion is supported by participants' responses to the survey items "I would like to see more items from the gallery presented in the system" and "I think the experience can enhance visitors' interest to explore more collections in the museum", and by their responses to the question "do you think this technology can be used to enhance visitors' interest in the museums' collections".
The following responses from the participants validate that collaborative and multi-modal mixed reality enhances visitors' interest in learning about the museums' collections.

Discussion
The objective of this study was to validate whether collaborative and multi-modal mixed reality can facilitate enhanced cultural learning in virtual heritage. Overall, the findings support our proposed hypotheses that collaboration (social interaction), engagement, and contextual relationship in mixed reality influence cultural learning in virtual heritage. However, the findings also identify some limitations that hinder the learning experience. These limitations fall into two groups: multi-media content (cultural context) and usability.

Multi-Media Content
Participants were asked to share any thoughts or comments about their experience (only five participants responded to this open question). Their responses suggest that the experience needs improvement in terms of the multi-media content and 3D models it includes. The following suggestions were made by the participants. We believe that addressing this feedback will improve the overall cultural learning in the mixed reality experience.

Usability
Feedback received from participants suggests that visitors might find interacting with the system difficult. This is supported by the results of the evaluation: the survey item "I think visitors will find the system easy to use and follow" received the lowest score of all the items. This is to some extent influenced by a lack of previous experience with immersive reality, and with Microsoft HoloLens in particular. Table 2 shows that a total of 8 out of 11 participants had never used Microsoft HoloLens prior to the evaluation session. The following remarks were made by the participants. Based on the evaluation results and these remarks, the interaction design needs improvement. Visitors need to be presented with easy-to-understand instructions prior to engaging with the experience. The mixed reality application had a segment that provides instructions to users, but these instructions were part of the experience itself. They need to be presented to users before the experience begins. To this effect, printed material or a video that demonstrates the interaction methods of HoloLens can be used to introduce users to the overall experience.

Conclusions
In this paper, we have presented the results of an evaluation, conducted at the Western Australian Shipwrecks Museum, of a cloud-based collaborative and multi-modal mixed reality application. The application was designed and implemented with the aim of enhancing cultural learning in virtual heritage via a combination of collaborative interaction, multi-modal interaction, and mixed reality. SS Xantho, one of the world's first iron ships and Western Australia's first coastal steamer, was used as the cultural context for the evaluation. Surveys and interviews were conducted to gather data from 11 participants. The collected data were analysed to validate whether collaboration, engagement, and a contextual relationship in mixed reality enhance cultural learning in virtual heritage. The results indicate that these characteristics facilitate enhanced cultural learning in virtual heritage. Furthermore, the results were interpreted to identify limitations, suggestions, and directions for future research in the domain.

Future Directions
Immersive reality display technologies, more specifically the Microsoft HoloLens, are expensive to install in museums as permanent exhibits. Although the mixed reality application described in this article is a native Microsoft HoloLens application, it can be customised and deployed to other AR/MR headsets. Alternatively, the application can be customised for cloud-native deployment. For instance, Amazon Web Services has released a cloud-based AR/VR platform called Amazon Sumerian. This platform enables museums to create, deploy, and run browser-based 3D, AR, and VR applications. Museums can exploit this platform to disseminate their AR and VR experiences to a wider global audience. Hence, our future research will focus on customising the mixed reality application for multi-device and cloud deployment.
One of the findings of the evaluation was the difficulty of interacting with Microsoft HoloLens for first-time users. Participants in the evaluation (experts, curators, and museum professionals) suggested that the general audience of museums (visitors of various backgrounds) would find the interaction mechanisms of HoloLens (gesture, gaze, and speech) difficult to operate without prior knowledge and practice. They also suggested that the younger generation would find the interaction mechanisms relatively easy to learn. Hence, our future research will also focus on designing an interaction mechanism that is easy to learn and that accommodates different visitor demographics.

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The web platform source code and the survey results presented in this article are available on request from the corresponding author.