Article

HybridFilm: A Mixed-Reality History Tool Enabling Interoperability Between Screen Space and Immersive Environments

1 School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
2 Key Laboratory for Software Engineering of Hebei Province, Qinhuangdao 066004, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(15), 8489; https://doi.org/10.3390/app15158489
Submission received: 19 June 2025 / Revised: 23 July 2025 / Accepted: 28 July 2025 / Published: 31 July 2025
(This article belongs to the Special Issue Virtual and Augmented Reality: Theory, Methods, and Applications)

Abstract

History tools facilitate iterative data analysis by allowing users to view, retrieve, and revisit visualization states. However, traditional history tools are constrained by screen space limitations, which restrict the user’s ability to fully understand historical states and make it challenging to provide an intuitive preview of these states. Most immersive history tools, in contrast, operate independently of screen space and do not consider integration with it. This paper proposes HybridFilm, an innovative mixed-reality history tool that seamlessly integrates screen space and immersive reality. First, it expands the user’s understanding of historical states through a multi-source spatial fusion approach. Second, it introduces a “focus + context”-based multi-source spatial historical data visualization and interaction scheme. Furthermore, we assessed the usability and utility of HybridFilm through experimental evaluation. In comparison to traditional history tools, HybridFilm offers a more intuitive and immersive experience while maintaining a comparable level of interaction comfort and fluency.

1. Introduction

In recent years, the limitations of traditional desktop applications in managing digital content have become increasingly apparent, particularly when handling historical records of content revisions. These applications typically generate revision logs, each representing the content’s state at specific time points [1,2]. While these tools often provide mechanisms for users to navigate between different states, such as through text [3], thumbnails [4], or miniature 3D models [5], they face challenges in effectively conveying the temporal relationships between content states. As emphasized by Shneiderman [6], these historical tools are crucial in the visualization process, allowing users to view, retrieve, and revisit different states to support iterative analysis. However, the limited screen space on desktop devices often forces these tools to reduce the amount of information they can display, leading to a less intuitive user experience.
The challenge arises from the need to balance the main task interface, which typically occupies the majority of the screen space, with the graphical history record, often confined to smaller regions [7]. This design trade-off makes it difficult to simultaneously present semantic information about spatial data, such as 2D views and 3D models. Consequently, users attempting to understand historical states face an increased cognitive load.
Mixed reality (MR) presents a promising solution by enabling users to interact with virtual content while perceiving the real-world physical environment, offering an intuitive and immersive experience. Researchers have successfully applied 3D visualizations from existing desktop software to physical spaces for analyzing software architecture [8,9,10], performance [11,12], and project management [13]. As MR technology becomes more widely adopted, the distinction between data visualizations displayed on screens and those projected into physical space is increasingly blurring [14]. Studies have demonstrated that 3D representations outperform traditional 2D displays in terms of speed, recall, and reduced cognitive load [9]. Additionally, combining 2D displays with 3D MR environments has been shown to enhance data comprehension [15].
Building on these advancements, we propose HybridFilm, an innovative historical tool that seamlessly integrates screen space with immersive reality, enabling smooth interoperability between the two environments. By leveraging MR technology, HybridFilm offers a more intuitive and immersive approach to visualizing historical content, reducing cognitive load and enhancing users’ ability to comprehend and navigate complex data. The main contributions of this work are as follows:
  • Establishment of a Multi-Source Spatial Fusion Framework: This framework enhances desktop operations by minimizing the interference of immersive reality components, thus preventing disruptions to user workflows while interacting with desktop applications.
  • Design of a Multi-Source Spatial Historical Data Visualization and Interaction Scheme: Based on the “focus + context” model, this design utilizes spatial cognition to facilitate a deeper understanding of historical states and their temporal relationships.
  • Usability and Effectiveness Verification: Building upon the historical tools used in molecular 3D model analysis software, the usability and effectiveness of HybridFilm are validated through practical assessment.

2. Related Work

The relevant work discussed in this article can be categorized into three key areas: graphical history visualization, the integration of MR with screen space, and immersive visualization and interaction techniques.

2.1. Graphical History Visualization

To manage the state or history of digital content, it is essential to develop history mechanisms that organize history entries effectively. Heer et al. [7,16] investigated history-switching mechanisms, classifying them into two types. The first approach, known as operation logging, defines each operation along with its inverse operation. To traverse the history, the inverse operation must be performed sequentially to backtrack. The second approach logs the individual states of the application, allowing users to traverse history by switching between application states without the need for sequential backtracking. Heer et al. note that these two approaches are not mutually exclusive, and hybrid solutions are also viable. Zhang et al. [5] similarly summarized two approaches from the perspective of media file changes. One is a state-based approach, where changes are inferred by comparing two states before and after a modification. The other is an operation-based approach, which records the editing operations performed by the user and applies these operations to the states to transform them into subsequent states. In this work, we adopted the approach of recording individual application states, which offers greater flexibility in switching history. This also extends the original history mechanism of the application.
The provision of intuitive graphical visualizations and interactions for operational history has long been a key area of exploration within human–computer interaction (HCI) [7]. The type of content managed by the tool, such as text [17], 2D images [4,18], and 3D scenes [5,19,20], significantly influences these visualizations. Previous works focusing on text or 2D content (e.g., drawings and images) have explored various visual representations, such as operation layers [21], before-and-after state snapshots [22], and history timeline views [23]. Recently, Lilija et al. [20] employed 3D trajectories to visualize spatial records, while Zhang et al. [5] utilized 3D miniatures to depict the states of 3D scenes. In our approach, we aim to visualize a broad range of data types, including text, 2D images, and 3D models. Additionally, we have designed natural interactions to control objects within immersive environments.

2.2. Combination of MR and Screen Space

Immersive environments are particularly well-suited for tasks involving spatial or multidimensional data, while 2D visualizations excel at handling static and abstract information [24,25]. In recent years, researchers have begun to explore the integration of immersive environments with screen space to harness the advantages of both. Studies have shown that combining immersive environments with screen space improves users’ ability to understand data [14,15,26,27], enhances user satisfaction, and increases system use efficiency [28]. Lee et al. [14] proposed guidelines for this innovative hybrid design space, emphasizing that meaningful 2D/3D transformations offer significant benefits for immersive analytics users. Recent research has increasingly focused on exploring the design space for hybrid 2D/3D immersive analysis systems [29]. To improve interaction within these hybrid spaces, researchers have also investigated design considerations for combining the screen space of various flat devices [30,31,32] with MR environments. This trend suggests that, in the near future, the distinction between working on a desktop and working in an AR/VR space will continue to blur.
Similar to our approach, Seraji and Stuerzlinger’s work [17] addresses the interoperability between immersive systems and desktop systems, enabling the ‘pulling’ of data from 2D screens and the ‘pushing’ of visualized data to 2D screens. Narrative interaction-based works [33,34] often focus on the storyline between virtual and physical environments, creating a complete narrative experience by integrating real-world scenes with 3D spatial characters and other elements. In contrast to their work, HybridFilm is specifically designed to manage digital content for screen-based applications. It seeks to overcome the limitations of 2D historical data management by leveraging 3D spatial history data management.

2.3. Immersive Visualization and Interaction

When designing immersive interfaces and interactions, the most intuitive and natural approaches are often prioritized as these behaviors align more closely with human tendencies. For example, several studies have employed various metaphors to reduce the cognitive load associated with user learning, using elements such as tulips [35], bracelets [36], and color palettes [37], all of which serve as highly expressive metaphors. Lindlbauer et al. [38] proposed a method for automating the control of UI layouts within MR applications. In our work, we adopt the metaphor of “film” for visualization design, as film traditionally serves as a medium for recording history. Additionally, the shape of “film” is well suited to screen and desktop environments. Research has shown that positioning the UI in virtual reality (VR) relative to the desktop enhances comfort, mobility, and task performance while also reducing physical exertion and strain [39].
Both controllers and gesture recognition are prevalent interaction methods in immersive environments. For most users, directly manipulating objects in 3D space with an unconstrained hand appears to be more intuitive and natural [17], and gesture-based interactions can effectively lower conversion costs [40,41,42]. However, gesture-based interactions are often constrained by tracking performance, and they tend to be less accurate compared to controllers [43]. Wu et al. [44] proposed a novel gesture-based interaction strategy that utilizes the physical support and space provided by everyday objects for intuitive interaction. In our design, we focus on gesture interaction to reduce the switching cost between controllers and mouse–keyboard input. Simultaneously, we have designed touch interactions between virtual objects and the screen to ensure interaction smoothness.

3. HybridFilm

In contrast to the traditional tool of a single-screen space, the historical tool of multi-source space fusion entails communication and interaction across multiple devices, as illustrated in Figure 1. The subsequent sections offer a comprehensive explanation of HybridFilm’s multi-source fusion framework, along with its visualization and interaction designs.

3.1. Multi-Source Spatial Fusion Framework Design

The hardware of the multi-source fusion history tool consists of two components: the desktop and the MR system. These two components communicate with each other through the communication layer and achieve functional integration via the application layer. HybridFilm utilizes the Windows 11 operating system, an Intel Core i5-10200H quad-core processor running at 2.4 GHz, 16 GB of RAM, and an NVIDIA GeForce GTX 1650 with 8 GB of video memory. The external display has a width of 140 cm, a resolution of 1920 × 1080 pixels, and a refresh rate of 120 Hz. The MR system is powered by the HoloLens 2, a head-mounted display device designed for 3D scene operation, display, and interaction, which supports MR interaction design and applications.
The multi-source fusion framework necessitates the transfer of application historical data from desktop space to MR space. The communication and spatial positioning between multiple devices are critical factors that constrain the spatial efficiency of multi-source fusion.

3.1.1. Communication Data

In the process of collaborative interaction between devices in a multi-source fusion space, the data transmitted includes instructions, HoloLens recognition information, hardware information, and historical data.
Instructions: During the collaborative interaction between two devices in the multi-source fusion space, either device sends an interaction instruction to the other device, which then responds accordingly. For example, when a desktop operation results in an update of historical data, the corresponding display in the mixed-reality space is also updated. Instructions are divided into two types: status codes and status information. The status code corresponds to an API call in the 3D modeling editor, while the status information serves as a prompt.
HoloLens Recognition Information: This includes the recognition data from the HoloLens camera, which consists of the original coordinates of the QR code and the success of the recognition.
Hardware Information: This refers to certain parameters of the screen touch display, including physical dimensions, screen resolution, and device pixel ratio. These data are prepared for coordinate system conversion.
Historical Data: Historical data includes the node indexes of the historical model DAG and the storage information of each node. The historical data is visualized in the mixed-reality space.
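To make these four data categories concrete, the following is a minimal sketch of how such messages could be serialized as JSON for the WebSocket link described in Section 3.1.2. All field names and values here are illustrative assumptions, not the system’s actual wire format.

```python
# Illustrative only: field names and values are assumptions, not the actual
# wire format. They show how the four data categories above could be
# serialized as JSON messages for the WebSocket link (Section 3.1.2).
import json

instruction_msg = {
    "type": "instruction",
    "status_code": 101,             # assumed code mapping to a 3D-editor API call
    "status_info": "switch state",  # human-readable prompt
}

recognition_msg = {
    "type": "hololens_recognition",
    "qr_corners": [[0.12, 0.34, 1.05], [0.31, 0.34, 1.05],
                   [0.31, 0.15, 1.05], [0.12, 0.15, 1.05]],  # corner points, meters
    "recognized": True,
}

hardware_msg = {
    "type": "hardware_info",
    "physical_width_cm": 140,       # values taken from the setup in Section 3.1
    "resolution": [1920, 1080],
    "device_pixel_ratio": 1.0,
}

history_msg = {
    "type": "history",
    "current_node": "a3f9c2e",
    "edges": [["root", "a3f9c2e"]],  # DAG node indexes and their order
    "nodes": {"a3f9c2e": {"hint": "docked ligand",
                          "timestamp": "2025-06-19T10:32:00"}},
}

payload = json.dumps(instruction_msg)  # any of these can be sent over the socket
```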

3.1.2. Communication

To ensure the quality of multi-source spatial fusion communication for HybridFilm, it is implemented using the MRTK 2.8.0 [45] open-source toolkit within the Unity 2022.3.15 [46] and Python 3.9 environments. Figure 2 illustrates the overall architecture of multi-source spatial data instructions. This approach facilitates the establishment of a communication bridge between the desktop editor application and the immersive environment.
Communication Process of the Desktop Space: Initially, the desktop enters an active state, enabling WebSocket listening and continuously monitoring for incoming instructions from the other end. If no instructions are received, the WebSocket listening status is maintained. Upon receiving an instruction, the current scene information on the desktop is saved to the storage module, after which the instruction is transmitted to the other end. The system then returns to the WebSocket listening state, awaiting new instructions. This cycle repeats until the program is closed, at which point the WebSocket listening terminates.
Communication Process of the MR Space: Initially, the MR end enters an active state, enabling WebSocket listening and continuously monitoring for incoming instructions from the other end. If no instruction is received, the WebSocket listening status is maintained. Upon receiving an instruction, the system reads the accompanying data, updates the visualization scene, and then sends the instruction to the other end. Subsequently, it returns to the WebSocket listening state, awaiting new instructions. This process repeats until the program is closed, at which point the WebSocket listening terminates.
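The listening cycle on either end can be summarized by the following sketch, written with the open-source Python `websockets` package and mirroring the desktop-side process described above. `save_scene` stands in for the storage module; this is an assumed, simplified implementation, not the authors’ code.

```python
# A simplified sketch of the desktop-side listening cycle, using the
# open-source `websockets` package (pip install websockets). save_scene()
# is a placeholder for the storage module.
import asyncio
import json
import websockets

async def save_scene(msg: dict):
    # Placeholder: persist the current desktop scene to the storage module.
    print("saving scene before handling:", msg.get("status_info"))

async def handle(ws, path=None):
    # Remain in the WebSocket listening state until the connection closes.
    async for raw in ws:
        msg = json.loads(raw)                       # instruction received
        await save_scene(msg)                       # save current scene information
        await ws.send(json.dumps({                  # transmit instruction to the other end
            "type": "ack",
            "status_code": msg.get("status_code"),
        }))
        # loop back to listening for new instructions

async def main():
    async with websockets.serve(handle, "0.0.0.0", 8765):
        await asyncio.Future()                      # run until the program is closed

if __name__ == "__main__":
    asyncio.run(main())
```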

3.1.3. Spatial Positioning

The spatial positioning of MR scenes within the multi-source fusion space is determined through the HoloLens 2 coordinate system and the desktop coordinate system, as illustrated in Figure 3. The HoloLens 2 operates relative to a starting point designated as the origin. Specifically, HoloLens 2 detects a QR code placed on the display, which serves as a spatial anchor point. It then extracts the three-dimensional position data of the four corner points and boundary features of the QR code, thereby determining the position of the MR scene display relative to the HoloLens 2.
The plane coordinate system is expressed in pixels, whereas the world coordinate system is measured in meters. To ensure the accurate and effective display of multi-source fusion spatial scenes, we perform a conversion of length units, as described in Equation (1).
$$Length_{WCS} = Length_{PCS} \times PixelR \times Length_{SPP} \quad (1)$$

where $Length_{WCS}$ is the unit length of the world coordinate system; $Length_{PCS}$ is the unit length of the plane coordinate system; $PixelR$ is the ratio of physical pixels to logical pixels of the display device; and $Length_{SPP}$ (Equation (2)) is the ratio of the physical length of the monitor to the number of physical pixels along that dimension.

$$Length_{SPP} = Length_{DP} / Length_{NPP} \quad (2)$$

The coordinate positioning of objects within MR scenes is determined from the on-screen QR code coordinates. Specifically, let the recognized screen QR code coordinates be $[x_p, y_p, z_p]$ and let any point on the screen have coordinates $[x, y, z]$; the coordinate differences are obtained by Equation (3):

$$x_d = x - x_p, \quad y_d = y - y_p, \quad z_d = z - z_p \quad (3)$$

The translation matrix $T$ can then be obtained from Equations (1)–(3), as shown in Equation (4):

$$T = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ x_d \cdot PixelR \cdot Length_{SPP} & y_d \cdot PixelR \cdot Length_{SPP} & z_d \cdot PixelR \cdot Length_{SPP} & 1 \end{bmatrix} \quad (4)$$

Using Equation (4), the coordinates $[x_a, y_a, z_a]$ of any screen point $[x, y, z]$ in the multi-source fusion space are obtained as shown in Equation (5):

$$[x_a, y_a, z_a, 1] = [x, y, z, 1] \times T \quad (5)$$
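The sketch below applies Equations (1)–(5) literally, with the display parameters given in Section 3.1 (a 140 cm wide, 1920 × 1080 monitor); the function names are illustrative rather than taken from the system’s implementation.

```python
# A sketch that applies Equations (1)-(5) as written above; function names
# are illustrative, not the system's implementation.
import numpy as np

def length_spp(physical_length_m: float, physical_pixels: int) -> float:
    # Equation (2): physical monitor length per physical pixel.
    return physical_length_m / physical_pixels

def screen_to_fusion_space(point_px, qr_origin_px, pixel_ratio, spp):
    # Equation (3): coordinate differences relative to the recognized QR code.
    x_d, y_d, z_d = np.asarray(point_px, float) - np.asarray(qr_origin_px, float)
    # Equation (4): identity matrix with the scaled differences in the last row.
    T = np.identity(4)
    T[3, :3] = np.array([x_d, y_d, z_d]) * pixel_ratio * spp
    # Equation (5): apply T to the homogeneous screen point, per the formulation above.
    return (np.array([*point_px, 1.0]) @ T)[:3]

spp = length_spp(1.40, 1920)                               # Equation (2)
result = screen_to_fusion_space([960, 540, 0], [1920, 0, 0],
                                pixel_ratio=1.0, spp=spp)
print(result)   # coordinates [x_a, y_a, z_a] per Equation (5)
```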

3.1.4. Historical Mechanism Design

History tools integrated into desktop applications, such as those for editing 2D images [4] and 3D scenes [5], typically include a history mechanism for operation logging. However, certain complex operations, such as batch calculations, cannot be undone or redone.
“Focus + Context” is an information visualization technique designed to display both detailed information and an overall overview simultaneously. This technique allows users to explore specific details while preserving a global perspective by enlarging the focal area (the part of interest) and reducing the contextual area (background information).
To address the complex operational challenges associated with the operation logging history mechanism, we propose an enhanced history mechanism based on the “Focus + Context” model. This mechanism not only records the application state but also enables state traversal, allowing the application to be restored to its previously stored configuration. This approach facilitates a clearer preview of the application’s state and offers greater flexibility in switching between states without the need to sequentially perform fallback operations.
We adopt a branching model [47] to organize the history items in a tree structure, where operations performed after a fallback operation create a new branch. The underlying data structure is a Directed Acyclic Graph (DAG), where each node represents a specific application state, and the edges, along with their direction, represent the temporal order of state transitions. Each node contains several metadata attributes, including node numbers, snapshots of the scene, 3D models, timestamps, and textual hints. The node number, akin to a Git version number [3], is generated at the time of committing the state. The textual hint is provided by the user during the commit process and serves to describe the state. Furthermore, the node corresponding to the current state is identified, and new nodes and branches are created after committing a new state, appending to the current node and designating the new node as the active one.
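A minimal sketch of this state-logging, branching history mechanism follows. The class and attribute names are illustrative assumptions; the actual system additionally stores scene snapshots, 3D models, and Git-like node numbers as node metadata.

```python
# A minimal sketch (not the authors' implementation) of the state-logging,
# branching history mechanism: each commit stores an application state as a
# DAG node, and switching simply re-activates a stored node.
import time
import uuid
from dataclasses import dataclass, field

@dataclass
class HistoryNode:
    state: dict                        # stored application state
    hint: str                          # textual hint entered at commit time
    node_id: str = field(default_factory=lambda: uuid.uuid4().hex[:7])
    timestamp: float = field(default_factory=time.time)
    parent: "HistoryNode | None" = None
    children: list = field(default_factory=list)

class HistoryDAG:
    def __init__(self, initial_state: dict):
        self.current = HistoryNode(state=initial_state, hint="initial")

    def commit(self, state: dict, hint: str) -> HistoryNode:
        # Committing after a fallback appends to the current node,
        # creating a new branch, and designates the new node as active.
        node = HistoryNode(state=state, hint=hint, parent=self.current)
        self.current.children.append(node)
        self.current = node
        return node

    def switch(self, node: HistoryNode) -> dict:
        # State-based traversal: restore directly, no sequential undo required.
        self.current = node
        return node.state

history = HistoryDAG({"scene": "empty"})
a = history.commit({"scene": "A"}, "loaded model")
history.switch(history.current.parent)                 # fall back to the initial state
b = history.commit({"scene": "B"}, "edited ligand")    # sibling branch of A
```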
Figure 4 shows the changes that occur when states B and C are switched. The historical records are organized in reverse chronological order, with the creation timestamps arranged from top to bottom. The vertical alignment of the records reflects the temporal relationships between the historical states. Additionally, visual encoding using color has been employed to highlight the branch relationships of the nodes. Specifically, the current state is represented in rose red, all ancestor nodes are displayed in gray, all child nodes are shown in green, and other branch nodes are depicted in blue. These color effects are dynamic: when the current state is changed, the color scheme is updated according to the branch relationships of the newly selected node. After the update, the current node will move to the top of the page.

3.2. Visualization Design

To achieve an intuitive and natural visualization, as shown in Figure 5, we employed the metaphor of a “roll of film”. The historical record is extracted from the screen and presented as a “film”, analogous to its real-world function of recording history. The “roll of film” serves as a historical model in the form of a comic strip, encompassing snapshots and textual descriptions of historical events. Users interact seamlessly with the “film” to generate a “photo” or 3D model, which allows them to preview the scene’s state. Finally, the user selects the desired “photo” or 3D model and places it into the rectangular picture frame on the screen to complete the state transition. This design is both intuitive and engaging, adding an element of fun while enhancing the user experience.
In designing a historical representation, the graphical history should support analysis in a non-intrusive manner. The visualization should serve as the primary focal point, while the history functions as a secondary display [7]. Given the potential complexity and large size of the underlying DAG, tree diagrams can occupy significant space, detracting from the overall application interface. In contrast, linear representations are more suited to rectangular screens. Therefore, we employ a “roll of film” depiction for the historical model, presenting branching history records in a linear format, as illustrated in Figure 4 and Figure 6c. The history records are arranged vertically in reverse chronological order based on their creation timestamps, with each record item featuring a scene thumbnail and a textual hint. Due to space limitations, the “film” is constrained in terms of the number of history items, which can be navigated through a gesture we designed. Additionally, we use visual color coding to distinguish the branching relationships of nodes: rose red for the current state, gray for ancestor nodes, green for child nodes, and blue for other branching nodes. Some overly bright colors are unsuitable for immersive environments, and we have carefully adjusted these colors to ensure they are visually effective in such settings. These color effects are dynamic: when the current state is switched, the color effects are updated based on the new branching relationships, and the current node is repositioned to the first line of the page following the refresh.
HybridFilm enables users to easily preview the history without needing to switch states. We designed “photos” and 3D models for each historical record to provide detailed information about the history. As shown in Figure 6a, the “photo” panel displays the history metadata, including scene snapshots, node numbers, textual hints, and timestamps. The 3D model (Figure 6c) represents the structure as it appears in the scene. These details offer a comprehensive view of the history state and serve as helpful hints when switching states. Users can place multiple models or “photos” together simultaneously to better understand the changes between states.
The spatial arrangement of the user interface considers both immersive and desktop screen environments, with screen content taking precedence and history content serving as a supplementary display, as depicted in Figure 4. Since users need to interact with the history content, we position the “roll of film” near the right side of the screen, oriented towards the user, to facilitate natural interaction. The generated “photos” and models are placed near the left side of the screen, facing the user, allowing for easy observation of the historical details.

3.3. Interaction Design

To enhance user-friendliness and minimize the learning curve, we designed the interaction to be as intuitive and natural as possible. Switching between augmented reality (AR)/VR controllers and traditional input devices such as the mouse and keyboard can be both time-consuming and frustrating [17]. To address this, we incorporated natural gestures as a primary mode of interaction. To ensure consistency, we retained the use of the keyboard and mouse for desktop interactions, while, in the immersive environment, we primarily designed for one-handed operation. This approach mirrors the way users typically operate a computer: one hand remains on the keyboard or mouse while the other is free for other tasks, such as reaching for a drink or handling objects. We categorize the HybridFilm interactions into four types: conversion, selection, adjustment, and removal.

3.3.1. Conversion

To facilitate interoperability between the immersive environment and the screen space, we designed the following two operations:
  • Submitting: Interactions requiring detailed operations or text input have been shown to be more easily accomplished on the desktop [48]. To maintain operational consistency, we designed the submission interaction through the graphical interface on the desktop (Figure 7). First, the user fills in the Tag box with a textual prompt, then enters the command in the Command box to submit the scenario. The system then performs an automatic operation to generate a new node, which is appended to the DAG, switches the current node, and finally refreshes the visualization of the “film” in the immersive space. Additionally, the Command box supports the entry of other commands that are recognized by the system.
  • Switching: The user selects an item from the history list, then chooses the generated “photo” or model, drags it to the screen, and touches it (Figure 8). The system then performs a series of automatic operations: it first switches the current node, then switches the application scene, and finally refreshes the visualization of the “roll of film” in the immersive space. This interaction strategy is similar to the approaches proposed by Seraji and Stuerzlinger [17] and Wu et al. [44]. It leverages the existing physical support and space of real-world objects for intuitive interaction, blending the digital and physical boundaries while utilizing metaphors in augmented reality to embody the abstraction process. A minimal sketch of the resulting submit/switch round trip follows this list.
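The sketch below summarizes the round trip described above, reusing the hypothetical HistoryDAG from the Section 3.1.4 sketch. `run_editor_command` and the message fields are assumed placeholders for the editor API call and the WebSocket payload, not the authors’ actual interfaces.

```python
# A sketch of the submit flow: placeholders only, not the authors' API.
import json

def run_editor_command(command: str) -> dict:
    # Placeholder: would execute the Command-box entry via the editor's API
    # and return the resulting scene state plus a snapshot reference.
    return {"scene": command, "snapshot": f"{command}.png"}

async def submit(history, websocket, tag: str, command: str):
    state = run_editor_command(command)        # run the command on the desktop
    node = history.commit(state, hint=tag)     # append a new node, switch current
    await websocket.send(json.dumps({          # refresh the "film" in MR space
        "type": "history",
        "current_node": node.node_id,
        "hint": node.hint,
        "timestamp": node.timestamp,
    }))
    return node

# Switching runs in the opposite direction: the MR end sends the selected
# node_id, and the desktop calls history.switch(...) to restore that state.
```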

3.3.2. Selection

The selection operation is used to choose objects in the immersive space, including both historical items and details. We employed a hand ray approach, in which a ray is emitted from the index finger. This ray is used to align the manipulated object, and the object is selected when the index finger and thumb are pinched together, a method found to be the most accurate [49]. The object can be dragged by maintaining the pinch gesture between the index finger and thumb. However, hand rays can visually interfere with mouse and keyboard operations and may potentially cause false touches. To mitigate this, we designed a gesture involving the sliding of the index finger and thumb to toggle the hand rays on and off. This gesture enhances the haptic experience.
To deactivate the hand rays, the tip of the thumb must touch the first, second, and third joints of the index finger sequentially within a brief period, and vice versa (Figure 9a,c). Let the distances between the tip of the thumb and the first, second, and third joints of the index finger be $d_1$, $d_2$, and $d_3$, respectively. Let $t$ denote the moment in time, $D$ the distance threshold, and $T$ the time threshold. If $t = t_i$, then $d_1 = d_{1,i}$, $d_2 = d_{2,i}$, and $d_3 = d_{3,i}$ $(i = 0, 1, 2, \ldots)$. The conditions for toggling the hand rays on and off as the moments $t_i$, $t_j$, and $t_k$ pass in sequence are as follows (a code sketch of this detection appears after the list):
  • Opening: when $t_k - t_i < T$ and $d_{3,i} < D$, $d_{2,j} < D$, $d_{1,k} < D$.
  • Closing: when $t_k - t_i < T$ and $d_{1,i} < D$, $d_{2,j} < D$, $d_{3,k} < D$.
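As referenced above, the following sketch illustrates the detection logic. Joint positions are assumed to come from the hand-tracking API as 3D points, and the thresholds D and T are illustrative values rather than the system’s tuned parameters.

```python
# A sketch of the hand-ray toggle detection; D and T are assumed values.
import time
import numpy as np

D = 0.015   # distance threshold in meters (assumed)
T = 1.0     # time window in seconds (assumed)

class HandRayToggle:
    def __init__(self):
        self.touches = []   # (timestamp, index-finger joint touched by the thumb tip)

    def update(self, thumb_tip, index_joints):
        """index_joints: positions of the 1st, 2nd, and 3rd index-finger joints."""
        now = time.time()
        for j, joint in enumerate(index_joints, start=1):
            close = np.linalg.norm(np.asarray(thumb_tip) - np.asarray(joint)) < D
            if close and (not self.touches or self.touches[-1][1] != j):
                self.touches.append((now, j))
        # keep only events inside the window, so t_k - t_i < T holds for a match
        self.touches = [(t, j) for t, j in self.touches if now - t < T]
        seq = [j for _, j in self.touches]
        if seq[-3:] == [3, 2, 1]:
            self.touches.clear()
            return "open"    # thumb swept joint 3 -> 2 -> 1 within T
        if seq[-3:] == [1, 2, 3]:
            self.touches.clear()
            return "close"   # thumb swept joint 1 -> 2 -> 3 within T
        return None
```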

3.3.3. Adjustment

Due to space constraints, the “roll of film” may not be able to display the entire history and may need to be flipped. The gesture of sliding the index finger and thumb is used to control the scrolling of the “film” both upward and downward. This gesture is analogous to turning a real wheel or knob. The second joint of the index finger sequentially touches the first and second joints of the thumb to indicate upward scrolling, and vice versa (Figure 9b), with one history item being scrolled at a time. Let the distances between the second joint of the index finger and the first and second joints of the thumb be denoted as $x_1$ and $x_2$, respectively. Let $t$ denote the moment in time, $X$ the distance threshold, and $B$ indicate whether the hand ray is active ($B = \text{True}$) or inactive ($B = \text{False}$). If $t = t_i$, then $x_1 = x_{1,i}$ and $x_2 = x_{2,i}$ for $(i = 0, 1, 2, \ldots)$. The “film” passes through the moments $t_i$ and $t_j$ in sequence under the following two conditions (a code sketch follows the list):
  • Scrolling up: when $B = \text{False}$, $x_{1,i} < X$, and $x_{2,j} < X$.
  • Scrolling down: when $B = \text{False}$, $x_{2,i} < X$, and $x_{1,j} < X$.
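As referenced above, the scrolling conditions can be sketched in the same style as the hand-ray toggle; X is an assumed threshold, and the event list is assumed to be built from per-frame distance tests.

```python
# A sketch of the scrolling-gesture conditions; X is an assumed threshold.
import numpy as np

X = 0.015   # distance threshold in meters (assumed)

def touched(index_joint2, thumb_joint):
    # per-frame distance test used to build the chronological event list
    return np.linalg.norm(np.asarray(index_joint2) - np.asarray(thumb_joint)) < X

def scroll_direction(events, ray_active: bool):
    """events: chronological list of thumb-joint indices (1 or 2) touched by the
    second joint of the index finger; one history item is scrolled per gesture."""
    if ray_active or len(events) < 2:      # B must be False for scrolling
        return None
    if events[-2:] == [1, 2]:
        return "up"                        # thumb joint 1 touched, then joint 2
    if events[-2:] == [2, 1]:
        return "down"                      # thumb joint 2 touched, then joint 1
    return None
```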

3.3.4. Removal

Redundant 3D objects in immersive environments can impair visual clarity. To address this, we designed a virtual collision detector to remove such objects from the space. The virtual collision detector is modeled after a real-life garbage bin (Figure 10), and redundant objects are deleted by dragging and touching them.

4. Experiments and Results

In this section, we selected PyMOL 3.1, a desktop application for 3D model manipulation, to evaluate the usefulness and effectiveness of HybridFilm. This was conducted through questionnaire surveys and interviews based on comparative experiments.

4.1. Desktop Application for Experiment

We selected the molecular graphics system PyMOL (Figure 11) as a case study to evaluate HybridFilm’s performance with users.
The reasons for choosing PyMOL are as follows:
  • It supports both 2D and 3D content.
  • It offers an API that allows users to write scripts, making it highly extensible and compatible with our program architecture.
  • It includes the traditional desktop history tool, PyMOL Scenes, which facilitates comparability.
As shown in Table 1, PyMOL Scenes (Figure 11) offers functions for history submission, switching, and previewing. In contrast, HybridFilm (Figure 5) provides additional immersive functionality. It supports natural interaction with history previews, allowing users to freely arrange different history previews for comparison. Furthermore, unlike PyMOL Scenes, which requires switching between history states before zooming, HybridFilm enables zooming directly within the history previews.

4.2. Experimental Setup and Participants

In this study, we employed a HoloLens 2 HMD connected via cable to a PC running Windows 11, equipped with an Intel Core i5-10200H CPU, 16 GB of RAM, and an NVIDIA GeForce GTX 1650 GPU. Recruitment for the experiment was conducted through the campus network, resulting in 16 student volunteers. Among them, 12 were male and 4 were female. All participants were computer science majors and right-handed. The participants’ ages ranged from 21 to 25 years (M = 23.9, SD = 1.1). All had normal vision and were able to perceive objects and text cues displayed by the device.
To assess participants’ proficiency with immersive devices, we administered a pre-study questionnaire that categorized proficiency into four levels: never used, beginner, intermediate, and expert. The survey results revealed that 3 participants had never used immersive devices, 7 were beginners, 4 were intermediate, and 2 were experts in terms of proficiency with any VR/AR/MR device. In addition, 6 participants had never used gesture-recognition devices, 4 were beginners, 5 were intermediate, and 1 was an expert in this area. The participants were assigned numbers from P1 to P16.

4.3. Experimental Procedure

The experiment was divided into two parts, as shown in Figure 12: the introduction and usage of HybridFilm in the first part, and the introduction and usage of the PyMOL Scenes functionality in the second part. To mitigate potential learning effects, the order of the experimental tasks was randomly assigned to participants by drawing lots. The participants in the experiment were randomly assigned to two groups, with 8 individuals in each group. A 5 min break was provided between the two parts.
The first part involved the introduction and usage of HybridFilm. The introduction lasted approximately 5 min, during which participants were briefed on the research objectives and significance, followed by a demonstration of the system’s various features. Participants were then instructed to wear the head-mounted display and engage with the system, including tasks such as submitting and switching scenes, for about 10 min. Upon completing this portion, participants were asked to remove the HMD and fill out a questionnaire assessing the system’s usability and usefulness.
The second part focused on the introduction and usage of the PyMOL Scenes functionality. Similar to the first part, the introduction lasted around 5 min, followed by a demonstration of the PyMOL Scenes features. Participants then used these features for approximately 10 min. Afterward, participants completed the user experience questionnaire, a widely used tool in user experience research, which assessed their subjective experience with the system and the PyMOL Scenes functionality.
The user experience questionnaire completed by participants evaluates various dimensions of subjective experience using a 7-point Likert scale [50]. Finally, participants were interviewed in a semi-structured format to discuss the benefits and challenges of using the system. The interview lasted approximately 5 min. All survey data were subsequently compiled and analyzed.

4.4. Experimental Results

4.4.1. Usability and Utility of HybridFilm

We customized a questionnaire based on a 7-point Likert scale to assess the usability and utility of HybridFilm. The rating descriptions of the questionnaire are presented in Table 2, while Figure 13a and Figure 13b show the average scores for each subject of the usability and utility metrics of HybridFilm, respectively. The ratings shown in Figure 13 reveal that the participants generally agreed on the usability and utility of HybridFilm (average scores ≥ 6.0). Notably, the subjects ‘Preview’ under Usability and ‘Submit’, ‘Switch’, and ‘Preview’ under Utility received higher levels of approval, with average scores exceeding 6.5. Additionally, we incorporated feedback from retrospective interviews, which are summarized below.
The participants found it beneficial to use MR for managing the historical state of digital content on their desktops as it allowed them to commit digital content and restore it to a previous state when needed, as well as view previews between different states. P1 remarked, “This feature is useful, and I believe 3D content can be better understood using MR than 2D displays.” P6 noted, “It was both cool and useful to integrate virtual content with the screen to assist in using the computer.” The participants also found the system easy to use, with tasks such as submission and switching being relatively straightforward. In comparison with other history tools, the participants offered the following feedback: P4: “When I use other related history tools, there is only a simple 2D view preview that can only be accessed after switching, which is not very intuitive.” Another participant stated, P11: “It requires some learning effort, but once learned, this interaction is easy.” The participants felt that our system effectively facilitated the observation of changes between two states and the evolution between individual states, enabling them to easily comprehend the temporal relationships that preceded these changes. Regarding the system’s applicability, P7 commented, “The color transformation mechanism is very clever and enables easy observation of the before-and-after relationships between individual states.” P8 added, “Using MR provides a clearer view of the structure of the 3D model.”
The comments from all the participants were used to construct a word cloud to evaluate HybridFilm. After removing common stopwords, the comments were classified into three categories: positive feedback, neutral feedback, and negative feedback. The resulting word cloud is presented in Figure 14. The analysis of the word cloud indicates that the majority of the participant comments are concentrated in the positive feedback category (orange), with a small proportion of neutral feedback (blue), and no negative feedback. This suggests that the system received an overall favorable response from the participants.

4.4.2. Comparative Evaluation

We compared the subjective experience of using HybridFilm with that of the traditional desktop history tool PyMOL Scenes using a 7-point Likert scale, with the results presented in Figure 15. In conjunction with the retrospective interviews, the following provides a summary of the participants’ subjective evaluations for each item in the questionnaire.
To verify whether the differences between HybridFilm and PyMOL were statistically significant, we used the Wilcoxon signed-rank test. Since the comparison was based on evaluation data from the same users, the data are related. Therefore, we chose the Wilcoxon test to compare the differences between the two tools.
Wilcoxon Signed-Rank Test Calculation Method: We computed the test statistics by ranking the absolute differences between the pairs of scores and then determining the sum of the ranks for the positive and negative differences. The null hypothesis for this test assumes that there is no significant difference between the two tools, while the alternative hypothesis suggests a significant difference.
p-value Threshold: We set the significance level at α = 0.05. Therefore, if the p-value is less than 0.05, we reject the null hypothesis, indicating that the two groups exhibit a significant difference.
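For reference, this paired test can be reproduced with SciPy as in the sketch below; the score arrays are placeholder values, not the study’s actual data.

```python
# A minimal sketch of the paired comparison using scipy.stats.wilcoxon.
# The per-participant Likert scores below are placeholders.
from scipy.stats import wilcoxon

hybridfilm = [6, 7, 6, 5, 7, 6, 6, 7, 5, 6, 7, 6, 6, 7, 6, 5]   # e.g., Intuitiveness
pymol      = [5, 5, 5, 4, 6, 5, 5, 6, 4, 5, 6, 5, 5, 6, 5, 4]

stat, p = wilcoxon(hybridfilm, pymol)   # paired, two-sided by default
print(f"W = {stat}, p = {p:.3f}")
if p < 0.05:
    print("reject H0: the two tools differ significantly on this criterion")
else:
    print("fail to reject H0: no significant difference on this criterion")
```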
In the results shown in Figure 15, we marked the statistical significance, with an asterisk indicating that the p-value is less than 0.05, signifying that the difference between HybridFilm and PyMOL is statistically significant for that criterion. For example, for intuitiveness, the Wilcoxon test yielded a p-value of 0.026, meaning that HybridFilm performs significantly better than PyMOL on this criterion. For eye comfort, however, the p-value was 0.143, indicating that the difference is not statistically significant. Through the Wilcoxon test and p-value analysis, we can clearly observe that HybridFilm demonstrates significant advantages over PyMOL in intuitiveness, immersion, and satisfaction, whereas for the other three criteria the differences between the two tools are minimal.
Overall, HybridFilm demonstrates comparable performance to traditional desktop history recording tools in the subjects of eye comfort, hand comfort, and fluency while offering superior intuitiveness, immersion, and user satisfaction.

5. Discussion and Future Work

The results demonstrate that HybridFilm offers superior usability and utility, being more intuitive, immersive, and providing higher user satisfaction compared to traditional desktop history tools. Additionally, it is comparable to traditional tools in terms of eye and hand comfort, as well as fluency. In this section, we discuss the design implications of HybridFilm based on these findings, along with its limitations and potential directions for future research.

5.1. Discussion

HybridFilm integrates screen space with an immersive environment, aligning virtual content with the screen space and facilitating seamless interaction. This design ensures a more cohesive experience, achieving a true fusion of reality and virtuality. Evaluations indicate that HybridFilm outperforms traditional history tools by addressing their lack of intuitiveness while not interfering with primary on-screen tasks. Furthermore, HybridFilm enhances usability and utility by extending the features of desktop applications through its history tool submission, switching, and previewing functionalities.
The “roll of film” represents a novel immersive approach to visualizing historical data. While immersive visualizations of graphical data already exist, HybridFilm was assessed and found to share similar advantages, including increased intuitiveness and immersion. Additionally, HybridFilm’s visualization optimizes the screen space, with the linear nature of the “film” both illustrating the relationship between states and fitting the rectangular screen format without disrupting the main task.
During the design phase, gesture-based interaction was initially perceived as potentially more fatiguing due to the additional movement required compared to traditional mouse or keyboard operations. Additionally, switching between mouse/keyboard and gesture input was expected to introduce a transition challenge, although this was considered less demanding than switching between mouse/keyboard and VR/AR controllers. The evaluations, however, revealed that HybridFilm’s comfort and fluency were comparable to traditional desktop history tools, an outcome that was unexpected.

5.2. Limitations and Future Work

Current head-mounted displays remain far from the ideal of a non-invasive device that can be worn all day, and we are presently limited to the HoloLens 2. Due to its 52° field of view (FOV) and the limited accuracy of its gesture-tracking camera, it is not possible to fully capture motion outside the camera’s FOV, which compromises the overall user experience. Furthermore, performance constraints have prevented the display of more detailed and complex 3D models.
We are considering storing historical data and associated metadata on a cloud server to support multi-user sharing and asynchronous collaborative work, akin to Git. However, the viability of this approach remains uncertain, and further research will be conducted to assess its feasibility and refine the concept. In the interim, we are exploring the possibility of introducing domain-specific features. Although the initial concept was validated using a molecular graphics system as a case study, HybridFilm currently lacks specialized features tailored to the molecular domain. We are considering collaborating with experts in the molecular field to incorporate functionalities such as model dissection for the study of molecular structures. Additionally, modeling and editing software such as Blender and 3ds Max may theoretically align with our program architecture, although their implementation will require targeted modifications, which we plan to explore in future research.
Finally, the participants in our experiment were college students, who are typically highly receptive to and adaptable to novel interaction paradigms. As such, the performance of younger or older individuals, or those with differing educational backgrounds, remains unknown. Therefore, it is necessary to extend this research to a larger and more diverse user population. While we have evaluated the usability and utility of the system, future work could expand the assessment to include additional parameters such as accuracy, consistency, and scalability. The current comparative evaluation, although somewhat subjective, stems from the fact that PyMOL is a classic desktop tool, whereas HybridFilm is designed with a different purpose in mind. The two differ in both technical approaches and application scenarios. However, this disparity has pointed us toward future research directions. In subsequent work, we plan to design a quantitative evaluation framework tailored to specific application domains. By collecting more precise and objective performance data, we aim to further highlight HybridFilm’s core advantages in immersive interaction, multidimensional data presentation, and other aspects compared to similar immersive tools, thus providing a stronger foundation for its deeper application in professional fields.

6. Conclusions

This paper proposes a novel history tool, HybridFilm, which offers users an innovative interface that facilitates interoperability between screen space and immersive environments. This interaction enables the management of the historical state of digital content in desktop applications. HybridFilm is designed to provide a more intuitive experience for users in navigating the details of historical states. We evaluated the usability and utility of HybridFilm through a case study, comparing the subjective user experience with that of traditional desktop history tools. The results indicate that HybridFilm is both usable and useful, offering a more intuitive and immersive experience than traditional tools while maintaining similar levels of comfort and smoothness in use. Additionally, HybridFilm achieved higher user satisfaction. Moving forward, we plan to extend HybridFilm to specific domains and will design and implement domain-specific evaluations to further assess its effectiveness.

Author Contributions

Conceptualization, L.Z. and D.G.; methodology, L.Z.; software, M.Z.; validation, L.Z., M.Z., and Y.L.; writing—original draft preparation, L.Z. and M.Z.; writing—review and editing, L.Z.; supervision, D.G.; funding acquisition, D.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported in part by the National Science Foundation of China (Grant No. 61802334), Natural Science Foundation of Hebei Province (F2022203015 and F2025203008), and Innovation Capability Improvement Plan Project of Hebei Province (22567637H).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Dataset available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Rupp, D.; Kuhlen, T.; Weissker, T. TENETvr: Comprehensible Temporal Teleportation in Time-Varying Virtual Environments. In Proceedings of the 2023 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Sydney, Australia, 16–20 October 2023; pp. 922–929. [Google Scholar]
  2. Battut, A.; Ratovo, K.; Beaudouin-Lafon, M. OneTrace: Improving Event Recall and Coordination with Cross-Application Interaction Histories. Int. J. Human–Comput. Interact. 2025, 41, 3241–3258. [Google Scholar] [CrossRef]
  3. Git. Git-Scm.com. 2024. Available online: https://git-scm.com/ (accessed on 18 June 2025).
  4. Chen, H.T.; Wei, L.Y.; Chang, C.F. Nonlinear revision control for images. ACM Trans. Graph. (TOG) 2011, 30, 105. [Google Scholar] [CrossRef]
  5. Zhang, L.; Agrawal, A.; Oney, S.; Guo, A. Vrgit: A version control system for collaborative content creation in virtual reality. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, Hamburg, Germany, 23–28 April 2023; pp. 1–14. [Google Scholar]
  6. Sauermann, L.; Bernardi, A.; Dengel, A. Overview and Outlook on the Semantic Desktop. In Proceedings of the Semantic Desktop Workshop, Galway, Ireland, 6 November 2005; Volume 175, pp. 1–18. [Google Scholar]
  7. Heer, J.; Mackinlay, J.; Stolte, C.; Agrawala, M. Graphical histories for visualization: Supporting analysis, communication, and evaluation. IEEE Trans. Vis. Comput. Graph. 2008, 14, 1189–1196. [Google Scholar] [CrossRef]
  8. Hoff, A.; Seidl, C.; Lanza, M. Immersive Software Archaeology: Exploring Software Architecture and Design in Virtual Reality. In Proceedings of the 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) IEEE, Rovaniemi, Finland, 12–15 March 2024; pp. 47–51. [Google Scholar]
  9. Mehra, R.; Sharma, V.S.; Kaulgud, V.; Podder, S.; Burden, A.P. Towards immersive comprehension of software systems using augmented reality. In Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, Melbourne, Australia, 21–25 December 2020; pp. 1267–1269. [Google Scholar]
  10. Kreber, L.; Diehl, S.; Weil, P. Idevelopar: A programming interface to enhance code understanding in augmented reality. In Proceedings of the 2022 Working Conference on Software Visualization (VISSOFT), Limassol, Cyprus, 3–4 October 2022; pp. 87–95. [Google Scholar]
  11. Merino, L.; Hess, M.; Bergel, A.; Nierstrasz, O.; Weiskopf, D. Perfvis: Pervasive Visualization in Immersive Augmented Reality for Performance Awareness. arXiv 2019, arXiv:1904.06399. [Google Scholar]
  12. Waller, J.; Wulf, C.; Fittkau, F.; Döhring, P.; Hasselbring, W. Synchrovis: 3d visualization of monitoring traces in the city metaphor for analyzing concurrency. In Proceedings of the 2013 First IEEE Working Conference on Software Visualization (VISSOFT), Eindhoven, The Netherlands, 27–28 September 2013; pp. 1–4. [Google Scholar]
  13. Sharma, V.S.; Mehra, R.; Kaulgud, V.; Podder, S. An extended reality approach for creating immersive software project workspaces. In Proceedings of the 2019 IEEE/ACM 12th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE), Montreal, QC, Canada, 27 May 2019; pp. 27–30. [Google Scholar]
  14. Lee, B.; Cordeil, M.; Prouzeau, A.; Jenny, B.; Dwyer, T. A design space for data visualisation transformations between 2d and 3d in mixed-reality environments. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, New Orleans, LA, USA, 29 April–5 May 2022; pp. 1–14. [Google Scholar]
  15. Cavallo, M.; Dolakia, M.; Havlena, M.; Ocheltree, K.; Podlaseck, M. Immersive insights: A hybrid analytics system for collaborative exploratory data analysis. In Proceedings of the 25th ACM Symposium on Virtual Reality Software and Technology, Parramatta, NSW, Australia, 12–15 November 2019; pp. 1–12. [Google Scholar]
  16. Myers, B.A. Interaction Techniques–History, Design and Evaluation. In Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 11–16 May 2024; pp. 1–3. [Google Scholar]
  17. Seraji, M.R.; Stuerzlinger, W. Hybridaxes: An immersive analytics tool with interoperability between 2d and immersive reality modes. In Proceedings of the 2022 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), Singapore, 17–21 October 2022; pp. 155–160. [Google Scholar]
  18. Chen, H.T.; Wei, L.Y.; Hartmann, B.; Agrawala, M. Data-driven adaptive history for image editing. In Proceedings of the 20th ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, Redmond, WA, USA, 26–28 February 2016; pp. 103–111. [Google Scholar]
  19. Doboš, J.; Steed, A. 3D revision control framework. In Proceedings of the 17th International Conference on 3D Web Technology, New York, NY, USA, 4 August 2012; pp. 121–129. [Google Scholar]
  20. Lilija, K.; Pohl, H.; Hornbæk, K. Who put that there? temporal navigation of spatial recordings by direct manipulation. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 1–11. [Google Scholar]
  21. Myers, B.A.; Lai, A.; Le, T.M.; Yoon, Y.; Faulring, A.; Brandt, J. Selective undo support for painting applications. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, Seoul, Republic of Korea, 18–23 April 2015; pp. 4227–4236. [Google Scholar]
  22. Zong, J.; Barnwal, D.; Neogy, R.; Satyanarayan, A. Lyra 2: Designing interactive visualizations by demonstration. IEEE Trans. Vis. Comput. Graph. 2020, 27, 304–314. [Google Scholar] [CrossRef] [PubMed]
  23. Salvati, G.; Santoni, C.; Tibaldo, V.; Pellacini, F. Meshhisto: Collaborative modeling by sharing and retargeting editing histories. ACM Trans. Graph. (TOG) 2015, 34, 205. [Google Scholar] [CrossRef]
  24. Liu, Y.; Liao, S.; Jin, Y.; Ma, M.; Tang, W. Embodied Cognition and MR-Based Interactive Narrative Design: The Case of ‘Encountering Sanmao’ at the Former Residence of Zhang Leping. In Proceedings of the 2024 10th International Conference on Virtual Reality (ICVR), Bournemouth, UK, 24–26 July 2024; pp. 153–160. [Google Scholar]
  25. Jin, Y.; Ma, M.; Liu, Y. Comparative study of HMD-based virtual and augmented realities for immersive museums: User acceptance, medium, and learning. ACM J. Comput. Cult. Herit. 2024, 17, 13. [Google Scholar] [CrossRef]
  26. Wang, X.; Besançon, L.; Rousseau, D.; Sereno, M.; Ammi, M.; Isenberg, T. Towards an understanding of augmented reality extensions for existing 3D data analysis tools. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 1–13. [Google Scholar]
  27. Mandalika, V.B.H.; Chernoglazov, A.I.; Billinghurst, M.; Bartneck, C.; Hurrell, M.A.; Ruiter, N.D.; Butler, A.P.H.; Butler, P.H. A hybrid 2D/3D user interface for radiological diagnosis. J. Digit. Imaging 2018, 31, 56–73. [Google Scholar] [CrossRef] [PubMed]
  28. Zhu-Tian, C.; Tong, W.; Wang, Q.; Bach, B.; Qu, H. Augmenting static visualizations with paparvis designer. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 1–12. [Google Scholar]
  29. Fröhler, B.; Anthes, C.; Pointecker, F.; Friedl, J.; Schwajda, D.; Riegler, A.; Tripathi, S.; Holzmann, C.; Brunner, M.; Jodlbauer, H.; et al. A Survey on Cross-Virtuality Analytics. Comput. Graph. Forum 2022, 41, 465–494. [Google Scholar]
  30. Zhu, F.; Grossman, T. Bishare: Exploring bidirectional interactions between smartphones and head-mounted augmented reality. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 1–14. [Google Scholar]
  31. Langner, R.; Satkowski, M.; Büschel, W.; Dachselt, R. Marvis: Combining mobile devices and augmented reality for visual data analysis. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, Online, 8–13 May 2021; pp. 1–17. [Google Scholar]
  32. Reipschlager, P.; Flemisch, T.; Dachselt, R. Personal augmented reality for information visualization on large interactive displays. IEEE Trans. Vis. Comput. Graph. 2020, 27, 1182–1192. [Google Scholar] [CrossRef] [PubMed]
  33. Lee, Y.J.; Ji, Y.G. Effects of visual realism on avatar perception in immersive and non-immersive virtual environments. Int. J. Human–Comput. Interact. 2025, 41, 4362–4375. [Google Scholar] [CrossRef]
  34. Bem, M.; Chabot, S.R.; Brooks, V.; Braasch, J. Enhancing museum experiences: Using immersive environments to evaluate soundscape preferences. J. Acoust. Soc. Am. 2025, 157, 1097–1108. [Google Scholar] [CrossRef] [PubMed]
  35. Bowman, D.A.; Wingrave, C.A. Design and evaluation of menu systems for immersive virtual environments. In Proceedings of the IEEE Virtual Reality 2001 IEEE, Yokohama, Japan, 13–17 March 2001; pp. 149–156. [Google Scholar]
  36. Reiter, K.; Pfeuffer, K.; Esteves, A.; Mittermeier, T.; Alt, F. Look & turn: One-handed and expressive menu interaction by gaze and arm turns in vr. In Proceedings of the 2022 Symposium on Eye Tracking Research and Applications, Seattle, WA, USA, 8–11 June 2022; pp. 1–7. [Google Scholar]
  37. Chen, X.; Guo, D.; Feng, L.; Chen, B.; Liu, W. Compass+ Ring: A Multimodal Menu to Improve Interaction Performance and Comfortability in One-handed Scenarios. In Proceedings of the 2023 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Sydney, Australia, 16–20 October 2023; pp. 473–482. [Google Scholar]
  38. Lindlbauer, D.; Feit, A.M.; Hilliges, O. Context-aware online adaptation of mixed reality interfaces. In Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology, New Orleans, LA, USA, 20–23 October 2019; pp. 147–160. [Google Scholar]
  39. Cheng, Y.F.; Luong, T.; Fender, A.R.; Streli, P.; Holz, C. ComforTable user interfaces: Surfaces reduce input error, time, and exertion for tabletop and mid-air user interfaces. In Proceedings of the 2022 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Singapore, 17–21 October 2022; pp. 150–159. [Google Scholar]
  40. Satriadi, K.A.; Ens, B.; Cordeil, M.; Jenny, B.; Czauderna, T.; Willett, W. Augmented reality map navigation with freehand gestures. In Proceedings of the 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), Osaka, Japan, 23–27 March 2019; pp. 593–603. [Google Scholar]
  41. Wentzel, J.; Lakier, M.; Hartmann, J.; Falah Shazib Géry Casiez Vogel, D. A Comparison of Virtual Reality Menu Archetypes: Raycasting, Direct Input, and Marking Menus. IEEE Trans. Vis. Comput. Graph. 2024, 1–15. [Google Scholar] [CrossRef] [PubMed]
  42. Li, Y.; Fischer, F.; Dwyer, T.; Ens, B.; Crowther, R.; Kristensson, P.O.; Tag, B. Alphapig: The nicest way to prolong interactive gestures in extended reality. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, Yokohama Japan, 26 April–1 May 2025; pp. 1–14. [Google Scholar]
  43. Huang, Y.J.; Liu, K.Y.; Lee, S.S.; Yeh, I.C. Evaluation of a hybrid of hand gesture and controller inputs in virtual reality. Int. J. Human–Comput. Interact. 2021, 37, 169–180. [Google Scholar] [CrossRef]
  44. Wu, S.; Byrne, D.; Steenson, M.W. “Megereality”: Leveraging Physical Affordances for Multi-Device Gestural Interaction in Augmented Reality. In Proceedings of the Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 1–4. [Google Scholar]
  45. Mixed Reality Toolkit (MRTK) 2.8.0. 2023. Microsoft. Available online: https://github.com/microsoft/MixedRealityToolkit-Unity/releases/tag/v2.8.0 (accessed on 18 June 2025).
  46. Unity 2022.3.15. Unity Technologies. 2023. Available online: https://unity.com/releases/editor/whats-new/2022.3.15 (accessed on 13 July 2024).
  47. Robinson, A.C.; Weaver, C. Re-visualization: Interactive visualization of the process of visual analysis. In Proceedings of the Workshop on Visualization, Analytics & Spatial Decision Support at the GIScience Conference, Muenster, Germany, 20 September 2006; pp. 1–21. [Google Scholar]
  48. Dube, T.J.; Arif, A.S. Text entry in virtual reality: A comprehensive review of the literature. In Proceedings of the Human-Computer Interaction. Recognition and Interaction Technologies: Thematic Area, HCI 2019, Held as Part of the 21st HCI International Conference, HCII 2019, Orlando, FL, USA, 26–31 July 2019; Proceedings, Part II 21. Springer International Publishing: Cham, Switzerland, 2019; pp. 419–437. [Google Scholar]
  49. He, Y.; Hu, Y.; Feng, H.; Li, C.; Shen, X. Comparative Analysis of 3D Interactive Modes in Different Object Layouts in Mixed Reality. In Proceedings of the Ninth International Symposium of Chinese CHI, Online, 16–17 October 2021; pp. 120–126. [Google Scholar]
  50. Joshi, A.; Kale, S.; Chandel, S.; Pal, D.K. Likert Scale: Explored and Explained. Br. J. Appl. Sci. Technol. 2015, 7, 396–403. [Google Scholar] [CrossRef]
Figure 1. Framework comparison of the HybridFilm multi-source space and the traditional single-screen space.
Figure 2. Instructions and data flow framework between desktop space and MR space in HybridFilm.
Figure 3. Spatial positioning of MR scenes based on the HoloLens 2 and desktop coordinate systems.
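To make the alignment depicted in Figure 3 concrete, the following minimal Python sketch maps a point expressed in a desktop (monitor) coordinate frame into the headset's world frame through a single rigid transform. The transform, frame names, and example values are illustrative assumptions, not the paper's implementation.

import numpy as np

def make_rigid_transform(rotation, translation):
    # Build a 4x4 homogeneous transform from a 3x3 rotation matrix and a translation vector.
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

def desktop_to_world(point_desktop, T_world_desktop):
    # Map a 3D point from the desktop frame into the MR world frame.
    p = np.append(point_desktop, 1.0)   # homogeneous coordinates
    return (T_world_desktop @ p)[:3]

# Hypothetical calibration: the monitor frame sits 1.2 m in front of the headset origin,
# rotated 180 degrees about the vertical axis so its +Z axis faces the user.
R = np.array([[-1.0, 0.0, 0.0],
              [ 0.0, 1.0, 0.0],
              [ 0.0, 0.0, -1.0]])
t = np.array([0.0, 0.0, 1.2])
T_world_desktop = make_rigid_transform(R, t)

corner_on_screen = np.array([0.25, 0.15, 0.0])   # point in the desktop frame (metres)
print(desktop_to_world(corner_on_screen, T_world_desktop))   # position in the world frame

In practice such a transform would come from a calibration step (for example, detecting a marker attached to the monitor), after which every desktop-anchored object can be placed consistently in the MR scene.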
Figure 4. State transitions of each node when states B and C are swapped. A, B, C, D, E, F, and G represent different historical states.
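The swap illustrated in Figure 4 can be read as reordering entries in a history sequence and re-deriving the transitions between neighbouring states. The short Python sketch below renders that idea with hypothetical state objects; it is not the tool's actual data structure.

from dataclasses import dataclass, field

@dataclass
class HistoryState:
    name: str                                    # e.g. "B"
    payload: dict = field(default_factory=dict)  # whatever the visualization state stores

def transitions(history):
    # Ordered (from, to) pairs implied by the history sequence.
    return [(a.name, b.name) for a, b in zip(history, history[1:])]

def swap_states(history, i, j):
    # Swap two states in place; the neighbouring transitions change accordingly.
    history[i], history[j] = history[j], history[i]

history = [HistoryState(n) for n in "ABCDEFG"]
print(transitions(history))    # [('A', 'B'), ('B', 'C'), ('C', 'D'), ...]
swap_states(history, 1, 2)     # swap states B and C
print(transitions(history))    # [('A', 'C'), ('C', 'B'), ('B', 'D'), ...]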
Figure 5. Spatial arrangement of HybridFilm components.
Figure 6. Components of HybridFilm: (a) photo; (b) 3D model; (c) film.
Figure 7. Desktop GUI for executing the submit operation: a tag box for user-entered text prompts, a command box for entering commands, and an execute button that performs the operation.
Figure 8. Two ways of performing the switching operation: (a) using the 3D model, and (b) using a “photo”.
Figure 9. Hand gesture interaction in HybridFilm: (a) joints used for interaction gestures; (b) adjustment; (c) selection. The red circles in (a) mark the finger joints at which pinch gestures are recognized. In (b,c), the red arrows without dots indicate the direction of film movement, and the red arrows with dots indicate the direction of hand-ray switching and movement.
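Pinches of the kind highlighted in Figure 9a are commonly recognized by thresholding the distance between the thumb tip and index tip. The sketch below illustrates that generic approach; the 2.5 cm threshold and joint names are assumptions, and this is not MRTK's detection code.

import math

PINCH_THRESHOLD_M = 0.025   # 2.5 cm, an assumed threshold

def distance(p, q):
    # Euclidean distance between two 3D points given as (x, y, z) tuples.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def is_pinching(joints):
    # 'joints' maps joint names to (x, y, z) positions in metres.
    return distance(joints["thumb_tip"], joints["index_tip"]) < PINCH_THRESHOLD_M

frame = {"thumb_tip": (0.010, 0.020, 0.300), "index_tip": (0.015, 0.018, 0.310)}
print(is_pinching(frame))   # True: the fingertips are roughly 1.1 cm apart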
Figure 10. Garbage bin used for deleting objects upon contact.
Figure 11. PyMOL Scenes functional interface.
Figure 12. Procedure of the evaluation experiment. Groups A and B completed the evaluation tasks, questionnaires, and interviews using HybridFilm and PyMOL Scenes, respectively.
Figure 13. Average scores for the usability and utility metrics of HybridFilm. (a) Usability subjects; (b) utility subjects.
Figure 14. The word cloud of participant evaluations, with orange representing positive feedback and blue representing neutral feedback.
Figure 15. Box plot of 7-point Likert scale scores comparing HybridFilm and PyMOL Scenes. Red asterisks indicate significant differences, and red dots represent outliers.
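Group comparisons such as the one summarized in Figure 15 are typically tested with a non-parametric method suited to ordinal Likert data. The sketch below shows a Mann–Whitney U test on invented ratings for two independent groups; it does not reproduce the study's data or analysis.

from scipy.stats import mannwhitneyu

hybridfilm_scores = [6, 7, 6, 5, 7, 6, 6, 7]   # hypothetical ratings from Group A
pymol_scores = [4, 5, 4, 5, 3, 4, 5, 4]        # hypothetical ratings from Group B

# Two-sided Mann-Whitney U test for two independent ordinal samples.
stat, p_value = mannwhitneyu(hybridfilm_scores, pymol_scores, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p_value:.4f}")    # a p-value below 0.05 would be marked as significant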
Table 1. Features of HybridFilm and PyMOL Scenes.

History Tools | Submit | Switch | Preview | Zoom | Move Freely | Interaction
HybridFilm
PyMOL Scenes
Table 2. Rating descriptions for the usability and utility questionnaire of HybridFilm.

Category | Subject | Strongly Disagree | Disagree | Somewhat Disagree | Neutral | Somewhat Agree | Agree | Strongly Agree
Usability | Submit | 1 | 2 | 3 | 4 | 5 | 6 | 7
Usability | Switch | 1 | 2 | 3 | 4 | 5 | 6 | 7
Usability | Preview | 1 | 2 | 3 | 4 | 5 | 6 | 7
Utility | Submit | 1 | 2 | 3 | 4 | 5 | 6 | 7
Utility | Switch | 1 | 2 | 3 | 4 | 5 | 6 | 7
Utility | Preview | 1 | 2 | 3 | 4 | 5 | 6 | 7
Utility | Change | 1 | 2 | 3 | 4 | 5 | 6 | 7
Utility | Evolution | 1 | 2 | 3 | 4 | 5 | 6 | 7
Utility | Sequential | 1 | 2 | 3 | 4 | 5 | 6 | 7
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
