Article

Evaluation of Accessibility Parameters in Education Software That Supports Three-Dimensional Interactions

by Ana Kešelj Dilberović, Krunoslav Žubrinić *, Mario Miličević and Mihaela Kristić
Faculty of Electrical Engineering and Applied Computing, University of Dubrovnik, Ćira Carića 4, 20000 Dubrovnik, Croatia
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(10), 5258; https://doi.org/10.3390/app15105258
Submission received: 28 March 2025 / Revised: 26 April 2025 / Accepted: 5 May 2025 / Published: 8 May 2025
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

This paper analyzes and evaluates different types of interactions in educational software that uses three-dimensional (3D) interactions and visualization technologies, with a focus on accessibility in such environments. To determine the accessibility parameters based on 3D user interactions, a two-phase pilot study was conducted. In the first phase, the GeometryGame research instrument was tested and objective metrics were collected during its use. In the second phase, subjective metrics were collected in the form of participants’ opinions on certain aspects of accessibility and interactions implemented in the GeometryGame. The most important interaction parameters were evaluated, including the size of the objects, the presence of distractions, the intuitiveness of the user interface, and the size and distance of the interaction elements from the virtual hand. The results provide insight into user preferences and highlight the importance of customizing the user interface to ensure effective and accessible 3D interactions. Based on objective measurements and subjective user feedback, recommendations were developed to improve the accessibility of educational software that supports 3D interactions; to increase usability, reliability, and user comfort; and to inform future research in this area.

1. Introduction

Advances in digital technology have led to the widespread use of 3D graphical objects on desktop computers and mobile devices, allowing users to gain a deeper understanding of specific content and topics [1]. One of the most important applications of visualization technologies is in education, where they allow students to grasp complex concepts from domains such as mathematics, physics, biology, and history in an innovative way, through exploration of and interaction with materials such as 3D models, simulations, and interactive displays [2].
Introducing these technologies into the educational process encourages students’ active participation, allows them to visualize abstract concepts, and provides the opportunity to experiment in a safe environment, which greatly contributes to their understanding and engagement in learning. Students with learning difficulties or disabilities face additional challenges, such as a limited ability to sustain attention and, consequently, difficulty focusing on the required task for an extended period. Visualization technologies for learning can capture their interest and hold their attention long enough to ensure effective learning. The use of interactive 3D models, simulations, and immersive technologies can enable them to participate actively and better understand abstract concepts through hands-on and visually engaging approaches, contributing to their cognitive development and motivation to learn [3].
Extended Reality (XR) is often used as an umbrella term for all current and future immersive technologies that combine the real and virtual worlds [4]. According to this view, the International Telecommunication Union (ITU) describes XR as an environment composed of real, virtual, or mixed elements, where the “X” stands for any emerging type of reality, such as augmented, assisted, mixed, virtual, or diminished reality [5].
The two currently most influential XR technologies are virtual reality (VR) and augmented reality (AR). VR is a technology that completely immerses the user in a digital environment and prevents them from seeing the physical world around them [4]. In contrast, AR augments the real world by superimposing virtual elements on it, allowing the user to perceive both at the same time [6]. Depending on the degree of perceived presence, AR can be further subdivided into assisted reality and mixed reality (MR). In assisted reality, virtual content appears clearly artificial and is simply overlaid on the user’s view, whereas in MR, digital elements are seamlessly integrated into the physical space, allowing the user to interact with them as if they were part of the real world [4]. MR enables more complex interactions by allowing virtual objects to respond to real-world stimuli and vice versa [7]. The W3C Note on XR Accessibility User Requirements outlines user accessibility requirements for XR and the associated architecture [7,8].
XR technologies have numerous benefits for the educational process, especially in terms of increasing student engagement and learning efficiency. They enable interactive teaching methods that give students the opportunity to actively participate and thus lead to greater motivation. Empirical studies suggest that immersive experiences contribute to better cognitive processing of information and longer retention [9,10,11,12]. The needs and requirements of users often depend on the context of use [8]. To support this, such technologies could be adapted to different educational contexts and user profiles and thus contribute to an inclusive educational process. In addition to the benefits of XR technologies in education, there are also some limitations. For example, AR displays can be affected by occlusion, i.e., objects such as hands in the field of view that negatively impact the quality of interaction. Furthermore, current AR applications face challenges such as limited display of information due to relatively small smartphone screens that require constant holding and aligning of the device, which limits user interaction and content sharing [13]. In addition, many existing applications, such as holographic pyramids, lack interactivity, which reduces their effectiveness in enhancing the learning experience [14].
One of the most important features of the learning experience is the interaction with the learning software system [15]. Traditional user interfaces (UIs) were designed around the mouse, keyboard, and touchscreen as input devices. Advances in computer technology have led to the development of new interfaces in areas where traditional devices are often inadequate. Visualization technologies in such environments require more intuitive forms of interaction and therefore call for new UIs that support them.
Another important aspect of educational software systems is accessibility. Digital accessibility refers to the extent to which a computer program, website, or device is acceptable and suitable for use by all groups of people, including people with disabilities and older people, leading to a higher degree of their inclusion in society. Despite the rapid development of visualization technology, research into the accessibility and integration of these technologies still lags behind the needs of the market, and solutions need to be found to overcome the barriers that these technologies present to users.
The goals of this research were to investigate and define the basic accessibility parameters by evaluating 3D user interactions in educational software and to formulate recommendations for improving the accessibility of existing and future educational software solutions. The accessibility guidelines defined in this study are specifically tailored to the needs of 3D interaction. They enable the creation of UIs that not only fulfill the basic accessibility requirements but also improve the user experience. Based on previous research findings and the theoretical framework, three hypotheses were formulated and then tested against the results of the study.
The rest of this paper is organized as follows. The second section reviews previous research in the field of 3D user interaction and the accessibility of educational software systems, with a focus on natural interactions. The third section describes the research conducted, focusing on the research instrument used (an educational application) and the metrics used to define the accessibility parameters; it also formulates the research hypotheses. The quantitative and qualitative research results are described in detail in the fourth section, while the fifth section summarizes the findings, highlights the limitations of the research, presents the testing of the previously formulated hypotheses, and gives recommendations for adapting the representation of 3D objects and 3D interactions in educational software systems. Conclusions and plans for future research are presented in the sixth section.

2. Related Work

In this section, we provide an overview of user interactions with digital systems, including natural interactions, and summarize the research findings on defining accessibility parameters with a focus on environments with 3D interactions.

2.1. User Interactions with Digital Learning Systems

A digital learning system is basically a collection of digital tools and resources, such as texts, videos, quizzes, and simulations, that work together to enhance learning. These systems include hardware, software, and digital materials that allow students to access educational content, monitor their progress, and interact with teachers and other students [16]. The ability to interact with such a system is a fundamental requirement for the learning process, regardless of whether it takes place in a traditional or online context [17].
Usability factors have an important influence on users’ attitudes toward e-learning applications and, indirectly, on the results achieved. Previous studies have shown that user satisfaction with an application is influenced not only by the quality of the information but above all by the user’s attitude toward the application and its interface [18]. Users initially explore the various functions of the system to familiarize themselves with its capabilities. Over time, they settle on a stable set of functionalities that best suit their tasks. An increase in the functionality used has a positive effect on the perception of current performance, as well as on objective performance measurements in later phases of use [19]. If the task is clearly defined and users are working with a familiar tool, previous experience can help them solve tasks with new technologies. In practice, however, users’ habits and previous experience can reduce the effectiveness of use if the task is open-ended and the new technology works differently from what the user is used to [20].
User interaction latency in an XR system refers to the time delay between the user’s physical action and the corresponding system response [21]. This latency is an important factor affecting the user experience, especially in terms of perceived realism, immersion, and task performance [13]. High latency can disrupt the sense of presence, cause motion sickness, and reduce task accuracy. This is particularly problematic for applications that require fine motor control or fast reactions. Experiments in non-VR environments suggest that a latency of less than 100 ms has no impact on the user experience. Other research suggests that latency in VR environments should be below 50 ms to feel responsive, with a recommended target of under 20 ms [21].
Estimating the size of objects that users interact with is a particular challenge in XR environments. The accuracy of such estimates can vary considerably depending on the research context, the technological setup, and the specific characteristics of the virtual environment. A consistent finding of most studies dealing with the estimation of object size in VR environments is a tendency to underestimate, which often depends on the actual size of the objects being evaluated. One study reports that the size of virtual objects is underestimated by about 5%, regardless of their position in space [22]. Other research shows that when using VR devices with binocular disparity, such as head-mounted displays (HMDs), users tend to perceive virtual objects as 7.7% to 11.1% smaller than their actual size, regardless of the shape of the object [23]. In AR environments with handheld controllers, the detection threshold for size changes ranged from 3.10% to 5.18% [24]. In contrast, other studies reported even finer perception thresholds—less than 1.13% of the height of the target object and less than 2% of the width. These values are usually considered as the point of subjective equality (PSE), which indicates the deviation at which users have a 50% probability of correctly estimating the size of a virtual object [25].
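As a rough illustration of how a PSE value like those above might be obtained, the following minimal sketch fits a cumulative Gaussian psychometric function to size-judgment responses; the response proportions, size differences, and helper names are illustrative assumptions, not data from the cited studies.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

# Illustrative data: relative size difference of the virtual object (%)
# and the proportion of trials on which it was judged "larger" than the reference.
size_diff = np.array([-10.0, -5.0, -2.0, 0.0, 2.0, 5.0, 10.0])
p_larger = np.array([0.05, 0.15, 0.35, 0.50, 0.70, 0.90, 0.98])

def psychometric(x, pse, sigma):
    """Cumulative Gaussian: probability of judging the virtual object as larger."""
    return norm.cdf(x, loc=pse, scale=sigma)

# Fit the curve; the PSE is the size difference judged "larger" 50% of the time.
(pse, sigma), _ = curve_fit(psychometric, size_diff, p_larger, p0=[0.0, 5.0])
print(f"Estimated PSE: {pse:.2f}% size difference (slope parameter: {sigma:.2f})")
```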
User interaction encompasses the actions and reactions of a user in a digital environment, i.e., the user’s activity when working with a computer or application. Most people are used to interacting with computers through standard 2D interfaces using a mouse and keyboard, with the screen as the output device. However, with the advancement of technology, especially in virtual environments, conventional input is not always practical [26]. This still raises the question of which way of interacting with these virtual systems is the most natural and which devices are best suited for it.
In contrast, 3D interaction is a type of interaction between humans and computers in which users can move and interact freely in a 3D space. The interaction itself involves information processing by both humans and machines, where the physical position of elements within the environment is critical to achieving the desired results. The space in such a system can be defined as a real physical environment, a virtual environment created by computer simulation, or a hybrid combination of both environments. If the real physical space is used for data input, the user interacts with the machine via an input device that recognizes the user’s gestures. If, on the other hand, it is used for data output, the simulated virtual 3D scene is projected into the real world using a suitable output device (hologram, VR glasses, etc.). It is important to note that interactive systems that display 3D graphics do not necessarily require 3D interaction. For example, if a user views a building model on their desktop computer by selecting different views from a classic menu, this is not a form of 3D interaction. If the user clicks on a target object within the same application to navigate to it, then the 2D input is converted directly into a virtual 3D location. This type of interaction is called 3D interaction [27].
Most 3D interactions are more complex than 2D interactions because they require new interface components, which are typically realized with special devices. Such devices offer many opportunities for designing new, experience-based interactions [28]. The main categories of these devices are standard input devices, tracking, control, navigation and gesture interfaces, 3D mice, and brain–computer interfaces.
Two different methods of interaction in the XR environment are direct and indirect. When considering their impact on user experience, engagement, and learning outcomes, both offer different benefits that can complement each other in a learning context. Direct interaction in XR environments mimics real-life activities, increasing physical involvement and potentially boosting intrinsic motivation. Indirect interactions with classic UI elements provide greater precision through an interface, although they usually require more cognitive effort and affect usability and motivation in different ways. A combination of both interaction methods can create a balanced and effective learning environment, as this approach supports hands-on learning in the initial phase and facilitates precision tasks in more advanced phases [29].
Various approaches have been developed to overcome the limitations of mapping 2D inputs to 3D spaces. One of the earliest methods is the triad cursor, which allows the manipulation of objects using a mouse relative to their projected axes [30], while another method uses two cursors to perform translation, rotation, and scaling simultaneously [31]. Alternatives include a handle box [32] and virtual handles [33] for applying transformations to virtual objects, while Shoemake presented a technique that allows object rotation by drawing arcs on a virtual sphere with the mouse [34]. Although these techniques are more than 30 years old, they are still used in the form of widgets in modern tools such as Unity3D. Other tools, such as AutoCAD, 3D Studio Max, and Blender, use orthogonal views instead of widgets for more precise manipulation of 3D objects [35].
Numerous studies have investigated different approaches for the selection and manipulation of objects in virtual 3D spaces. One of the basic techniques is the use of a “virtual hand”, which enables direct object manipulation by mirroring the user’s real hand movements onto a virtual counterpart. This type of interaction is very intuitive for humans [35], while the size of the object and the task goals influence grasping kinematics in 3D interactions. When designing a VR assessment, it is therefore important to consider the size of the virtual object and the goals of the task as factors that influence participants’ performance [36]. The user’s perception of object size in XR environments is important for natural and intuitive interaction, as it influences the feeling of presence and navigation and, indirectly, learning efficiency. Misperception of object size can lead to reduced efficiency and frustration when interacting with virtual content. The point of subjective equality (PSE) is the point at which a user perceives two objects to be the same size, even if their physical sizes differ. In XR environments, the PSE is often used to quantify perceptual errors in size estimation, which helps to optimize the display of objects so that the user experience is as realistic and intuitive as possible [25].
Distracting objects in 3D interactions can affect task performance and decision making. Distraction can slow down behavior and increase costly body movements. Most importantly, it increases the cognitive load required for encoding, slows visual search, and decreases reliance on working memory. While the impact of visual distraction during natural interactions may appear localized, it can still trigger a range of downstream effects [37].
There are various devices, such as wearable sensors, touch screens, and computer vision-based interaction devices, that allow users to interact realistically in 3D [38,39]. Among the sensors that successfully recognize natural hand gestures, the Leap Motion controller [40] stands out. Leap Motion is a small peripheral device that is primarily used to recognize hand gestures and finger positions of the user. The device uses three infrared LEDs and two CCD sensors. According to the manufacturer of Leap Motion, the accuracy of the sensor in detecting the position of the fingertip is 0.01 mm. The latency of the Leap Motion Controller is influenced by various factors, including hardware, software, and display systems. As discussed in Leap Motion’s latency analysis, the latency of the overall system can be reduced to less than 30 milliseconds with specific adjustments and under optimal conditions [41]. In practice, however, latency can vary, and the reported average motion time error is around 40 milliseconds [42]. Due to its high recognition accuracy and fast processing speed, Leap Motion is often used by researchers for gesture-based interactions [27,43].
One of the challenges in implementing hand gesture interaction is the fact that there is no standard, model, or scientifically proven prototype for how the user can interact with a 3D object [26]. Although webcams are one of the most important devices for exploring possible interactions with 3D objects, no solution has yet been found for the optimal estimation of hand positions considering environmental and circuit constraints [44]. In [45], a system for hand tracking in VR and AR using a web camera was presented. In contrast to the cameras built into VR devices, which restrict the positioning of the hand and are uncomfortable for some users, this system allows greater freedom of hand movement and a more natural position. The system achieves a high gesture prediction accuracy of 99% and enables simultaneous interaction of multiple users, which significantly improves collaboration in 3D VR/AR environments. The use of markers as one of the possible solutions for realizing interactions via webcams is presented in [46].

2.2. Natural Interactions

Natural interaction, where humans interact with machines in the same way as in human communication—through movements, gestures, or speech—is one of the possible solutions for intuitive interaction with 3D interfaces [14].
A natural user interface (NUI) allows users to interact with digital systems through intuitive, human-like actions, such as speech, hand gestures, and body movements, similar to the way humans interact with the physical world. This approach moves away from traditional input devices such as the keyboard, mouse, or touchpad [47]. Natural interactions are not necessarily 3D interactions, even though they are often used in 3D interfaces, and the term NUI is often associated with them. Their advantage is that the user can draw on a wider range of basic skills than in conventional GUIs (i.e., UIs that rely mainly on the mouse and keyboard for interaction) [48].
Much of the concept of natural interactions and interfaces is based on Microsoft’s definitions and guidelines. Bill Buxton, a senior researcher at Microsoft, points out that NUI “exploits skills that we have acquired through a lifetime of living in the world, which minimizes the cognitive load and therefore minimizes the distraction” [49]. He also emphasizes that NUI should always be developed with the context of use in mind [49].
Gestures are seen as a technique that can provide more natural and creative ways of interacting with different software solutions. An important reason for this is that hands are the primary choice for gestures compared to other body parts and serve as a natural means of communication between people [50]. In general, it is assumed that a UI becomes natural through the use of gestures as a means of interaction. However, when looking at multi-touch gestures, e.g., on the Apple iPad [51], which are generally considered an example of natural interaction, it becomes clear that the reality is somewhat more complex. Some gestures on the iPad are natural and intuitive, such as swiping left or right with one finger on the screen, which turns pages or moves content from one side of the screen to the other, simulating the analog world. Other gestures, however, need to be learned, such as swiping left or right with four fingers to switch between applications. Such gestures are not intuitive and require additional learning, as it is not obvious how the interaction should be performed; the user must understand the relationship between the gesture and the action it triggers [49].
By interaction type, NUIs can be divided into four groups: multi-touch (hand gestures on a touch screen), voice (speech), gaze (visual interaction), and gestures (body movements) [52]. A newer group of interfaces based on electrical biosignal sensing technologies can be added to this classification [53].
The emergence of multi-touch displays represents a promising form of natural interaction that offers a wide range of degrees of freedom beyond the expressive capabilities of a standard mouse. Unlike traditional 2D interaction models, touch-based NUIs go beyond flat surfaces by incorporating depth, enhancing immersion, and simulating the behavior of real 3D objects. This allows users to interact and navigate in fully spatial, multi-dimensional environments [54]. Touchscreen interaction is the most widely used form of natural interaction, as it is available on the most common devices, such as smartphones and tablets.
Voice user interfaces (VUIs) have experienced significant growth due to their ability to enable natural, hands-free interactions. They are usually supported by machine learning models for automatic speech recognition. However, they still face major challenges when it comes to accommodating the enormous diversity and complexity of human languages [55]. A major problem is the limited support for numerous global languages and dialects, many of which are underrepresented in existing language datasets. This underrepresentation can lead to misinterpretation and a lack of inclusivity in VUI applications [56]. In addition, nuances such as context, ambiguity, intonation, and cultural differences pose difficulties for current natural language processing systems. Addressing these issues is critical to the broader adoption and effectiveness of VUIs, especially in multilingual and multicultural environments [57]. Recent advances in VUI technology include emotion recognition, where the system can recognize and respond to the user’s emotional tone of voice, and multilingual support, which enables interactions in multiple languages [58]. In two-way voice communication, natural language processing enables the system to understand the meaning of the spoken words, while the speech synthesis system generates responses in the form of human-like speech. Modern VUIs often integrate systems to handle complex conversations with multi-round interactions [59].
Gaze-tracking UIs use eye movements, in particular the direction of the user’s gaze, to control or manipulate digital environments. They interpret eye movements and translate them into commands or actions on a screen or in a virtual environment [60]. As hands-free interfaces, they are particularly beneficial for assistive technologies, XR, and public installations where traditional input methods can be impractical [52]. To work as accurately as possible, the device must be individually calibrated for each user to calculate the gaze vector [61]. The two main types of gaze-based interaction are gaze pointing and gaze gestures. Gaze pointing uses the user’s gaze as a pointer and allows the user to select or interact with objects on a screen without physically touching the device. The UI responds to the location on the screen or in space that the user is looking at, often with a cursor or focus indicator [62]. Gaze gestures recognize and interpret complex eye movement patterns, such as blinking or switching gaze between targets, as commands, enabling a wider range of gestures and interactions [63,64]. Gaze-recognizing systems can adapt the content or interface depending on where the user is looking [65]. Advances in this area aim to improve accuracy and to overcome interaction errors such as the “Midas touch”, which occurs when the system’s response to a user action does not match the outcome the user expected [68]. Recent multimodal approaches that combine gaze with speech or gestures aim to mitigate such issues and improve the user experience [66], or combine gaze with technologies that measure brain activity to create a brain–computer interface [67]. New findings in deep learning have also influenced gaze-tracking technologies, and a growing number of authors apply machine and deep learning in this field. Numerous authors have created and used their own annotated datasets, some of which are publicly available, such as the OpenEDS dataset [69] or Gaze-in-Wild [70]. Among the approaches used, convolutional neural networks (CNNs) have shown the best results in eye tracking and segmentation [71,72].
Gesture-based NUIs allow users to interact with devices by using certain gestures, such as hand or body movements or facial expressions, as input commands. These technologies rely on users’ body movements to interact with virtual objects and perform tasks, and they rely heavily on the ergonomics associated with various interactions [73]. One form is hand gesture recognition, where systems use cameras and different sensors (e.g., Leap Motion or Kinect) to capture hand movements and interpret them as commands for tasks such as selecting objects or navigating menus [44,74]. Wearables such as data gloves, video cameras, and sensors such as infrared or radar sensors [75] are also used for gesture recognition. Wearables provide more accurate and reliable results and can give haptic feedback, which gives them an advantage over visual systems. In addition, there is a third category for systems that combine both approaches. Nowadays, sensors and cameras are preferred for hand tracking as they offer greater freedom of movement [76]. Another form of gesture-based NUIs is whole-body gesture recognition, which captures body movements for more immersive interactions, as often seen in VR or AR applications [77,78]. Similar to hand gesture recognition, cameras and depth sensors are often used to recognize full-body gestures. Sensors are also used in the form of suits that track body movements via various sensors such as accelerometers, gyroscopes, and magnetometers that detect limb movements, posture, and orientation [44]. On a larger scale, ultrasonic or radar sensors are used to recognize the position and movement of the body in space [75], as well as computer vision algorithms that enable the recognition of whole-body gestures through video analysis [79,80]. Facial gesture recognition focuses on recognizing facial expressions as input so that devices can respond to emotions or gaze, which is common in accessibility technologies [81]. Finally, multimodal gestures combine different types of inputs, such as hand, voice, and gaze inputs, to create a more intuitive and efficient interaction experience [82].
Technologies for recording electrical biosignals, such as brain–computer interfaces (BCIs) based on electroencephalography (EEG) and electromyography (EMG), are often referred to under the collective term ExG [53]. With BCIs, it is possible to send commands to a computer using only the power of thought. A standard non-invasive BCI device that does not require brain implants is usually available in the form of an EEG headset or a VR headset with EEG sensors. Device design has evolved, so that today smaller devices are produced in the form of EEG headbands that measure brain activity and other data such as blood flow in the brain, while the mapping of the collected signals to the user’s gestures is complex and relies heavily on a learned model [83]. EMG signals are used for various purposes, independently or in combination with other sensors, for example, for interpreting the user’s intentions when interacting with virtual objects [84], for device control [85,86], for analyzing muscle mass during rehabilitation [87], or for speech recognition [88,89]. Electrical interfaces for capturing biosignals face a number of problems, such as the difficulty of reliably detecting biological signals and precisely localizing the underlying activity. Biological signals are variable, and errors in detection and classification increase the error rate, resulting in a lower speed of information processing compared to classical interaction methods. An additional problem is the so-called “Midas touch” effect, already mentioned for gaze-tracking UIs. In addition to the technical challenges, there are also ethical concerns; for example, the EEG data used during processing can reveal personal information about the user [53]. The use of such interfaces in mobile and XR applications is limited by performance accuracy and stability between sessions [83], but they are essential in certain areas such as rehabilitation, prosthetics, and assistive technologies, especially for people with severe disabilities for whom such interfaces are the only possible form of interaction with systems [53].
Achieving naturalness in every context of use and for all users is a challenge. While gestures, speech, and touch play an important role in many NUIs, they only feel truly natural to users if they match their abilities and specific usage scenarios as well as the context of the application and gestures. Given the technology available at the beginning of the 21st century, it is almost impossible to create a 3D UI that feels natural to all users. Instead of trying to make NUIs universal, one should focus on tailoring each of these interfaces to specific users and contexts [49].

2.3. Accessibility Parameters

UIs for gesture-based interaction should be designed according to established principles to ensure an intuitive and effective user experience. The system must be able to determine exactly when gesture recognition should begin and end to ensure that unintended movements are not misinterpreted. The order in which actions are performed is also crucial. Designers must clearly define each interaction step required to complete a process. In addition, the UI should be context-aware and provide immediate feedback to the user to confirm a successful action [38], while individual spatial abilities should be considered, as these abilities significantly affect the user’s performance [90]. In a global and culturally diverse environment, it is important to consider cultural differences, as they can influence the meaning, pace, and execution style of gestures [8].
Universal design aims to ensure that interactive systems can be used by as many users as possible, regardless of their abilities or experience [91]. In this context, in 3D environments, usability parameters such as object size, UI responsiveness, and intuitiveness of interaction must be adaptable to different physical, cognitive, and perceptual needs. A flexible design that allows users to customize interaction modes or change spatial layouts improves accessibility and supports user autonomy. This inclusive approach not only improves the overall user experience but also promotes equity in accessing immersive 3D technologies.
The most neglected accessibility principles in serious games are “operability” and “robustness”, where the principle of operability concerns user control and interaction, while robustness can be improved through assistive technologies. To improve accessibility, it is recommended to include features such as automatic transcriptions, sign language, photosensitivity control, external VR devices, and contextual help. It is emphasized that developers of serious games must make considerable efforts to improve accessibility [92].
As the field of accessibility matures, there are many studies that focus on specific subfields of accessibility research (e.g., people with autism or visual impairment) or on accessibility in different types of applications (e.g., websites, mobile applications, and VR games) [93]. Most existing accessibility guidelines for applications are based on the W3C Web Content Accessibility Guidelines [94], the W3C Mobile Accessibility Guidelines [95], and the Guide to Applying WCAG 2 to Non-Web-based Information and Communication Technologies [96]. They offer a series of recommendations to make digital content more accessible for all users, based on four basic principles: Perceivable, Operable, Understandable, and Robust (POUR). Applying these principles in development ensures that all users, including those with different types of impairments, can access, navigate, and understand the content. Certain industries, such as the gaming industry, have recently invested considerable resources in improving the accessibility of their products.
Previous research on accessibility in AR/VR environments has shown that these technologies present significant barriers to accessibility, which can lead to problems with the use of their features or to the exclusion of certain user groups, particularly people with certain forms of impairment. Issues such as limited evaluation of the effectiveness of solutions, lack of standardized testing methods, and technical barriers such as limited device resources and the need for environmental monitoring were identified. Successful systems emphasize UI adaptability and user involvement in the design process to ensure accessibility [97]. These findings could be used to identify key areas where further work is needed to develop immersive platforms that are more accessible to people with certain forms of disabilities [98].
When designing immersive applications for accessibility, there are some important principles to consider. Immersive applications need to provide redundant output options such as audio descriptions or subtitles to ensure that users can interact with the content in a way that best suits their abilities. They should also support redundant input methods and allow the user to interact with the environment by either changing the orientation of the head or using a controller. These applications need to be compatible with a wide range of assistive technologies so that users can use their own way of interacting between the physical world and the virtual space [99]. In addition, customization options for content presentation and input modalities should allow users to tailor the experience to their specific needs [100]. Applications should also offer direct assistance or facilitate support when the user encounters difficulties, through dynamic customization of the UI, intelligent agents, or the possibility of external support from friends or caregivers [99]. Finally, implementing the principles of inclusive design will ensure that text content and controls are accessible to a broad population (e.g., by displaying text in an appropriate size and color) [101].
The 3D interactions can be defined by three aspects: usability, reliability, and comfort [102]. Usability describes how easy the interactions are to learn, understand, and remember; a highly usable interaction is simple compared to other interactions. Reliability refers to the likelihood that the tracking device will interpret the interactions correctly; interactions with high reliability are expected to be recognized most of the time. Comfort describes the physical effort and discomfort involved in performing gestures; interactions with high comfort can be performed with ease and minimal effort. To make their devices more usable, manufacturers of devices that offer 3D interaction claim to study human behavior and apply evidence-based design practices to identify the gestures, interactions, and haptic sensations that work best for their users [103]. They apply this knowledge in the design of their hardware and software, which programmers can use to control their devices.
The move from 2D accessibility to accessibility in 3D environments is a major shift. Users are confronted with new types of interface components and interaction styles that may seem strange, unintuitive, or unnatural. In addition to the challenge of adapting to the new elements, further complexity arises because accessibility principles from the 2D world need to be merged with those of real-world interaction. While expertise in 2D interactions is fundamental to translating accessibility to 3D, a perfect 1:1 conversion is often not possible.
As far as the authors are aware, there is no comprehensive research in the literature on the accessibility parameters of 3D user interactions in educational software systems that include 3D visualization.

3. Research Design and Instruments

At the beginning of this section, we outline the research and put forward the hypotheses to be tested. In the following subsections, we describe the research instrument and the metrics we used to identify the key parameters that influence the accessibility of 3D user interactions.
To create better XR interfaces, it is important to recognize the influence of certain accessibility parameters on task performance and user experience. This research is a pilot study that aims to collect initial data to identify the key parameters that influence the accessibility of 3D user interactions in educational software solutions based on 3D visualization technologies.
Based on previous research findings and theoretical foundations [20,35,36,37,42], we formulate three hypotheses that we evaluate using the results of the analysis of 3D user interactions in educational software:
H1: In precision-oriented XR applications, smaller 3D objects relative to the virtual hand will improve task accuracy and user satisfaction.
H2: Visual distractions reduce task performance and user satisfaction.
H3: Indirect interaction methods lead to better task efficiency, accuracy, and satisfaction than direct interaction.
By empirically evaluating these hypotheses, this study contributes to improving the accessibility of XR interfaces. The defined guidelines will be specifically tailored to the needs of 3D interaction and will enable the creation of UIs that not only fulfill the basic accessibility requirements but also improve the user experience.
The study consisted of two phases. In the first phase, a research instrument called GeometryGame was used to collect quantitative data. The second phase collected qualitative data related to participants’ attitudes toward certain aspects of accessibility and interaction implemented in the GeometryGame. Figure 1 illustrates the experimental environment used in the first phase of the study.
The quantitative data provide a basis for analyzing the effectiveness and accessibility during the interaction with the research tool, while the qualitative data provide information about the users’ satisfaction with the achieved precision, the ease of interaction, and the general subjective impressions during the interaction (e.g., the users’ frustration).

3.1. Description of Study

In 2023, a pilot study was conducted at the University of Dubrovnik and at the Faculty of Electrical Engineering and Computing of the University of Zagreb, in which participants took part individually under the supervision of an examiner. Before the study began, each participant was verbally informed about the context, the research procedure, and the purpose and objectives of the study. Each participant gave informed consent before taking part in the user tests.
The hardware used for the test consisted of an Acer Nitro AN515-46 laptop with an AMD Ryzen 7 6800H 3.20 GHz processor, an NVIDIA GeForce RTX 3050 graphics card, and 16 GB of RAM, together with a first-generation Leap Motion device.
The Leap Motion Controller was chosen as the primary input device for this study as it offers a combination of precision, accessibility, and suitability for natural 3D interaction. Its tracking accuracy allows for precise detection of hand movements, which is important in experiments evaluating the speed of task execution and fine motor control. The controller requires no additional external cameras and allows for natural interaction with the bare hand. In terms of practical application, it is available, easy to set up, compact, and portable, which was important for us as the research took place in various remote locations. The device easily connects to the most common XR development environments via its official SDK, making it easy to create applications and develop interactions [102].
Due to these characteristics, the Leap Motion Controller has attracted considerable attention in the field of hand gesture recognition from researchers in various fields. All these features make the device particularly suitable for use in NUI research in 3D environments [104].
Each participant had 15 min to complete the first phase of the study. Due to the nature of the study, the specifics of the hardware used, the duration of the study per participant, and the desire for each participant to use the application under identical conditions, the number of participants was limited; ultimately, 53 people took part in the study.
Since the time required to complete the tasks is an important aspect in evaluating the usability and efficiency of the UI, a statistical analysis of the time required to complete the tasks in the first part of the study was performed. The measures used in the analysis were the mean (M), median (C), standard deviation (σ), minimum (min) and maximum (max) values, range (R), lower (Q1) and upper (Q3) quartiles, interquartile range (IQR), coefficient of skewness (S), and kurtosis (K). The unit of measurement for all results was seconds.
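As a minimal sketch of how these measures could be computed for one sublevel, assuming the completion times are available as a NumPy array, the snippet below uses NumPy and SciPy; the sample values are purely illustrative.

```python
import numpy as np
from scipy.stats import skew, kurtosis

# Illustrative completion times (in seconds) for one sublevel.
times = np.array([28.4, 35.1, 52.9, 41.0, 30.2, 66.7, 25.9, 48.3])

q1, median, q3 = np.percentile(times, [25, 50, 75])
measures = {
    "M": times.mean(),               # mean
    "C": median,                     # median
    "sigma": times.std(ddof=1),      # sample standard deviation
    "min": times.min(),
    "max": times.max(),
    "R": times.max() - times.min(),  # range
    "Q1": q1,
    "Q3": q3,
    "IQR": q3 - q1,                  # interquartile range
    "S": skew(times),                # coefficient of skewness
    "K": kurtosis(times),            # kurtosis (Fisher/excess definition)
}
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```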
In the second part of the research, participants completed a survey created in Google Forms. This part of the research was not time-limited. The survey statements evaluate the participants’ subjective experience of interacting with 3D objects and focus on the experience of ease, precision, intuitiveness, and frustration when handling the objects.

3.2. Research Tool

GeometryGame is a prototype of educational software for teaching geometric solids in the form of a 3D game, in which four 3D interactions (grab-and-release, pinch, swipe, and 3D button press with a virtual finger) are implemented. The UI of the game is in Croatian.
The development of this tool is based on concepts and guidelines previously defined and implemented in collaboration with a student of Applied/Business Computing at the University of Dubrovnik as part of her master’s thesis [105]. The game was developed using the cross-platform game development framework Unity, version 2020.2.6f1. A first-generation Leap Motion device (Ultraleap, San Francisco, CA, USA) was chosen as the input device for implementing the 3D user interactions.
The game was designed to focus on the user’s interactions, and users could always see their virtual hands on the screen. The relationships between distance and size were defined precisely with respect to these virtual hands. The tasks in the game were simple and required no prior knowledge on the part of the participants. The interactions are characterized by three aspects, usability, reliability, and comfort, described in the documentation of the manufacturer of Leap Motion [102]. The aspects are rated with the qualitative descriptors “high”, “medium”, and “low”, and for this tool we chose highly and moderately usable and comfortable interactions.
The game consists of four levels divided into sublevels, and the types of interactions used on each level are listed in Table 1.
The definition of the accessibility parameters focuses on a detailed analysis of four elements:
1. The size of the object in relation to the user’s virtual hand, analyzed by the grab-and-release interaction;
2. The presence of distracting 3D elements, using the pinch interaction with both hands;
3. The intuitiveness of the use of UI elements, analyzed by the swipe interaction;
4. The size of UI elements and their distance from the user’s virtual hand, analyzed by pressing 3D buttons with a virtual finger.
These interactions allow for easier and more natural handling of objects in the virtual space, as they are as similar as possible to the user’s natural movements. This ensures that interactions are accessible and simple for all users, promoting universality and inclusivity in the digital environment. Adapting the accessibility guidelines to the specific requirements of 3D interactions should yield interfaces that not only fulfill the basic accessibility requirements but also improve the user experience. The integration of graphical interface elements, such as sliders to control rotation, was intended to enable precise manipulation of objects without causing unnecessary effort or frustration for the user. This approach not only sought to meet existing accessibility standards but also to extend them by introducing new parameters relevant to 3D interactions.

3.2.1. Level 1: Retrieval of Object

The goal of the interactions on the first level, shown in Figure 2, is to place three geometric solids (sphere, cube, and pyramid) in the correct boxes. Participants must perform a certain interaction to progress in the game, regardless of the result. In situations where they cannot perform the required interaction, the alternative option is to skip the sublevel.
At this level, we want to investigate the optimal size of the object in relation to the user’s virtual hand and evaluate the user’s interaction under conditions of reduced visual contrast. The distance between the object and the virtual hand is defined by the Unity Transform component, which determines the position, rotation, and scaling of objects in 3D space. All calculations related to the size of the virtual hand and the object are expressed as ratios. This level is divided into four sublevels (a minimal sketch of the scale ratios follows the list):
  • Basic—where objects are the same size as the virtual hand (Figure 2a);
  • Shrunken—where objects are 50% smaller than the virtual hand (Figure 2b);
  • Enlarged—where objects are 100% larger than the virtual hand (Figure 2c);
  • Reduced contrast—sublevel with reduced contrast between the background color and the color of the 3D objects, while the ratio of the virtual hand size to object size is the same as in the basic sublevel (Figure 2d).
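The sublevel ratios above can be expressed compactly as multiples of the virtual hand’s scale; the following minimal sketch only illustrates these ratios, and the constant names do not correspond to the actual Unity implementation.

```python
# Object scale expressed as a ratio of the virtual-hand scale, mirroring the
# sublevels above (the reduced-contrast sublevel reuses the basic ratio).
HAND_SCALE = 1.0
SUBLEVEL_RATIOS = {
    "basic": 1.0,              # same size as the virtual hand
    "shrunken": 0.5,           # 50% smaller than the virtual hand
    "enlarged": 2.0,           # 100% larger than the virtual hand
    "reduced_contrast": 1.0,   # same ratio as basic, lower background contrast
}

def object_scale(sublevel: str) -> float:
    """Return the object scale relative to the virtual hand for a given sublevel."""
    return HAND_SCALE * SUBLEVEL_RATIOS[sublevel]

print(object_scale("shrunken"))  # 0.5
```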

3.2.2. Level 2: Resizing the Object

The second level, shown in Figure 3, implements interactions that involve grabbing a ball and enlarging it with a pinch gesture. The pinch gesture requires participants to bring their index fingertip and thumb together while keeping their other fingers open.
This level was divided into two sublevels, differentiated by the use of distractors in the form of other objects. The aim of this approach was to relate the results obtained to the level of complexity of the interaction and to assess how distracting elements affect performance.
The goal of the first sublevel (without distractors) was for the subject to reach for the ball and enlarge it to approximately the same size as the large pink ball.
The second sublevel is more challenging as the subject had to select a specific ball described in text on the screen (e.g., green ball) and enlarge it to approximately the same size as the pink ball. In this scenario, distractors were placed to investigate how such elements affect the subject’s performance during the interaction.
Given the shape of the object being resized, resizing was limited to proportional scaling. The original aspect ratio was retained, and the assessment was based solely on the overall scale factor in relation to the target object. To minimize the influence of the inaccuracy of the sensor and the output device used, we opted for a relatively safe threshold value of ±5% of the given object size. This value was chosen based on findings from previous research on UIs, where such tolerances were considered sufficiently accurate and at the same time realistically achievable for users, balancing the natural variability of human interaction with the need for precision.
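The acceptance check can be sketched as follows, under the assumption that the ±5% tolerance is applied relative to the target scale factor; the function name and example values are illustrative, not taken from the GeometryGame implementation.

```python
def resize_within_tolerance(scale_factor: float,
                            target_factor: float,
                            tolerance: float = 0.05) -> bool:
    """Accept the pinch-resize if the achieved overall scale factor lies
    within +/- 5% (relative tolerance) of the target scale factor."""
    return abs(scale_factor - target_factor) <= tolerance * target_factor

# Example: the selected ball must be enlarged to 3.0x its original size.
print(resize_within_tolerance(2.9, 3.0))   # True  (within the 5% tolerance)
print(resize_within_tolerance(2.7, 3.0))   # False (outside the 5% tolerance)
```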

3.2.3. Level 3: Rotating the Object

The third level shown in Figure 4 is divided into two sublevels to further enrich the user experience and explore whether users prefer direct interaction with a 3D object or indirect interaction via a UI element slider.
In the direct interaction (Figure 4a), the participants had the task of picking up a die, holding it in one hand, and rotating it with hand movements. A grab-and-release interaction was implemented, which, together with rotation using a swipe motion, aimed to encourage exploration of the die and to convey a 3D spatial perception to the participants as they searched for a specific number.
In the indirect interaction (Figure 4b), the participants used the standard UI slider element. In this scenario, the participants must turn the die using the slider until it reaches a position in which the given number is clearly visible.

3.2.4. Level 4: Pressing the Button Using Virtual Finger

The fourth level shown in Figure 5 is designed like a trivia quiz. The questions were simple and required no prior knowledge from the participants. Interaction at this level is performed by tapping the buttons at different positions with a virtual finger. This structure allows for a consistent quiz experience, exploring how variations in object size and button spacing can affect user interaction.
Each sublevel retains the same concept but varies the size of the buttons and the distance of the buttons from the user’s virtual hand. The first two sublevels differ in the position of the buttons that the test subject must press: horizontal (Figure 5a) and vertical (Figure 5b). In the last sublevel, a distracting background is implemented in addition to the buttons in the vertical position (Figure 5c).
On the sublevel with horizontal alignment, the buttons have a size of 0.24 × 0.17 and are arranged at a spatial distance of −10 from the virtual hand. On the sublevel with vertical alignment, the button size is reduced to 0.14 × 0.13, and the spatial distance between the buttons and the virtual hand is increased to −15. The sublevel with the distracting background uses the smallest buttons (0.12 × 0.17), placed closer to the participant (−5).
The size of the buttons was chosen by experimenting with different sizes and applying Fitts’s law [105]. Fitts’s law is a predictive model for human movement used in human–computer interaction. This law states that the time required to move quickly to a target area is a function of the ratio between the distance to the target and the width of the target [106]. It is often used in UI design to optimize the size and position of elements to improve the speed and accuracy of user interactions.
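For reference, the commonly used Shannon formulation of Fitts’s law (a standard form of the law, not quoted from the cited sources) predicts the movement time MT as

\[ MT = a + b \log_2\!\left(\frac{D}{W} + 1\right), \]

where D is the distance to the target, W is the target width, and a and b are empirically determined constants. Larger buttons (greater W) and shorter distances (smaller D) lower the index of difficulty \( \log_2(D/W + 1) \) and therefore the predicted movement time, which is consistent with varying the button size and distance between the sublevels.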

3.3. Design of the Survey Questionnaire

In addition to the demographic questions (age, gender, education, and occupation), the questionnaire contained repeated statements for each level of GeometryGame. These statements assess the participants’ subjective experience of interacting with objects and focus on the experience of ease, precision, intuitiveness, and frustration when handling the objects. Participants rated their attitude to the statements on a Likert scale from 1 to 5, with 1 indicating complete disagreement and 5 indicating complete agreement. User satisfaction with the interactions is defined in this context as the degree to which the user’s needs, expectations, and preferences are met by the interaction. The statements for the first level were as follows:
  • It was very easy for me to reach the 3D object.
  • I think precision is very important when reaching the 3D object.
  • The interaction with the 3D object was intuitive for me.
  • While reaching the 3D object, I was frustrated.
  • I am satisfied with the precision I achieved while reaching the 3D object.
  • I had no difficulty reaching the 3D object.
The statements differed only slightly in terms of the levels and tasks performed by the participants, with the basic structure of the statements remaining constant and the only change being the type of interaction with the object.

3.4. Metrics Used for Evaluation of Results

To objectively analyze the data collected during the users’ interaction with GeometryGame, the relevant information was systematically recorded and stored in a database. Table 2 shows the scheme used to quantify the results; for each level, it lists the activities and the corresponding result types stored in the database.
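To make the recorded result types concrete, the following minimal sketch shows the kind of per-activity record such a database could hold, based on the parameters analyzed in Section 4.2 (completion time, use of the object-return option, accuracy, and skipped sublevels); the field names and schema are illustrative assumptions, not the actual GeometryGame database structure.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class InteractionRecord:
    """One logged activity from a GeometryGame session (illustrative schema)."""
    participant_id: int
    level: int                      # 1-4
    sublevel: str                   # e.g., "basic", "shrunken"
    interaction: str                # e.g., "grab-and-release", "pinch"
    duration_s: float               # time to complete the activity, in seconds
    used_return_option: bool        # object-return option (not available on level 4)
    skipped: bool                   # whether the participant skipped the sublevel
    correct: Optional[bool] = None  # accuracy, recorded on levels 1 and 4 only

# Example record for a level-1 activity.
record = InteractionRecord(1, 1, "shrunken", "grab-and-release",
                           52.9, False, False, True)
print(record)
```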

4. Results

In this section, we describe the results of the pilot study. First, we describe the demographic data of the participants. In the following subsection, we present and analyze the quantitative data collected with GeometryGame. The qualitative data from the questionnaire are presented and analyzed in the third subsection.

4.1. Demographic Data

A total of 53 participants took part in this research. Of these, 32.08% (17 participants) were between 18 and 25 years old, 26.42% (14 participants) were between 25 and 35 years old, 22.64% (12 participants) were between 35 and 45 years old, 9.43% (5 participants) were between 45 and 55 years old, and 9.43% (5 participants) were over 55 years old.
By gender, 47.20% of participants were female, while 52.80% were male. Other options, such as “other” and “do not want to specify”, were not selected by any of the participants.
Since most participants were students, it was expected that a relatively high proportion would hold a secondary school degree, namely, 35.80%. A total of 5.70% had a bachelor’s degree, while 39.60% of the participants had a master’s degree. A total of 1.90% of participants had completed postgraduate professional studies. A significant proportion of participants, 17%, had a doctorate.
In terms of profession, 77.40% of participants were from technical sciences, while 9.40% were interdisciplinary. Social sciences accounted for 7.50% of participants, natural sciences accounted for 3.8%, and biomedicine and health accounted for 1.90% of the total number of participants.

4.2. Quantitative Data Analysis

Regarding the quantitative analysis, the data collected during the testing of GeometryGame were exported from the database and statistically analyzed. The quantitative data analysis is based on the interaction parameters defined in the metrics for evaluating the results, described in Section 3.4.
The first metric used in the analysis at all levels was the time taken to perform the activities of the level. The indicators used in this analysis were the mean (M), median (C), standard deviation (σ), minimum (min) and maximum value (max), range (R), lower (Q1) and upper (Q3) quartiles, interquartile range (IQR), coefficient of skewness (S), and kurtosis (K).
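Purely as an illustration of how these descriptive indicators can be computed from recorded completion times, the following sketch uses pandas and SciPy; the sample values are invented and do not correspond to the study data.

```python
import pandas as pd
from scipy import stats

# Hypothetical completion times in seconds (not the study data)
times = pd.Series([28.7, 35.2, 41.0, 52.9, 30.1, 47.5, 39.8, 60.3])

q1, q3 = times.quantile(0.25), times.quantile(0.75)
summary = {
    "M (mean)": times.mean(),
    "C (median)": times.median(),
    "sigma (std)": times.std(),             # sample standard deviation
    "min": times.min(),
    "max": times.max(),
    "R (range)": times.max() - times.min(),
    "Q1": q1,
    "Q3": q3,
    "IQR": q3 - q1,
    "S (skewness)": stats.skew(times),
    "K (kurtosis)": stats.kurtosis(times),  # excess kurtosis
}
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```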
The second parameter in the analysis was the use of the object-return option in the tasks. It was available at all levels except the fourth (quiz) level, where participants did not have this option because each question could be answered only once.
The third parameter used in the analysis is the accuracy of the participants in performing the tasks. The data for this parameter were collected at the first and fourth levels.
At all sublevels, participants were offered the option to skip the task, and this was the fourth parameter used in the quantitative data analysis.

4.2.1. Execution Time

Table 3 shows the statistical analysis of the time required to complete the tasks at the first level. The unit of measurement for all results listed in the table is seconds.
At the first level, users must pick a specific object and place it in the correct box by grabbing and releasing it. When analyzing the data of the first metric for all sublevels of the first level, significant differences were found in the time spent on the tasks. The mean and median vary between the sublevels, indicating that the tasks were of different difficulty or complexity for the participants. Since the same interaction was implemented in all sublevels, the size of the 3D object that the participants had to reach had a significant influence on the difficulty of the tasks. Participants completed the tasks fastest in the reduced-contrast sublevel, where the mean completion time was 28.72 s, while they completed the tasks slowest in the sublevel with a shrunken object, with a mean time of 52.90 s.
At the basic sublevel, greater variability was observed in the time spent on the tasks, indicating that the tasks were more challenging or complex for the participants. This variability is reflected in the range of the data, which is 142.74. In the sublevel with enlarged objects, the mean and median are larger, but the range and standard deviation show less variability compared to the first sublevel. This could indicate that participants moved more consistently when handling larger objects.
The sublevel with the shrunken objects shows the highest variability in task completion time, suggesting that it was challenging for participants to reach small 3D objects. Interestingly, participants completed the tasks in the reduced-contrast sublevel almost 1.5 times faster than in the basic sublevel. This difference in speed occurred even though the 3D shapes or geometric solids were the same size in both sublevels, and the only difference was the implementation of low contrast. This result suggests that it is important to first familiarize users with the interactions they will use in the application. Familiarity with the interactions can provide them with the necessary confidence and understanding of the interactions, which can help them overcome challenges more easily, even in worse conditions. Therefore, it is important to consider the learning process of the participants in addition to the design aspects to ensure an optimal user experience.
The distribution of time within the individual sublevels shows variability, with the data clustering relatively closely around the central values. This suggests that most participants had similar experiences with the speed of task completion within each sublevel. However, there are also individuals who need significantly more time for the same tasks, which could be due to different abilities or personal preferences of the participants.
On the second level of the game, the user must select a geometric solid and enlarge it with both hands using the pinch gesture. The results of the statistical analysis of this level by sublevels can be found in Table 4.
The execution time of the tasks at the sublevel with distractors was significantly higher, indicating an increased complexity of the tasks due to the presence of distracting 3D elements. The standard deviation is also higher on the sublevel with distractors (37.98 s) than on the sublevel without distractors (29.59 s), indicating greater variability in execution times between participants. This could be due to the different strategies participants used to overcome the challenges posed by the distractors. The minimum and maximum times confirm this variability and indicate a wide range of participant experience. The IQR was significantly higher on the sublevel with distractors (44.82 s) than on the sublevel without distractors (23.86 s), providing further evidence that participants approached and completed the tasks with greater variability. The skewness and kurtosis coefficients are higher on the first sublevel, indicating the presence of more extreme values.
Overall, these results show that the distractors had a significant impact on the participants’ experience by increasing the time required to complete the tasks and leading to greater variability in performance. This suggests that the presence of 3D distractors increases complexity and challenge, causing participants to focus and exert more effort to successfully complete the tasks.
The third level of the game focuses on the manipulation of 3D objects and challenges participants to reach and rotate a virtual die. The results of the statistical analysis at this level can be found in Table 5.
On the first sublevel, the participants use direct interaction with a 3D object (cube) to find the specified number. On the second sublevel, an indirect interaction with a slider is used to rotate the cube and find the corresponding number. When analyzing the statistical data for the sublevels, it is noticeable that the performance of the participants is different, which is partly due to the differences in the interactions implemented in each sublevel.
In the direct interaction, participants had to manage a combination of interactions, including grabbing and releasing as well as rotating a 3D object, resulting in a mean execution time of 17.83 s. The complexity of this combination of interactions explains why participants needed more time for this task. In contrast, the indirect interaction used a simplified approach with only one type of interaction, a hand movement to one side operating a graphical slider that rotates the 3D object. This made the task easier for the participants, which is reflected in a significantly shorter mean execution time of 7.83 s. The median, standard deviation, and minimum and maximum execution times confirm this conclusion and indicate greater consistency in task execution time between participants in the indirect interaction. The lower variability in this sublevel suggests that the simplified approach with only one type of interaction allowed participants to solve the task more easily and quickly. The direct interaction has a higher IQR, indicating a wider spread of completion times, while the lower IQR in the indirect-interaction sublevel indicates more consistent performance. Skewness and kurtosis also differ between the sublevels.
The fourth level of the game was designed to test participants’ interaction with a virtual 3D surface in different contexts. Each sublevel of this level retains the basic concept of a quiz, but varies in the dimensions and placement of the interactive graphical elements, while the third sublevel adds distracting elements. Table 6 shows the results of the statistical analysis at this level.
On the sublevel with horizontal orientation, the buttons were 0.24 × 0.17 in size and arranged at a spatial distance of −10. The results show a mean task time of 7.06 s with a standard deviation of 2.99 s, which indicates that the participants became accustomed to the task relatively quickly. The median of 5.98 s and a relatively small IQR of 4.13 s indicate consistent performance. On the sublevel with vertical orientation, the button size is reduced to 0.14 × 0.13, and the spatial distance between the button and the virtual hand is increased to −15, resulting in a significantly higher mean task time of 16.81 s. The large standard deviation of 13.09 s and a larger IQR of 13.88 s indicate fluctuations in the participants’ performance, which points to difficulties in solving the task.
This is a consequence of the increased task complexity caused by the smaller buttons and the larger spatial distance, which required more precision from the participants. Although the distracting background sublevel uses the smallest buttons (0.12 × 0.17), it places them closer to the participants (−5), possibly facilitating interaction. The mean time to complete the task is similar to that of the first sublevel (7.24 s with a standard deviation of 3.39 s). The smaller IQR of 3.67 s indicates that the participants completed the task consistently despite the challenge posed by the smaller buttons. This suggests that the smaller distance can compensate for the smaller button size, allowing participants to interact with the buttons more successfully. In the sublevel with vertical orientation, higher kurtosis indicates the presence of several extreme values that reflect the challenges the participants faced.

4.2.2. Use of the “Object Return” Option

The second parameter in the analysis is the use of the object return option in the tasks, and the number of times when participants used this option is shown in Table 7.
Analyzing the number of objects returned at the first level per sublevel in relation to the size of the object the participants were trying to reach reveals a clear relationship between these two parameters. For example, the number of objects returned on the basic sublevel was considerable (73), and this option was used by 66.04% of the participants, while on the sublevel with enlarged objects, no participant used this option. However, further analysis of the data revealed that this result was due to difficulties in accurately grasping the 3D object. This is reflected in the increased number of incorrectly inserted objects, suggesting that the tasks were completed with less precision due to difficulties in handling larger 3D objects relative to the size of the hand. The sublevel with the shrunken objects has the highest number of objects returned (100), which indicates that the precision and handling of small 3D objects were a challenge for the participants. At the reduced-contrast sublevel, a total of 51 objects were returned, and this option was used by 47.17% of participants.
On the second level, there is a significant difference in the use of the object-return option between the sublevels, as participants used this option almost twice as often on the sublevel with distractors. This can be explained by the presence of 3D distractor elements, which made the task considerably more difficult. The percentage of participants who solved the task on the first attempt (58.49% on the sublevel without distractors and 35.85% on the sublevel with distractors) suggests that the threshold used to determine when the objects reached equal size was well chosen for the details of the task.
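The paper does not report the exact threshold value; purely as an illustrative sketch, a relative-tolerance check of the kind described could look like this (the 10% tolerance is an assumption, not the value used in GeometryGame):

```python
def sizes_match(scaled_size: float, target_size: float, tolerance: float = 0.10) -> bool:
    """Return True if the scaled object is within a relative tolerance of the target size.

    The 10% default tolerance is an assumed placeholder, not the threshold
    actually used in GeometryGame.
    """
    return abs(scaled_size - target_size) / target_size <= tolerance

# Example: a sphere enlarged to 1.08 units against a 1.0-unit target passes a 10% check
print(sizes_match(1.08, 1.0))   # True
print(sizes_match(1.25, 1.0))   # False
```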
On the third level, on the sublevel with direct interaction, a lower number of uses of the object-return option was recorded compared to the previous levels. The return option was used a total of 24 times, with each participant returning an object at most 4 times, and 28.30% of participants used this option. This indicates a relatively high level of success and adaptability in solving this task, meaning that participants were able to manage the combination of interactions effectively.
In the indirect interaction sublevel, this option is not applicable. The reason for this is the nature of indirect interaction using a slider, which is designed so that it cannot go beyond the reach of the participant’s virtual hand, eliminating the need to return the 3D object.

4.2.3. Accuracy of Performing Tasks

Table 8 shows the distribution of task outcomes at the first level based on this categorization, with the percentage of participants who achieved each result.
At the basic sublevel, 62.26% of participants managed to insert all three 3D objects correctly, while a smaller number of participants (30.19%) inserted one object incorrectly. Only a few participants (7.55%) inserted two objects incorrectly. This indicates that objects of basic size are relatively easy to access for most participants. On the other hand, at the sublevel with the enlarged objects, there is a significantly higher number of participants who inserted two objects incorrectly (43.40%) or who were unable to insert a single object correctly (16.98%). This result indicates an increased level of difficulty of this sublevel compared to the basic sublevel, which is probably because the 3D objects are twice the size of a virtual fist.
The sublevel with shrunken objects shows an improvement with a higher number of participants correctly inserting all three 3D objects (83.02%). This improvement indicates that the task on this sublevel was better adapted to the participants’ expectations and abilities. On the reduced contrast sublevel, most participants (64.15%) managed to insert all three objects correctly, a smaller number (35.85%) of participants had one error, and no one had two or three errors. Better results on the last two sublevels could indicate that the participants have adapted to the implemented interactions, which enables them to perform better by repeating the interactions.
The accuracy of the participants at the fourth level, represented by the percentage of correctly solved tasks, is shown in Table 9.
On the sublevel with horizontal alignment of the buttons, participants achieved a high accuracy of 98.11%, which means that this sublevel was relatively easy to solve. The high success rate indicates that the size of the buttons and their distance were well adapted to the abilities of most participants, allowing them to interact precisely. The sublevel with vertical orientation shows a slight decrease in the success rate, at 92.45%. Although participants were still very successful at this sublevel, the small decrease in the percentage of correct responses indicates a slightly higher task difficulty compared to the first sublevel due to the changes in button size and distance. The sublevel with the distracting background shows a significant decrease in success, with only 62.26% correct answers. This decrease indicates that the task conditions in this sublevel were considerably more difficult, resulting in a more challenging interaction due to the smaller button size and the introduction of distractions that affected the participants’ ability to focus and be precise.
Analysis of all three sublevels shows that variations in task design, such as changes in object size and distance, have a direct impact on participant performance and satisfaction.

4.2.4. Use of the “Skip Sublevel” Option

At all sublevels, participants were offered the option to skip the sublevel, but they rarely used it. At the first and second levels, this option was used only once each. At the first level, the participant who used this option gave up after 28.96 s on the sublevel with enlarged objects. At the second level, this option was used on the sublevel without distractors by one participant, who gave up after 64.81 s. The skip option was used most frequently at the third level (5 times), on the sublevel with direct interaction, where participants gave up after 17.69 s on average. At the fourth level, no participant used the option to skip a sublevel.

4.3. Qualitative Data Analysis

4.3.1. Simplicity of Implemented Interactions

In the second part of the research, participants completed a survey consisting of six questions to assess their satisfaction with the integrated interactions. The design of this survey is described in Section 3.3.
Figure 6a shows that most participants had no problem reaching objects at the first level. At the second level (Figure 6b), participants at the sublevel without distractors generally found that enlarging the 3D object was easy, as 50.95% of participants gave ratings of 4 and 5. The interactions at the sublevel with distractors were rated even better, as 62.26% of participants gave the same ratings. At the sublevel with distractors, a decrease in ratings of 1 and 2 can also be observed. This pattern could indicate that the experience gained in the sublevel without distractors helped participants develop skills that made the tasks in the next sublevel easier to solve, despite the presence of distracting elements.
The participants’ subjective ratings of the ease of rotating the 3D object at the third level of the game show a significant difference between the two sublevels (Figure 6c). On the sublevel with direct interaction, most participants (49.06%) thought that rotating the 3D object was very easy. The remaining ratings were distributed fairly evenly between 3 and 4, very few participants gave a rating of 2, and none gave a rating of 1. For the indirect interaction, 83.01% of participants gave a rating of 5, a much smaller number gave ratings of 3 and 4 (16.98%), and no participant gave a rating of 1 or 2. This could indicate that rotating the object was very easy regardless of the type of interaction, but that participants generally preferred the indirect interaction with the slider.
The results of the ratings at the fourth level (Figure 6d) show a tendency for positive ratings to increase as progress is made through the sublevels. This result is consistent with previous observations on other types of interactions at the first and second levels. It can also confirm that the learning process had a positive effect on the participants’ ability to adapt to the interaction in the virtual space.

4.3.2. Importance of Precision in Implemented Interactions

Participants at the first level generally consider precision in reaching 3D objects to be important, with most participants giving a score of 5, as shown in Figure 7a.
The analysis of the survey results for the second level in Figure 7b shows that the participants rated the importance of precision when resizing the 3D object differently. At the sublevel without distractors, ratings of 1 and 2 were given by 28.30% of participants, while 20.75% of participants gave the highest rating of 5. Ratings of 3 and 4 were given by 20.75% and 30.19% of participants, respectively, showing that participants held a wide range of opinions on the importance of precision in this type of interaction. At the sublevel with distractors, a significantly larger share of participants, 52.83%, gave the highest rating of 5, emphasizing the high importance of precision at this sublevel. In contrast, the lowest ratings of 1 and 2 were given by only 5.66% of participants, indicating that participants generally saw a greater need for precision at this sublevel. These results suggest that the task in which distractors were introduced may have been more demanding. It is also possible that the experience on the sublevel without distractors gave participants a deeper understanding of the task and the skills required to accurately resize the 3D object.
At the third level (Figure 7c), a similar number of participants (58.49% and 54.72%) gave the highest rating in both sublevels, indicating that most participants considered precision to be very important for successfully solving the task.
Similarly, most participants in all sublevels at the fourth level (Figure 7d) confirmed the importance of precision when interacting with 3D buttons with the virtual hand and gave it the highest rating.

4.3.3. Intuitiveness of the Implemented Interactions

When considering the intuitiveness of interacting with a 3D object at the first level (Figure 8a), the results show that participants generally perceived the interaction of grasping and releasing as intuitive at all sublevels, although there were slight differences in perception between sublevels.
A substantial number of participants at the second level (Figure 8b) rated the interaction without distractors as very intuitive, with 33.96% giving a rating of 5 and 20.75% a rating of 4. The interaction with distractors was rated slightly better, with 39.62% giving a rating of 5 and 33.96% a rating of 4, suggesting that participants found the interaction with the 3D object at this sublevel even more natural. The low number of low ratings in both sublevels indicates that the participants easily adopted the interaction mechanisms, and the increase in high ratings in the sublevel with distractors again points to the effectiveness of the learning process and the participants’ adaptation to the interaction across the levels.
In the analysis of participants’ answers on the intuitiveness of interacting with a 3D object at the third level, as shown in Figure 8c, we can see that most participants rated this type of interaction as very intuitive. Participants clearly preferred the indirect interaction with the slider, as 86.79% of them rated this type of interaction as very intuitive. In contrast, 62.26% of participants found the direct interaction, in which a combination of interactions (grasping, releasing, and rotating the object) was implemented, to be very intuitive.
Participants’ subjective impressions of the intuitiveness of the interaction when pressing the 3D button with a virtual finger (Figure 8d) show that most participants rated this interaction as very intuitive and gave it a score of 5. This indicates that this type of interaction with the 3D button is generally well accepted by participants at all sublevels.

4.3.4. Perception of Frustration with Implemented Interactions

Figure 9a shows that most participants experienced low to moderate levels of frustration at the first level. There is a slight decrease in reported frustration as participants progress through the sublevels of this level.
A significant number of participants at the second level (Figure 9b) experienced little or no frustration (33.96% at the sublevel without distractors and 35.85% at the sublevel with distractors). A rating of 5, indicating a high level of frustration, was less common (16.98% and 13.21% of participants, respectively). Ratings of 3 and 4 were similar, but the number of ratings of 2 increased at the sublevel with distractors, indicating a slight increase in frustration among participants. This increase could be due to the presence of additional 3D objects (spheres) that made the task more difficult.
An analysis of the data at the third level (Figure 9c) shows that participants in both sublevels indicated a low level of frustration, as 62.26% and 88.68% of participants indicated the lowest rating of frustration. These results indicate that participants at this level felt that the tasks were achievable within the set expectations and requirements.
Figure 9d shows that participants at the fourth level experienced the least frustration when pressing the horizontally oriented buttons, indicating that the combination of the larger button size and the smaller distance to the virtual hand was the most acceptable. In the sublevel with a distracting background, reported frustration was similarly low despite the smaller button size. We can assume that the smaller distance between the buttons and the virtual hand compensated for the smaller size and facilitated the interaction. On the other hand, the increased frustration in the sublevel with vertically aligned buttons could be a consequence of the smaller buttons and the larger distance to the virtual hand.

4.3.5. Satisfaction with the Degree of Precision Achieved During Implemented Interactions

The results of the evaluation of the first level (Figure 10a) show that the participants were generally satisfied with the precision achieved at this level, with satisfaction being particularly high at the sublevels with shrunken objects and reduced contrast. Challenging tasks, such as reaching smaller 3D objects in this case, are often associated with a greater sense of achievement when they are successfully mastered, which may contribute to higher participant satisfaction.
As shown in Figure 10b, participants were satisfied with the precision they achieved when performing the tasks on both sublevels of the second level. At the sublevel without distractors, 56.60% of participants gave ratings of 4 and 5, indicating a generally high level of satisfaction with the precision achieved when enlarging the 3D object. A smaller number of participants rated their perceived precision lower, which confirms the overall positive perception of the precision achieved when resizing the 3D object. At the sublevel with distractors, the share of ratings of 5 is slightly lower at 24.53%, but ratings of 4 reached 30.19%. Participants were thus still relatively satisfied with their precision despite the greater challenge posed by this sublevel. Ratings of 2 and 3 were also slightly more common on this sublevel, suggesting that some participants struggled with the distractors.
Figure 10c shows how participants rated their satisfaction with the precision they achieved when rotating the 3D object. Most participants were generally very satisfied with the precision they achieved at this level, and no participant rated the precision as the lowest. In the direct interaction sublevel, most participants gave the two highest scores (39.62% gave a score of 5, and 32.07% gave a score of 4). Significantly more participants (73.58%) were very satisfied with the indirect interaction, which confirms the high rating of precision when using the slider to rotate the 3D object.
The data in Figure 10d show that most participants gave high ratings in all sublevels. This confirms that participants were able to maintain precision despite the changes in interface design that affected the efficiency of button interaction in the previous analyses.

4.3.6. Ease of Use of Implemented Interactions

The data in Figure 11a show that participants were generally satisfied with the ease of reaching 3D objects at all sublevels of the first level, particularly at the sublevels with shrunken objects and reduced contrast, where 81.14% and 69.48% of participants, respectively, rated their subjective experience with scores of 4 and 5.
At the second level (Figure 11b), most participants at the sublevel without distractors (50.95%) gave ratings of 4 and 5, indicating that they felt they could enlarge the 3D object with ease. The lower ratings of 1 and 2 were given by 33.96% of participants, indicating that a minority encountered difficulties when enlarging the 3D object. At the sublevel with distractors, ratings of 4 and 5 also prevailed (49.05%), while ratings of 3 were slightly more frequent, suggesting that participants encountered greater challenges due to the presence of distracting 3D elements.
The analysis of participants’ subjective impressions of the ease of rotating the 3D object in Figure 11c shows that most participants found the implemented interaction easy to perform. Most participants (41.51%) gave the highest rating of 5 for the direct interaction, which means that the tasks were easy to perform with the combination of interactions. In the indirect-interaction sublevel, where a slider was used to rotate the 3D object, 77.36% of participants gave a rating of 5, indicating that using the slider as a graphical interface element was even more intuitive and made it easier for participants to interact with the 3D object.
When analyzing the participants’ perception of the button press in the three sublevels (Figure 11d), a decrease in satisfaction can be observed as button size decreases. The sublevel with horizontal orientation and the largest buttons has the highest number of positive ratings, followed by a moderate decrease in the sublevel with vertical orientation. The sublevel with a distracting background has results similar to those of the sublevel with vertical orientation, although it has the smallest buttons, which is probably because the buttons are much closer to the virtual hand. Nevertheless, it is clear that button size plays a role in the efficiency of the interaction.

5. Discussion

In this section, the research hypotheses are tested based on the results of the analysis of the defined accessibility parameters from the previous section. Each hypothesis is evaluated against user interaction data, with particular attention paid to the effects of object size, visual complexity, and interaction modality on user performance and experience. The limitations of this study are also described and discussed. The aim is to determine how certain accessibility parameters affect task performance in XR environments. Following this evaluation, in the remainder of this section, we propose two general and four detailed recommendations to improve the usability, reliability, and comfort of users when interacting with software solutions based on 3D visualization technologies, and we outline how the shortcomings identified in the pilot study can be addressed in future research.
The definition of accessibility parameters in this research is based on the evaluation of 3D user interactions. The focus was on a detailed analysis of elements, such as the size of the object, its distance from the user, and the intuitiveness of the use of graphical interfaces, which should ensure efficient and intuitive manipulation of objects in a virtual environment.

5.1. Limitations of the Research

The results of this pilot study should be interpreted with caution due to several methodological limitations. One of the main limitations is the relatively small number of participants, which was a consequence of the testing approach, in which each participant had to use the application in a strictly controlled environment. In particular, all participants used the same hardware and software setup and followed an identical sequence of levels and sublevels during the interaction. While this ensured consistency of test conditions, it may also have produced a learning effect: as participants gradually became more familiar with the tasks, their performance in later sublevels may have improved. It is also important to note that none of the participants reported any form of disability, which limits the generalizability of the results to more diverse user populations.
None of the participants reported a noticeable delay in interacting with the application during the experiment. Given this, we conclude that latency was not a major disruptive factor in the specific context of our study. Nevertheless, we acknowledge that future studies should systematically measure and analyze the effects of latency in applications where real-time responsiveness is important. In addition, although demographic data such as age, gender, and education level were collected, they were not included in the statistical analysis. The reason for this was the unbalanced distribution of participants across demographic groups, which limited the ability to draw valid conclusions based on these groups without risking bias or compromising the validity of the results.
Based on their previous education and background, it can be assumed that all participants had basic computer skills and felt comfortable working with digital interfaces. Informal conversations during the study also indicated that some participants had been exposed to 3D technologies before, for example, through casual gaming, demonstration workshops, trade fairs, or similar events. However, as their experience with 3D interaction technologies was not formally recorded in the questionnaire, these parameters are not included in the analysis. In addition, only one type of hand-tracking device, the Leap Motion controller, and one type of output device, the laptop screen, were used throughout the study. Although this ensured consistency of interaction methods, it limits the transferability of the results to other XR systems with different sensors or input devices. Generalizing these results to other natural interaction technologies would require additional empirical validation.
To better understand the influences of individual user characteristics and to support the development of universally accessible XR applications, future research should be conducted with a larger and more demographically diverse group of participants.
Despite these limitations, we believe that these results provide valuable initial insights into how XR applications with 3D scenes and interactions can be designed to support better accessibility and usability for a wide range of users. With further research based on larger and more demographically diverse samples, the principles derived from this study could make a meaningful contribution to the development of inclusive and universally designed XR systems.

5.2. Evaluation of the Hypotheses

H1 states that in precision-oriented XR applications, smaller 3D objects relative to the virtual hand will improve task accuracy and user satisfaction. The results obtained support this hypothesis by showing that a ratio of 1:0.5 between the size of the virtual hand and the 3D object resulted in the most favorable user experiences in precision-based tasks. Participants reported higher accuracy and greater satisfaction when interacting with smaller objects at this scale.
H2 states that visual distractions reduce task performance and user satisfaction. The results partially support this hypothesis. The presence of visual distractions, such as additional 3D elements or complex backgrounds, negatively affected task performance in time-limited scenarios and led to longer completion times, supporting the claim that distractions are detrimental under time pressure. However, in scenarios without a strict time limit, these distractions were not detrimental to performance; in some cases, they appeared neutral or even beneficial when users could proceed at their own pace, which contrasts with the hypothesis.
H3 states that indirect interaction methods lead to better task efficiency, accuracy, and satisfaction than direct interaction. The data strongly support this hypothesis. Indirect interaction techniques, such as the use of sliders, resulted in better task efficiency, higher accuracy, and greater user satisfaction compared to the direct manipulation of 3D objects. Participants reported that the indirect method was more intuitive and provided better control, which contributed to a shorter task completion time.

5.3. Accessibility Recommendations

Accessibility does not happen by chance; it must be intentional. The integration of accessibility into software must be carefully planned from the start, and it must be prioritized at every step of the design process rather than merely stated as a goal. It is important to emphasize that there is no universal best type of interaction that applies to UIs for all kinds of tasks. Therefore, before selecting a type of interaction and integrating accessibility features into such software solutions, it is important to first evaluate the goal and purpose of the software.
Recommendation 1:
When implementing accessibility options in software solutions, you should first consider the goal and purpose of the software solution.
Execution times at all four levels show a similar trend: tasks take longer when a new interaction is used for the first time. This finding highlights the importance of consistent alignment between levels and tasks when developing interaction mechanisms to promote user adaptation and optimize task execution.
Recommendation 2:
When implementing interactions, it is important to ensure consistency between the different levels and tasks.
The first parameter analyzed is the size of the object with which the user interacts directly. The results of the study show that, in direct interaction, the interface with a ratio of 1:0.5 between the size of the virtual hand and the 3D object provided the best experience for the participants and struck a balance between the challenge of the interaction and the satisfaction of the participants. This suggests that using smaller 3D objects with this ratio could be useful for tasks that require precision, as it allows participants to develop and apply precision, leading to a greater sense of satisfaction and achievement. On the other hand, for tasks where execution time is critical, a 1:1 ratio between the virtual hand and the 3D object showed the best results. This ratio allows for faster and more efficient execution of tasks, reduces the likelihood of frustration, and lets participants focus on speed of execution while maintaining acceptable accuracy and precision when interacting with the 3D object.
Recommendation 3:
For software solutions where the accuracy of the user in solving tasks is important, a smaller size of 3D objects with a ratio of 1:0.5 to the virtual hand is optimal, while for solutions where speed in performing interactions is more important, a ratio of 1:1 between object and virtual hand is recommended.
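As a purely illustrative sketch of how Recommendation 3 could be applied at design time, the following hypothetical helper derives a target object size from the virtual hand size and the task's priority; the function and its parameters are assumptions for illustration, not code from GeometryGame.

```python
def target_object_size(virtual_hand_size: float, task_priority: str) -> float:
    """Suggest a 3D object size relative to the virtual hand.

    Ratios follow the study's recommendation: 1:0.5 (hand:object) for
    precision-oriented tasks and 1:1 for speed-oriented tasks. This is a
    hypothetical design-time helper, not part of GeometryGame.
    """
    ratios = {"precision": 0.5, "speed": 1.0}
    if task_priority not in ratios:
        raise ValueError("task_priority must be 'precision' or 'speed'")
    return virtual_hand_size * ratios[task_priority]

# Example: with a virtual hand spanning 0.2 scene units
print(target_object_size(0.2, "precision"))  # 0.1
print(target_object_size(0.2, "speed"))      # 0.2
```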
The second accessibility parameter analyzed is distraction during interaction, in the form of other 3D objects or distracting backgrounds. In the sublevels without distractions, task completion times were shorter than in the sublevels with distractions, suggesting that a distraction-free environment is ideal for time-limited tasks where the focus is on speed and efficiency. However, no participant skipped the sublevels with distractions, suggesting that the distractions, while challenging, were not an excessive obstacle and could even aid concentration and engagement in the task. At these sublevels, participants were also generally more satisfied with the ease of task performance and rated the importance of precision highly. This approach could therefore be useful for maintaining participants’ attention and improving their skills in longer tasks.
Recommendation 4:
The design of 3D interfaces in time-critical tasks (e.g., quizzes with a time limit) should be free of potential distractions, while distracting elements can be useful to increase user engagement when interacting with objects in longer, potentially more monotonous tasks. The application’s accessibility options should allow the user to change the interactivity of distracting elements by making them static or removing them completely from the task.
The third accessibility parameter, which refers to the intuitiveness of the use of UI elements, is based on the analysis of direct and indirect interaction with 3D objects. The results of data analysis show that indirect interaction, such as the use of a slider, is more intuitive and accessible and enables more precise and efficient manipulation of 3D objects than direct interaction. The sublevel with indirect interaction using a slider has a shorter task execution time than the sublevel with direct interaction with a 3D object. Although one might expect direct interactions to be more efficient and “natural” for participants, the results indicate the opposite. Participants rated the ease of rotating the 3D object and satisfaction with the precision achieved slightly higher in the sublevel with indirect interaction. This result can be explained by the fact that most participants do not have prior experience with 3D interactions, and the gesture recognition technology currently in use is not yet mature enough to ensure natural and intuitive direct interaction. This points to the importance of clear and intuitive UI elements for 3D interactions, especially when learning new skills or tasks, and suggests that improving gesture recognition technology could increase the effectiveness of direct interactions in the future.
Recommendation 5:
Users should be able to use different UI elements according to their preferences and be able to add or remove these elements via the accessibility settings.
The last parameter analyzed was the size and distance of interaction elements on the UI. The data analysis showed that larger buttons and appropriate spacing allow for a balance between ease of pressing and precision, thus reducing user frustration. This is consistent with Fitts’s law, which states that the time required for a targeted action depends on the distance to the target and the size of the target. A button size of 0.24 × 0.1676 and a distance of −10 spatial units allowed users to interact more easily and accurately, resulting in less frustration and greater user satisfaction.
Recommendation 6:
For tasks that require speed and efficiency, larger UI elements and an optimal distance between these elements and the virtual hand should be used according to Fitts’s law. For tasks that require precision, it is important to find a balance between the size of the UI elements and their distance from the virtual hand, where a smaller distance can reduce the difficulty users could have in accessing smaller UI elements.
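To illustrate how Recommendation 6 can be operationalized, the sketch below ranks candidate button layouts by their Fitts index of difficulty; the layout values are invented for the example and do not correspond to the study’s measurements.

```python
import math

def index_of_difficulty(distance: float, width: float) -> float:
    """Fitts index of difficulty (ID) in bits, using the Shannon formulation."""
    return math.log2(distance / width + 1)

# Hypothetical candidate layouts: (name, distance to target, target width)
layouts = [
    ("large buttons, close", 5.0, 0.24),
    ("small buttons, close", 5.0, 0.12),
    ("small buttons, far",  15.0, 0.12),
]

# Lower ID predicts faster, easier pointing; useful for ranking layouts early in design.
for name, distance, width in layouts:
    print(f"{name}: ID = {index_of_difficulty(distance, width):.2f} bits")
```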
The implementation of the proposed accessibility recommendations has the potential to improve both the usability and inclusivity of XR applications. Central to this is the idea that all accessibility features should be closely aligned with the primary goals and intended purpose of the software. Maintaining consistency of interaction design across different levels and task types reduces cognitive load and allows users to form stable mental models. This, in turn, improves task performance and reduces the likelihood of errors.
Building on this, the research findings on the optimal size ratio between virtual 3D objects and the user’s virtual hand provide practical guidance for interaction design. These ratios can serve as a basis for manual sizing of objects in static systems or as baseline values in systems with automatic scaling based on object context. In addition, using Fitts’s law to determine the optimal size and spacing of UI elements can lead to measurable improvements in interaction efficiency, especially in time-critical scenarios.
This study also emphasizes the differentiated role of distracting visual elements. While such elements have a negative impact on performance in tasks under time pressure, they can promote engagement and concentration in longer or monotonous tasks without time pressure. This result underlines the importance of giving users control over their interface environment. In this context, customizability proves to be a crucial aspect of inclusive design. Giving users the ability to customize their experience by adding or removing visual elements, selecting preferred input methods, or adjusting layout components using accessibility settings will not only meet different user needs but also promote a sense of agency and satisfaction.
Taken together, these findings underscore the importance of a user and task-centered approach to XR development. Through personalization, consistency, and design decisions based on empirical models, developers can create XR experiences that are both accessible and high-performing.

6. Conclusions

Although solutions with 3D object representations and 3D interactions are increasingly common in our environment, they are not yet sufficiently precise and reliable for everyday use. Therefore, special attention should be paid to adapting the interface and interaction methods to make such solutions as accessible as possible for all users.
This study examined different types of interactions in educational applications with 3D objects and defined accessibility parameters based on the evaluation of 3D user interactions in educational software systems. Parameters such as object size, the presence of distractions, the intuitiveness of the UI, and the size and distance of interaction elements were analyzed in the pilot study. The use of specific hardware, the duration of the study per participant, and the requirement that each participant use the application under identical conditions demanded an individual approach to each person and resulted in a limited number of participants. The results show the importance of adapting the UI to ensure the accessibility and effectiveness of 3D interactions. Based on the analysis of participants’ objectively measured results and their subjective opinions of each interaction method, recommendations were made that could improve the accessibility of systems that use 3D interactions and make them more accessible to all users.
The knowledge and experience gained through this research, as well as the parameters and recommendations from this study, will serve as input for modeling future research. In that research, we plan to include a larger sample of participants from different demographic groups to obtain more reliable and accurate results, especially for participants with disabilities, who are not sufficiently represented in this study. To minimize the learning effect that arises because participants pass through levels with similar tasks, future studies will randomize the order of levels and sublevels for each user of the application.
In addition, future research will focus on analyzing the effects of various hardware parameters, such as monitor resolution and specifications, which have proven challenging for the display of 3D objects. Environmental conditions such as lighting, which affect display quality in certain situations and were not originally considered as research parameters, will also be taken into account.

Author Contributions

Conceptualization, A.K.D. and K.Ž.; methodology, A.K.D.; software, A.K.D.; validation, A.K.D., K.Ž. and M.M.; formal analysis, A.K.D.; investigation, A.K.D. and M.K.; resources, A.K.D. and K.Ž.; data curation, A.K.D., K.Ž., M.M. and M.K.; writing—original draft preparation, A.K.D. and K.Ž.; writing—review and editing, A.K.D., K.Ž., M.M. and M.K.; visualization, A.K.D. and K.Ž.; supervision, K.Ž. and M.M.; project administration, A.K.D. and K.Ž.; funding acquisition, K.Ž. and M.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

This study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of the University of Zagreb Faculty of Electrical Engineering and Computing (251-67/308-21/168, 25 November 2021).

Informed Consent Statement

Informed consent was obtained from all participants involved in the study.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

The authors would like to sincerely thank Željka Car for her valuable, insightful advice during the conduct of this study, as well as Sandra Komaić, who provided technical support in the development of the research instrument as part of her diploma thesis.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Kim, N.W.; Joyner, S.C.; Riegelhuth, A.; Kim, Y. Accessible Visualization: Design Space, Opportunities, and Challenges. Comput. Graph. Forum 2021, 40, 173–188. [Google Scholar] [CrossRef]
  2. Dineva, S. The importance of visualization in e-learning courses. In Proceedings of the 14th International Conference on Virtual Learning—ICVL, 1, Bucharest, Romania, 25–26 October 2019; Available online: https://www.researchgate.net/publication/336916893_The_importance_of_visualization_in_e-learning_courses (accessed on 5 March 2025).
  3. Zhao, Y.; Jiang, J.; Chen, Y.; Liu, R.; Yang, Y.; Xue, X.; Chen, S. Metaverse: Perspectives from graphics, interactions and visualization. Vis. Inform. 2022, 6, 56–67. [Google Scholar] [CrossRef]
  4. Rauschnabel, P.A.; Felix, R.; Hinsch, C.; Shahab, H.; Alt, F. What is XR? Towards a Framework for Augmented and Virtual Reality. Comput. Hum. Behav. 2022, 133, 107289. [Google Scholar] [CrossRef]
  5. Recommendation ITU-T-P.1320; Quality of Experience Assessment of Extended Reality Meetings. ITU-T: Geneva, Switzerland, 2022.
  6. Dargan, S.; Bansal, S.; Kumar, M.; Mittal, A.; Kumar, K. Augmented Reality: A Comprehensive Review. Arch. Comput. Method Eng. 2023, 30, 1057–1080. [Google Scholar] [CrossRef]
  7. Egliston, B.; Carter, M. ‘The Interface of the Future’: Mixed Reality, Intimate Data and Imagined Temporalities. Big Data Soc. 2022, 9, 20539517211063689. [Google Scholar] [CrossRef]
  8. XR Accessibility User Requirements (XAUR). W3C Working Group Note 25 August 2021. Available online: https://www.w3.org/TR/xaur/ (accessed on 18 April 2025).
  9. Pastor, A.; Bourdin-Kreitz, P. Comparing Episodic Memory Outcomes from Walking Augmented Reality and Stationary Virtual Reality Encoding Experiences. Sci. Rep. 2024, 14, 7580. [Google Scholar] [CrossRef]
  10. Peney, T.; Skarratt, P.A. Increasing the Immersivity of 360° Videos Facilitates Learning and Memory: Implications for Theory and Practice. Educ. Technol. Res. Dev. 2024, 72, 3103–3115. [Google Scholar] [CrossRef]
  11. Cadet, L.B.; Chainay, H. Memory of Virtual Experiences: Role of Immersion, Emotion and Sense of Presence. Int. J. Hum.-Comput. Stud. 2020, 144, 102506. [Google Scholar] [CrossRef]
  12. Heather, A.; Chinnah, T.; Devaraj, V. The Use of Virtual and Augmented Reality in Anatomy Teaching. MedEdPublish 2019, 8, 77. [Google Scholar] [CrossRef]
  13. Goh, E.S.; Sunar, M.S.; Ismail, A.W. 3D Object Manipulation Techniques in Handheld Mobile Augmented Reality Interface: A Review. IEEE Access 2019, 7, 40581–40601. [Google Scholar] [CrossRef]
  14. Siang, C.V.; Isham, M.I.M.; Mohamed, F.; Yusoff, Y.A.; Mokhtar, M.K.; Tomi, B.; Selamat, A. Interactive Holographic Application Using Augmented Reality EduCard and 3D Holographic Pyramid for Interactive and Immersive Learning. In Proceedings of the 2017 IEEE Conference on e-Learning, e-Management and e-Services (IC3e), Miri, Malaysia, 16–17 November 2017. [Google Scholar] [CrossRef]
  15. Yenioglu, B.Y.; Yenioglu, S.; Sayar, K.; Ergulec, F. Using Augmented Reality Based Intervention to Teach Science to Students with Learning Disabilities. J. Spec. Educ. Technol. 2023, 39, 108–119. [Google Scholar] [CrossRef]
  16. Grasse, K.M.; Melcer, E.F.; Kreminski, M.; Junius, N.; Ryan, J.; Wardrip-Fruin, N. Academical: A Choice-Based Interactive Game for Enhancing Moral Reasoning, Knowledge, and Attitudes in Responsible Conduct of Research. In Games and Narrative: Theory and Practice; Bostan, B., Ed.; Springer: Cham, Switzerland, 2022; pp. 173–189. [Google Scholar] [CrossRef]
  17. Cozzolino, V.; Moroz, O.; Ding, A.Y. The Virtual Factory: Hologram-Enabled Control and Monitoring of Industrial IoT Devices. In Proceedings of the IEEE International Conference on Artificial Intelligence and Virtual Reality, Taichung, Taiwan, 10–12 December 2018; pp. 120–123. [Google Scholar] [CrossRef]
  18. Gunesekera, A.I.; Bao, Y.; Kibelloh, M. The Role of Usability on E-Learning User Interactions and Satisfaction: A Literature Review. J. Syst. Inf. Technol. 2019, 21, 368–394. [Google Scholar] [CrossRef]
  19. Benlian, A. IT Feature Use over Time and Its Impact on Individual Task Performance. J. Assoc. Inf. Syst. 2015, 16, 144–173. [Google Scholar] [CrossRef]
  20. Tomasi, S.; Schuff, D.; Turetken, O. Understanding Novelty: How Task Structure and Tool Familiarity Moderate Performance. Behav. Inf. Technol. 2018, 37, 406–418. [Google Scholar] [CrossRef]
  21. Stauffert, J.P.; Niebling, F.; Latoschik, M.E. Latency and Cybersickness: Impact, Causes, and Measures. A Review. Front. Virtual Real. 2020, 1, 582204. [Google Scholar] [CrossRef]
  22. Itaguchi, Y. Size Perception Bias and Reach-to-Grasp Kinematics: An Exploratory Study on the Virtual Hand with a Consumer Immersive Virtual-Reality Device. Front. Virtual Real. 2021, 2, 712378. [Google Scholar] [CrossRef]
  23. Tamura, Y.; Makino, H.; Ohno, N. Size Perception in Stereoscopic Displays Based on Binocular Disparity Considering Interpupillary Distance. J. Adv. Simul. Sci. Eng. 2024, 11, 93–101. [Google Scholar] [CrossRef]
  24. Wang, L.; Sandor, C. Can You Perceive the Size Change? Discrimination Thresholds for Size Changes in Augmented Reality. In Virtual Reality and Mixed Reality, Proceedings of the 18th EuroXR International Conference, EuroXR 2021, Milan, Italy, 24–26 November 2021; Bourdot, P., Alcañiz Raya, M., Figueroa, P., Interrante, V., Kuhlen, T.W., Reiners, D., Eds.; LNCS; Springer: Cham, Switzerland, 2021; Volume 13105. [Google Scholar] [CrossRef]
  25. Thomas, B.H. Examining User Perception of the Size of Multiple Objects in Virtual Reality. Appl. Sci. 2020, 10, 4049. [Google Scholar] [CrossRef]
  26. Picciano, A.G.; Dziuban, C.D.; Graham, C.R.; Moskal, P.D. Blended Learning; Routledge: New York, NY, USA, 2021. [Google Scholar] [CrossRef]
  27. Bachmann, D.; Weichert, F.; Rinkenauer, G. Review of Three-Dimensional Human-Computer Interaction with Focus on the Leap Motion Controller. Sensors 2018, 18, 2194. [Google Scholar] [CrossRef]
  28. Blinder, D.; Birnbaum, T.; Ito, T.; Shimobaba, T. The state-of-the-art in computer generated holography for 3D display. Light Adv. Manuf. 2022, 3, 572–600. [Google Scholar] [CrossRef]
  29. Dastmalchi, M.R.; Goli, A. Embodied learning in virtual reality: Comparing direct and indirect interaction effects on educational outcomes. In Proceedings of the 2024 IEEE Frontiers in Education Conference (FIE), Washington, DC, USA, 13–16 October 2024; pp. 1–7. [Google Scholar] [CrossRef]
  30. Nielson, G.M.; Olsen, D.R. Direct Manipulation Techniques for 3D Objects Using 2D Locator Devices. In Proceedings of the 1986 Workshop on Interactive 3D Graphics (I3D ‘86), Chapel Hill, NC, USA, 23–24 October 1986; ACM: New York, NY, USA, 1987; pp. 175–182. [Google Scholar] [CrossRef]
  31. Zeleznik, R.C.; Forsberg, A.S.; Strauss, P.S. Two Pointer Input for 3D Interaction. In Proceedings of the 1997 Symposium on Interactive 3D Graphics (I3D ‘97), Providence, RI, USA, 27–30 April 1997; ACM: New York, NY, USA, 1997; pp. 115–120. [Google Scholar] [CrossRef]
  32. Houde, S. Iterative Design of an Interface for Easy 3-D Direct Manipulation. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ‘92), Monterey, CA, USA, 3–7 May 1992; ACM: New York, NY, USA, 1992; pp. 135–142. [Google Scholar] [CrossRef]
  33. Conner, B.D.; Snibbe, S.S.; Herndon, K.P.; Robbins, D.C.; Zeleznik, R.C.; van Dam, A. Three-Dimensional Widgets. In Proceedings of the 1992 Symposium on Interactive 3D Graphics (I3D ‘92), Cambridge, MA, USA, 29 March–1 April 1992; ACM: New York, NY, USA, 1992; pp. 183–188. [Google Scholar] [CrossRef]
  34. Shoemake, K. Arcball: A User Interface for Specifying Three-Dimensional Orientation Using a Mouse. In Proceedings of the Graphics Interface ’92, Vancouver, BC, Canada, 11–15 May 1992; Canadian Human-Computer Communications Society: Toronto, ON, Canada, 1992; pp. 151–156. [Google Scholar] [CrossRef]
  35. Mendes, D.; Caputo, F.M.; Giachetti, A.; Ferreira, A.; Jorge, J.A. A Survey on 3D Virtual Object Manipulation: From the Desktop to Immersive Virtual Environments. Comput. Graph. Forum 2019, 38, 21–45. [Google Scholar] [CrossRef]
  36. Chen, Y.; Armstrong, C.; Childers, R.; Do, A.; Thirey, K.; Xu, J.; Bryant, D.G.; Howard, A. Effects of object size and task goals on reaching kinematics in a non-immersive virtual environment. Hum. Mov. Sci. 2022, 83, 102954. [Google Scholar] [CrossRef] [PubMed]
  37. Kumle, L.; Võ, M.L.-H.; Nobre, A.C.; Draschkow, D. Multifaceted consequences of visual distraction during natural behaviour. Commun. Psychol. 2024, 2, 49. [Google Scholar] [CrossRef] [PubMed]
  38. Spanogianopoulos, S.; Sirlantzis, K.; Mentzelopoulos, M.; Protopsaltis, A. Human Computer Interaction Using Gestures for Mobile Devices and Serious Games: A Review. In Proceedings of the 2014 International Conference on Interactive Mobile Communication Technologies and Learning (IMCL), Thessaloniki, Greece, 13–14 November 2014; IEEE: Piscataway, NJ, USA, 2015. [Google Scholar] [CrossRef]
  39. LaViola, J., Jr.; Kruijff, E.; McMahan, R.P.; Bowman, D.; Poupyrev, I.P. 3D User Interfaces: Theory and Practice, 2nd ed.; Addison-Wesley: Boston, MA, USA, 2017. [Google Scholar]
  40. Li, Y.; Huang, J.; Tian, F.; Wang, H.-A.; Dai, G.-Z. Gesture interaction in virtual reality. Virtual Real. Intell. Hardw. 2019, 1, 84–112. [Google Scholar] [CrossRef]
  41. Bedikian, R. Understanding Latency: Part 2. 2013. Available online: https://web.archive.org/web/20250122144858/https://blog.leapmotion.com/understanding-latency-part-2/ (accessed on 23 April 2025).
  42. Niechwiej-Szwedo, E.; Gonzalez, D.; Nouredanesh, M.; Tung, J. Evaluation of the Leap Motion Controller during the performance of visually-guided upper limb movements. PLoS ONE 2018, 13, e0193639. [Google Scholar] [CrossRef]
  43. Weichert, F.; Bachmann, D.; Rudak, B.; Fisseler, D. Analysis of the accuracy and robustness of the leap motion controller. Sensors 2013, 13, 6380–6393. [Google Scholar] [CrossRef]
  44. Ababsa, F.; He, J.; Chardonnet, J.R. Combining HoloLens and Leap-Motion for Free Hand-Based 3D Interaction in MR Environments. In Lecture Notes in Computer Science; De Paolis, L., Bourdot, P., Eds.; Springer: Cham, Switzerland, 2020; Volume 12242, pp. 315–327. [Google Scholar] [CrossRef]
  45. Li, R.; Liu, Z.; Tan, J. A survey on 3D hand pose estimation: Cameras, methods, and datasets. Pattern Recognit. 2019, 93, 251–272. [Google Scholar] [CrossRef]
  46. Murugana, A.S.; Behzada, A.; Kim, E.; Chung, J.; Jung, D. Development of webcam-based hand tracking for virtual reality interaction. In Proceedings of the Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), Shenzhen, China, 14–15 November 2024. [Google Scholar] [CrossRef]
  47. Rogers, Y.; Sharp, H.; Preece, J. Interaction Design: Beyond Human-Computer Interaction, 6th ed.; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2023; ISBN 978-1-119-90109-9. [Google Scholar]
  48. Aliprantis, J.; Konstantakis, M.; Nikopoulou, R.; Mylonas, P.; Caridakis, G. Natural Interaction in Augmented Reality Context. In Proceedings of the 1st International Workshop on Visual Pattern Extraction and Recognition for Cultural Heritage Understanding, Pisa, Italy, 30 January 2019; pp. 50–61. Available online: https://ceur-ws.org/Vol-2320/paper11.pdf (accessed on 5 March 2025).
  49. Mortensen, D.H. Natural User Interfaces—What Does It Mean & How to Design User Interfaces That Feel Naturally. Available online: https://www.interactiondesign.org/literature/article/natural-user-interfaces-what-are-they-and-how-do-youdesign-user-interfaces-that-feel-natural (accessed on 23 February 2023).
  50. Rautaray, S.S.; Agrawal, A. Vision based hand gesture recognition for human computer interaction: A survey. Artif. Intell. Rev. 2015, 43, 1–54. [Google Scholar] [CrossRef]
  51. Apple. Learn Advanced Gestures to Interact with iPad. iPad User Guide. Available online: https://support.apple.com/guide/ipad/learn-advanced-gestures-ipadab6772b8/ipados (accessed on 7 March 2025).
  52. Guerino, G.C.; Valentim, N.M.C. Usability and User Experience Evaluation of Natural User Interfaces: A Systematic Mapping Study. IET Softw. 2020, 14, 451–467. [Google Scholar] [CrossRef]
  53. Shatilov, K.A.; Chatzopoulos, D.; Lee, L.-H.; Hui, P. Emerging ExG-based NUI inputs in extended realities: A bottom-up survey. ACM Trans. Interact. Intell. Syst. 2021, 11, 10. [Google Scholar] [CrossRef]
  54. Wigdor, D.; Wixon, D. Brave NUI World: Designing Natural User Interfaces for Touch and Gesture; Morgan Kaufmann: Burlington, MA, USA, 2011. [Google Scholar]
  55. Deshmukh, A.M.; Chalmeta, R. User Experience and Usability of Voice User Interfaces: A Systematic Literature Review. Information 2024, 15, 579. [Google Scholar] [CrossRef]
  56. Seaborn, K.; Sawa, Y.; Watanabe, M. Coimagining the Future of Voice Assistants with Cultural Sensitivity. Hum. Behav. Emerg. Technol. 2024, 2024, 3238737. [Google Scholar] [CrossRef]
  57. Wenzel, K.; Kaufman, G. Designing for Harm Reduction: Communication Repair for Multicultural Users’ Voice Interactions. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI ‘24), Honolulu, HI, USA, 11–16 May 2024. [Google Scholar] [CrossRef]
  58. Wani, T.M.; Gunawan, T.S.; Qadri, S.A.A.; Kartiwi, M.; Ambikairajah, E. A Comprehensive Review of Speech Emotion Recognition Systems. IEEE Access 2021, 9, 47795–47814. [Google Scholar] [CrossRef]
  59. Mindlin, D.; Beer, F.; Sieger, L.N.; Heindorf, S.; Esposito, E.; Ngomo, A.-C.N.; Cimiano, P. Beyond One-Shot Explanations: A Systematic Literature Review of Dialogue-Based xAI Approaches. Artif. Intell. Rev. 2025, 58, 81. [Google Scholar] [CrossRef]
  60. Novák, J.Š.; Masner, J.; Benda, P.; Šimek, P.; Merunka, V. Eye tracking, usability, and user experience: A systematic review. Int. J. Hum.-Comput. Interact. 2023, 40, 4484–4500. [Google Scholar] [CrossRef]
  61. Moreno-Arjonilla, J.; López-Ruiz, A.; Jiménez-Pérez, J.R.; Callejas-Aguilera, J.E.; Jurado, J.M. Eye-tracking on virtual reality: A survey. Virtual Real. 2024, 28, 38. [Google Scholar] [CrossRef]
  62. Futami, K.; Tabuchi, Y.; Murao, K.; Terada, T. Exploring Gaze Movement Gesture Recognition Method for Eye-Based Interaction Using Eyewear with Infrared Distance Sensor Array. Electronics 2022, 11, 1637. [Google Scholar] [CrossRef]
  63. Cai, J.; Ge, X.; Tian, Y.; Ge, L.; Shi, H.; Wan, H.; Xu, J. Designing Gaze-Based Interactions for Teleoperation: Eye Stick and Eye Click. Int. J. Hum.-Comput. Interact. 2023, 40, 2500–2514. [Google Scholar] [CrossRef]
  64. Rolff, T.; Gabel, J.; Zerbin, L.; Hypki, N.; Schmidt, S.; Lappe, M.; Steinicke, F. A Hands-Free Spatial Selection and Interaction Technique Using Gaze and Blink Input with Blink Prediction for Extended Reality. arXiv 2025. [Google Scholar] [CrossRef]
  65. Feit, A.M.; Vordemann, L.; Park, S.; Berube, C.; Hilliges, O. Detecting Relevance during Decision-Making from Eye Movements for UI Adaptation. In Proceedings of the ACM Symposium on Eye Tracking Research and Applications (ETRA ‘20 Full Papers), Stuttgart, Germany, 2–5 June 2020; Article 10. pp. 1–11. [Google Scholar] [CrossRef]
  66. Sidenmark, L.; Gellersen, H. Eye&Head: Synergetic Eye and Head Movement for Gaze Pointing and Selection. In Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology (UIST ‘19), New Orleans, LA, USA, 20–23 October 2019; ACM: New York, NY, USA, 2019; pp. 1161–1174. [Google Scholar] [CrossRef]
  67. McNamara, A.; Mehta, R. Additional insights: Using eye tracking and brain sensing in virtual reality. In Proceedings of the Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 1–4. [Google Scholar] [CrossRef]
  68. Sendhilnathan, N.; Fernandes, A.S.; Proulx, M.J.; Jonker, T.R. Implicit Gaze Research for XR Systems. In Proceedings of the PhysioCHI: Towards Best Practices for Integrating Physiological Signals in HCI, Honolulu, HI, USA, 11 May 2024. [Google Scholar] [CrossRef]
  69. Garbin, S.J.; Shen, Y.; Schuetz, I.; Cavin, R.; Hughes, G.; Talathi, S.S. OpenEDS: Open Eye Dataset. arXiv 2019, arXiv:1905.03702. Available online: https://paperswithcode.com/paper/190503702 (accessed on 19 April 2025).
  70. Kothari, R.; Yang, Z.; Kanan, C.; Bailey, R.; Pelz, J.B.; Diaz, G.J. Gaze-in-wild: A dataset for studying eye and head coordination in everyday activities. Sci. Rep. 2020, 10, 2539. [Google Scholar] [CrossRef] [PubMed]
  71. Katrychuk, D.; Griffith, H.K.; Komogortsev, O.V. Power-efficient and shift-robust eye-tracking sensor for portable VR headsets. In Proceedings of the 11th ACM Symposium on Eye Tracking Research & Applications 2019, Denver, CO, USA, 25–28 June 2019; Article 19. pp. 1–8. [Google Scholar] [CrossRef]
  72. Kothari, R.S.; Bailey, R.J.; Kanan, C.; Pelz, J.B.; Diaz, G.J. EllSeg-Gen, towards domain generalization for head-mounted eyetracking. Proc. ACM Hum.-Comput. Interact. 2022, 6, 139. [Google Scholar] [CrossRef]
  73. De Giglio, V.; Evangelista, A.; Uva, A.E.; Manghisi, V.M. Exploring Augmented Reality Interaction Metaphors: Performance and Ergonomic Assessment using HoloLens 2 and RULA Method. Procedia Comput. Sci. 2025, 253, 1790–1799. [Google Scholar] [CrossRef]
  74. Haria, A.; Subramanian, A.; Asokkumar, N.; Poddar, S.; Nayak, J.S. Hand Gesture Recognition for Human Computer Interaction. Procedia Comput. Sci. 2017, 115, 367–374. [Google Scholar] [CrossRef]
  75. Ahmed, S.; Kallu, K.D.; Ahmed, S.; Cho, S.H. Hand gestures recognition using radar sensors for human-computer interaction: A review. Remote Sens. 2021, 13, 527. [Google Scholar] [CrossRef]
  76. Serrano, R.; Morillo, P.; Casas, S.; Cruz-Neira, C. An empirical evaluation of two natural hand interaction systems in augmented reality. Multimed. Tools Appl. 2022, 81, 31657–31683. [Google Scholar] [CrossRef]
  77. Ferracani, A.; Pezzatini, D.; Bianchini, J.; Biscini, G.; Del Bimbo, A. Locomotion by Natural Gestures for Immersive Virtual Environments. In Proceedings of the 1st International Workshop on Multimedia Alternate Realities (AltMM ‘16), Amsterdam, The Netherlands, 16 October 2016; ACM: New York, NY, USA, 2016; pp. 21–24. [Google Scholar] [CrossRef]
  78. Caserman, P.; Liu, S.; Göbel, S. Full-Body Motion Recognition in Immersive-Virtual-Reality-Based Exergame. IEEE Trans. Games 2022, 14, 243–252. [Google Scholar] [CrossRef]
  79. Moeslund, T.B.; Granum, E. A survey of computer vision-based human motion capture. Comput. Vis. Image Underst. 2001, 81, 231–268. [Google Scholar] [CrossRef]
  80. Cao, Z.; Hidalgo, G.; Simon, T.; Wei, S.; Sheikh, Y. OpenPose: Realtime multi-person 2D pose estimation using part affinity fields. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 172–186. [Google Scholar] [CrossRef]
  81. Pimpalshende, A.; Suresh, C.; Potnurwar, A.; Pinjarkar, L.; Bongirwar, V.; Dhunde, R. Facial gesture recognition with human computer interaction for physically impaired. Adv. Non-Linear Var. Inequalities 2024, 27, 48–59. [Google Scholar] [CrossRef]
  82. Krupke, D.; Steinicke, F.; Lubos, P.; Jonetzko, Y.; Gorner, M.; Zhang, J. Comparison of multimodal heading and pointing gestures for co-located mixed reality human-robot interaction. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 1–9. [Google Scholar] [CrossRef]
  83. Ratti, E.; Waninger, S.; Berka, C.; Ruffini, G.; Verma, A. Comparison of medical and consumer wireless EEG systems for use in clinical trials. Front. Hum. Neurosci. 2017, 11, 398. [Google Scholar] [CrossRef] [PubMed]
  84. Zhang, Y.; Liang, B.; Chen, B.; Torrens, P.M.; Atashzar, S.F.; Lin, D.; Sun, Q. Force-aware interface via electromyography for natural VR/AR interaction. ACM Trans. Graph. 2022, 41, 268. [Google Scholar] [CrossRef]
  85. Kastalskiy, I.; Mironov, V.; Lobov, S.; Krilova, N.; Pimashkin, A.; Kazantsev, V. A neuromuscular interface for robotic devices control. Comput. Math. Methods Med. 2018, volume, 8948145. [Google Scholar] [CrossRef] [PubMed]
  86. Welihinda, D.V.D.S.; Gunarathne, L.; Herath, H.; Yasakethu, S.; Madusanka, N.; Lee, B.-I. EEG and EMG-based human-machine interface for navigation of mobility-related assistive wheelchair (MRA-W). Heliyon 2024, 10, e27777. [Google Scholar] [CrossRef]
  87. de la Rosa, R.; Alonso, A.; Carrera, A.; Durán, R.; Fernández, P. Man-machine interface system for neuromuscular training and evaluation based on EMG and MMG signals. Sensors 2010, 10, 11100–11125. [Google Scholar] [CrossRef]
  88. Meltzner, G.S.; Heaton, J.T.; Deng, Y.; De Luca, G.; Roy, S.H.; Kline, J.C. Silent speech recognition as an alternative communication device for persons with laryngectomy. IEEE/ACM Trans. Audio Speech Lang. Process. 2017, 25, 2386–2398. [Google Scholar] [CrossRef]
  89. Wu, P.; Kaveh, R.; Nautiyal, R.; Zhang, C.; Guo, A.; Kachinthaya, A.; Mishra, T.; Yu, B.; Black, A.W.; Muller, R.; et al. Towards EMG-to-speech with necklace form factor. In Proceedings of the Interspeech 2024, Kos, Greece, 1–5 September 2024; pp. 402–406. [Google Scholar] [CrossRef]
  90. Drey, T.; Montag, M.; Vogt, A.; Rixen, N.; Seufert, T.; Zander, S.; Rietzler, M.; Rukzio, E. Investigating the Effects of Individual Spatial Abilities on Virtual Reality Object Manipulation. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI ‘23), Hamburg, Germany, 23–28 April 2023; ACM: New York, NY, USA, 2023. Article 398. pp. 1–24. [Google Scholar] [CrossRef]
  91. About Universal Design. The National Disability Authority, Ireland. Available online: https://universaldesign.ie/about-universal-design (accessed on 19 April 2025).
  92. Salvador-Ullauri, L.; Acosta-Vargas, P.; Gonzalez, M.; Luján-Mora, S. Combined Method for Evaluating Accessibility in Serious Games. Appl. Sci. 2020, 10, 6324. [Google Scholar] [CrossRef]
  93. Mack, K.; McDonnell, E.; Jain, D.; Wang, L.L.; Froehlich, J.E.; Findlater, L. What Do We Mean by “Accessibility Research”? A Literature Survey of Accessibility Papers in CHI and ASSETS from 1994 to 2019. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, 8-–13 May 2021; pp. 1–18. [Google Scholar] [CrossRef]
  94. Web Content Accessibility Guidelines (WCAG). 12 December 2024. Available online: https://www.w3.org/TR/WCAG21 (accessed on 3 March 2025).
  95. Mobile Accessibility at W3C. Available online: https://www.w3.org/WAI/standards-guidelines/mobile (accessed on 3 March 2025).
  96. Guidance on Applying WCAG 2 to Non-Web Information and Communications Technologies (WCAG2ICT). 15 November 2024. Available online: https://www.w3.org/TR/wcag2ict-22 (accessed on 3 March 2025).
  97. Žilak, M.; Car, Ž.; Čuljak, I. A Systematic Literature Review of Handheld Augmented Reality Solutions for People with Disabilities. Sensors 2022, 22, 7719. [Google Scholar] [CrossRef]
  98. Creed, C.; Al-Kalbani, M.; Theil, A.; Sarcar, S.; Williams, I. Inclusive Augmented and Virtual Reality: A Research Agenda. Int. J. Hum.-Comput. Interact. 2024, 40, 6200–6219. [Google Scholar] [CrossRef]
  99. Dudley, J.; Yin, L.; Garaj, V.; Kristensson, P.O. Inclusive Immersion: A Review of Efforts to Improve Accessibility in Virtual Reality, Augmented Reality and the Metaverse. Virtual Real. 2023, 27, 2989–3020. [Google Scholar] [CrossRef]
  100. Bailey, B.; Bryant, L.; Hemsley, B. Virtual Reality and Augmented Reality for Children, Adolescents, and Adults with Communication Disability and Neurodevelopmental Disorders: A Systematic Review. Rev. J. Autism Dev. Disord. 2022, 9, 160–183. [Google Scholar] [CrossRef]
  101. Clarkson, J.; Coleman, R.; Hosking, I.M.; Waller, S.W. Inclusive Design Toolkit; University of Cambridge: Cambridge, UK, 2007; Available online: https://www-edc.eng.cam.ac.uk/files/idtoolkit.pdf (accessed on 19 April 2025).
  102. Interaction Types in Automotive HMI Terminology. Ultraleap. Available online: https://docs.ultraleap.com/automotive-guidelines/interaction-types.html (accessed on 3 March 2025).
  103. Ultraleap Developer Resources. Ultraleap. Available online: https://docs.ultraleap.com/ (accessed on 3 March 2025).
  104. Chakravarthi, B.; Prabhu Prasad, B.M.; Imandi, R.; Pavan Kumar, B.N. A Comprehensive Review of Leap Motion Controller-Based Hand Gesture Datasets. In Proceedings of the 2023 International Conference on Next Generation Electronics (NEleX), Vellore, India, 14–16 December 2023; pp. 1–7. [Google Scholar] [CrossRef]
  105. Komaić, S. Analysis and Implementation of Interactions Using Leap Motion in Applications Developed in Unity Environment. Master’s Thesis, University of Dubrovnik, Dubrovnik, Croatia, 5 July 2022. Available online: https://repozitorij.unidu.hr/islandora/object/unidu:1940 (accessed on 5 March 2025). (In Croatian).
  106. Fitts, P.M. The Information Capacity of the Human Motor System in Controlling the Amplitude of Movement. J. Exp. Psychol. 1954, 47, 381–391. [Google Scholar] [CrossRef]
Figure 1. Schematic representation of the experimental environment used in the study. The layout includes the position of the participant, the sensor, the display, and the interaction area. The user interface of the application is in Croatian. In the upper left corner of each screen, the blue robot in a speech bubble displays the instructions for the current sublevel (e.g., “The goal of this game is to place a cube, a sphere and a pyramid in the appropriate space corresponding to their shape.”). In the upper right corner of the screen, the options “Skip sublevel” (hr. Preskoči razinu) and “Exit” (hr. IZLAZ) are displayed.
Figure 2. Design of the screen on the first level: (a) objects of basic size; (b) enlarged objects; (c) shrunken objects; (d) objects of basic size with reduced contrast.
Figure 3. Design of the screen on the second level: (a) without distractors; (b) with distractors.
Figure 4. Design of the screen on the third level: (a) direct interaction; (b) indirect interaction.
Figure 5. Design of the screen on the fourth level: (a) horizontal orientation of buttons; (b) vertical orientation of buttons; (c) vertical orientation of buttons with distracting background.
Figure 6. Participants’ ratings for ease of (a) 3D object retrieval; (b) resizing a 3D object; (c) rotating a 3D object; (d) pressing a 3D button with a virtual finger.
Figure 7. Participants’ ratings of the importance of precision in (a) 3D object retrieval; (b) resizing a 3D object; (c) rotating a 3D object; (d) pressing a 3D button with a virtual finger.
Figure 8. Participants’ ratings of the intuitiveness of (a) 3D object retrieval; (b) resizing a 3D object; (c) rotating a 3D object; (d) pressing a 3D button with a virtual finger.
Figure 9. Participants’ perception of frustration during (a) 3D object retrieval; (b) resizing a 3D object; (c) rotating a 3D object; (d) pressing a 3D button with a virtual finger.
Figure 10. Participants’ satisfaction with the degree of precision achieved during (a) 3D object retrieval; (b) resizing a 3D object; (c) rotating a 3D object; (d) pressing a 3D button with a virtual finger.
Figure 11. Participants’ subjective feeling of the ease of use of (a) 3D object retrieval; (b) resizing a 3D object; (c) rotating a 3D object; (d) pressing a 3D button with a virtual finger.
Table 1. Description of interactions by game level.

Level | Type of Interaction | Description | Usability | Reliability | Comfort
1 | Retrieval of geometric solid | Retrieval of an object using grab-and-release interaction | high | low | high
2 | Resizing of geometric solid | Retrieval of an object using grab-and-release interaction and adjusting the size of the object using pinch interaction | medium | high | high
3 | Rotation of geometric solid | Rotation of the geometric solid using swipe interaction | high | low | medium
4 | Pressing a button | Pressing a 3D button using a virtual finger | medium | medium | medium
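To make the grab-and-release interaction listed in Table 1 more concrete, the following minimal Python sketch shows how such a gesture could be detected from per-frame hand-tracking data using a threshold with hysteresis. The HandFrame structure, the grab_strength field, and the threshold values are illustrative assumptions, not the actual GeometryGame implementation or the sensor’s API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Illustrative thresholds with hysteresis (assumed values, not taken from the study).
GRAB_THRESHOLD = 0.8     # hand treated as closed above this grab strength
RELEASE_THRESHOLD = 0.3  # hand treated as open below this grab strength

@dataclass
class HandFrame:
    """One frame of hypothetical hand-tracking data."""
    palm_position: Tuple[float, float, float]  # (x, y, z) in sensor coordinates
    grab_strength: float                       # 0.0 (open hand) .. 1.0 (fist)

class GrabAndRelease:
    """Minimal grab-and-release state machine."""

    def __init__(self) -> None:
        self.holding = False

    def update(self, frame: HandFrame) -> Optional[str]:
        """Return 'grab' or 'release' when the state changes, otherwise None."""
        if not self.holding and frame.grab_strength >= GRAB_THRESHOLD:
            self.holding = True
            return "grab"
        if self.holding and frame.grab_strength <= RELEASE_THRESHOLD:
            self.holding = False
            return "release"
        return None

# Example: a closing hand triggers 'grab', an opening hand later triggers 'release'.
tracker = GrabAndRelease()
print(tracker.update(HandFrame((0.0, 0.2, 0.1), 0.9)))  # -> 'grab'
print(tracker.update(HandFrame((0.1, 0.2, 0.1), 0.5)))  # -> None (still holding)
print(tracker.update(HandFrame((0.2, 0.2, 0.1), 0.1)))  # -> 'release'
```

Using two thresholds rather than one avoids rapid flicker between the grab and release states when the measured grab strength hovers near a single cutoff, which is one reason the reliability of this interaction is sensitive to tuning.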
Table 2. Participants’ activities and associated metrics.

Participant’s Activity | Activity Result Stored in the Database
Level 1: Grab-and-release interaction
Press of the button to start activities | Time required to complete task
Press of the button to return the object to the scene | Number of times the button was pressed
Correctly inserted objects | Number of correctly inserted objects
Incorrectly inserted objects | Number of incorrectly inserted objects
Press of the button to skip the level | 1 = skipped, 0 = not skipped
Level 2: Pinch interaction of both hands
Press of the button to start activities | Time required to complete task
Press of the button to return the object to the scene | Number of times the button was pressed
Press of the button to skip the level | 1 = skipped, 0 = not skipped
Level 3: Swipe and rotate interactions
Press of the button to start activities | Time required to complete task
Press of the button to skip the level | 1 = skipped, 0 = not skipped
Level 4: Interaction of pressing the 3D button with the virtual finger
Press of the button to start activities | Time required to answer all questions
Press of the button to answer the question | Selected answer
Correctness of the answer | 1 = correct answer, 0 = incorrect answer
Press of the button to skip the level | 1 = skipped, 0 = not skipped
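Each row of Table 2 pairs a participant activity with the value stored in the database, which maps naturally onto a simple event log. The sketch below shows one way such records could be persisted with Python’s standard sqlite3 module; the ActivityRecord fields, the activity_log table, and the column names are assumptions made for illustration, not the schema actually used in the GeometryGame.

```python
import sqlite3
from dataclasses import dataclass, astuple

@dataclass
class ActivityRecord:
    participant_id: int
    level: int       # 1-4, as in Table 2
    sublevel: str    # e.g., "Basic", "With distractors"
    activity: str    # e.g., "return_object", "skip_level"
    value: float     # completion time, button-press count, 0/1 flag, ...

def init_db(path: str = "geometry_game.db") -> sqlite3.Connection:
    """Create the (hypothetical) activity_log table if it does not exist."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS activity_log (
               participant_id INTEGER,
               level INTEGER,
               sublevel TEXT,
               activity TEXT,
               value REAL
           )"""
    )
    return conn

def log_activity(conn: sqlite3.Connection, record: ActivityRecord) -> None:
    """Append one activity result to the log."""
    conn.execute("INSERT INTO activity_log VALUES (?, ?, ?, ?, ?)", astuple(record))
    conn.commit()

# Example: participant 7 skipped the third level (1 = skipped, 0 = not skipped).
conn = init_db()
log_activity(conn, ActivityRecord(7, 3, "Direct interaction", "skip_level", 1))
```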
Table 3. Statistical analysis of the time required to complete tasks on sublevels of the first level.

Sublevel | M | C | σ | min | max | R | Q1 | Q3 | IQR | S | K
Basic | 42.57 | 34.22 | 28.48 | 10.52 | 153.25 | 142.74 | 22.74 | 56.49 | 33.75 | 1.64 | 3.59
Enlarged objects | 39.13 | 33.69 | 23.99 | 11.25 | 134.06 | 122.80 | 23.16 | 47.87 | 24.71 | 1.89 | 4.66
Shrunken objects | 52.90 | 40.72 | 44.17 | 7.16 | 248.27 | 241.12 | 23.77 | 68.87 | 45.10 | 2.12 | 6.35
Reduced contrast | 28.72 | 23.25 | 21.36 | 5.70 | 127.73 | 122.03 | 15.36 | 34.30 | 18.93 | 2.55 | 8.47
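Assuming the column abbreviations in Tables 3–6 stand for the mean (M), median (C), standard deviation (σ), minimum, maximum, range (R), first and third quartiles (Q1, Q3), interquartile range (IQR), skewness (S), and kurtosis (K), the statistics can be reproduced from raw task-completion times roughly as in the sketch below. The exact estimators are not stated in the tables, so this sketch uses the sample standard deviation and SciPy’s default skewness and (Fisher, i.e., excess) kurtosis.

```python
import numpy as np
from scipy import stats

def describe_times(times_s: list) -> dict:
    """Descriptive statistics in the spirit of Tables 3-6 (column meanings assumed)."""
    x = np.asarray(times_s, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    return {
        "M": float(x.mean()),            # mean
        "C": float(np.median(x)),        # median (central value)
        "sigma": float(x.std(ddof=1)),   # sample standard deviation
        "min": float(x.min()),
        "max": float(x.max()),
        "R": float(x.max() - x.min()),   # range
        "Q1": float(q1),
        "Q3": float(q3),
        "IQR": float(q3 - q1),
        "S": float(stats.skew(x)),       # skewness
        "K": float(stats.kurtosis(x)),   # excess kurtosis (Fisher definition)
    }

# Example with made-up completion times in seconds:
print(describe_times([10.5, 22.7, 34.2, 56.5, 153.3]))
```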
Table 4. Statistical analysis of the time needed to complete tasks on sublevels of the second level.

Sublevel | M | C | σ | min | max | R | Q1 | Q3 | IQR | S | K
Without distractors | 31.35 | 21.54 | 29.59 | 3.02 | 156.12 | 153.10 | 12.14 | 36.00 | 23.86 | 2.13 | 5.52
With distractors | 42.80 | 25.86 | 37.98 | 4.24 | 171.99 | 167.76 | 17.39 | 62.21 | 44.82 | 1.54 | 2.20
Table 5. Statistical analysis of the time needed to complete tasks on sublevels of the third level.

Sublevel | M | C | σ | min | max | R | Q1 | Q3 | IQR | S | K
Direct interaction | 17.83 | 13.08 | 13.82 | 2.09 | 59.13 | 57.04 | 9.39 | 24.81 | 15.43 | 1.47 | 1.84
Indirect interaction | 7.83 | 6.19 | 6.20 | 0.66 | 40.66 | 40.00 | 4.42 | 9.64 | 5.22 | 3.29 | 15.13
Table 6. Statistical analysis of the time needed to complete tasks on sublevels of the fourth level.

Sublevel | M | C | σ | min | max | R | Q1 | Q3 | IQR | S | K
Horizontal orientation | 7.06 | 5.98 | 2.99 | 1.86 | 15.20 | 13.34 | 4.79 | 8.92 | 4.13 | 0.84 | 0.11
Vertical orientation | 16.81 | 12.31 | 13.09 | 1.93 | 61.77 | 59.84 | 8.01 | 21.89 | 13.88 | 1.65 | 2.60
Distracting background | 7.24 | 6.55 | 3.39 | 1.01 | 20.38 | 19.37 | 4.90 | 8.58 | 3.67 | 1.48 | 3.56
Table 7. Number of times when the option to return the object to the scene was used.

Level | Sublevel | N | M | P
1 | Basic | 73 | 8 | 35 (66.04%)
1 | Enlarged objects | 0 | 0 | 0 (0.00%)
1 | Shrunken objects | 100 | 11 | 32 (60.38%)
1 | Reduced contrast | 51 | 6 | 25 (47.17%)
2 | Without distractors | 56 | 6 | 22 (41.51%)
2 | With distractors | 111 | 10 | 34 (64.15%)
3 | Direct interaction | 24 | 4 | 15 (28.30%)
3 | Indirect interaction | Not applicable for this sublevel
N = total number of uses of the object return option; M = maximum number of object returns per participant; P = number (%) of participants who used the return object option.
Table 8. Categorization of results on the first level based on accuracy in performing tasks. On the first level, the accuracy is represented by the number of correctly and incorrectly inserted 3D objects in four categories: correct (all objects are correctly inserted), mostly correct (two objects are correctly inserted), mostly incorrect (only one object is correctly inserted), and incorrect (none of the objects are correctly inserted).

Sublevel | Correct | Mostly Correct | Mostly Incorrect | Incorrect
Basic | 62.26% | 30.19% | 7.55% | 0.00%
Enlarged objects | 3.77% | 35.85% | 43.40% | 16.98%
Shrunken objects | 83.02% | 16.98% | 0.00% | 0.00%
Reduced contrast | 64.15% | 35.85% | 0.00% | 0.00%
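The four accuracy categories in Table 8 follow directly from how many of the three objects on a sublevel (the cube, sphere, and pyramid) were inserted correctly. A minimal Python sketch of that mapping, assuming exactly three target objects per sublevel:

```python
def accuracy_category(correct: int, total: int = 3) -> str:
    """Map the number of correctly inserted objects to a Table 8 category."""
    if correct == total:
        return "correct"           # all objects correctly inserted
    if correct == 2:
        return "mostly correct"    # two objects correctly inserted
    if correct == 1:
        return "mostly incorrect"  # only one object correctly inserted
    return "incorrect"             # no object correctly inserted

# Quick self-check of the boundary cases.
assert accuracy_category(3) == "correct"
assert accuracy_category(0) == "incorrect"
```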
Table 9. Percentage of correct answers to questions on the fourth level.

Sublevel | Accuracy (%)
Horizontal orientation | 98.11
Vertical orientation | 92.45
Distracting background | 62.26