Article

Design and Implementation of a 3D Korean Sign Language Learning System Using Pseudo-Hologram

1 Department of Computer Science and Engineering, Gyeongsang National University, Jinju-si 52828, Republic of Korea
2 Multimedia IT Engineering, Gangneung-Wonju National University, Wonju-si 26403, Republic of Korea
* Authors to whom correspondence should be addressed.
Appl. Sci. 2025, 15(16), 8962; https://doi.org/10.3390/app15168962
Submission received: 19 July 2025 / Revised: 10 August 2025 / Accepted: 11 August 2025 / Published: 14 August 2025

Abstract

Sign language is a three-dimensional (3D) visual language that conveys meaning through hand positions, shapes, and movements. Traditional sign language education methods, such as textbooks and videos, often fail to capture the spatial characteristics of sign language, leading to limitations in learning accuracy and comprehension. To address this, we propose a 3D Korean Sign Language Learning System that leverages pseudo-hologram technology and hand gesture recognition using Leap Motion sensors. The proposed system provides learners with an immersive 3D learning experience by visualizing sign language gestures through pseudo-holographic displays. A Recurrent Neural Network (RNN) model, combined with Diffusion Convolutional Recurrent Neural Networks (DCRNNs) and ProbSparse Attention mechanisms, is used to recognize hand gestures from both hands in real-time. The system is implemented using a server–client architecture to ensure scalability and flexibility, allowing efficient updates to the gesture recognition model without modifying the client application. Experimental results show that the system enhances learners’ ability to accurately perform and comprehend sign language gestures. Additionally, a usability study demonstrated that 3D visualization significantly improves learning motivation and user engagement compared to traditional 2D learning methods.

1. Introduction

Humans build relationships and enhance social interactions through communication. Communication goes beyond merely transmitting information; it is a fundamental process of sharing emotions and thoughts, thereby strengthening social bonds.
While most hearing individuals naturally communicate through spoken language, individuals with hearing impairments encounter significant challenges due to auditory limitations [1]. To overcome these barriers, hearing-impaired individuals primarily use sign language, which conveys meaning through hand and finger movements, as their main form of communication [2]. For hearing individuals, sign language is often unfamiliar, and opportunities to learn it are limited. This lack of exposure creates a communication gap between hearing and hearing-impaired individuals [3]. If hearing individuals develop an understanding and proficiency in sign language, it could reduce the social isolation experienced by those with hearing impairments and foster an environment where both groups can communicate freely and effectively [4,5].
With advancements in information technology, research on applying these innovations to sign language has been actively conducted. One notable example is the use of Leap Motion, a sensor device that recognizes hand gestures as input. This technology has significantly contributed to the development of sign language recognition systems, enabling the translation of sign language into spoken languages across various countries. For instance, Leap Motion has been successfully applied to recognize American Sign Language (ASL) [6], Arabic Sign Language [7], Chinese Sign Language [8], Turkish Sign Language [9], and British Sign Language [10]. The Leap Motion Controller (LMC) precisely tracks hand and finger movements, converting them into data. This makes it a valuable tool for facilitating communication between hearing-impaired and non-hearing individuals.
Recently, research utilizing Recurrent Neural Networks (RNNs), known for their exceptional performance in fields such as speech recognition and machine translation, has gained attention in sign language learning. RNNs are highly effective in processing sequential data, making them particularly suitable for capturing the continuous and time-dependent movements of hand gestures [11,12,13,14,15]. Sign language learning systems that incorporate these technologies aim to address the limitations of traditional textbook- and video-based learning methods. In particular, they provide a learning environment where learners can practice their hand gestures in real-time and receive immediate feedback on their performance.
Traditional sign language learning methods primarily rely on textbooks or videos, which make it challenging for learners to verify the accuracy of their hand gestures when practicing alone. Recent research has addressed these limitations by introducing systems that offer real-time feedback, thereby improving the effectiveness of sign language learning [12,16]. These IT-based sign language learning systems enable learners to more efficiently master and refine accurate hand gestures.
Sign language is a three-dimensional (3D) visual language that combines hand positions, shapes, and movements. Unlike spoken languages, it relies heavily on elements such as hand positions, angles, and finger shapes to convey meaning accurately. Given these characteristics, understanding hand gestures in a spatial, 3D context is essential for effective sign language learning. Methods that fail to fully capture the 3D nature of sign language risk misinterpreting hand positions and movements, leading to inaccurate learning outcomes. This limitation also presents challenges for hearing individuals who are learning sign language to communicate effectively with the hearing-impaired. When sign language is taught using two-dimensional (2D) videos or images, it becomes difficult to grasp the depth and flow of hand movements, leading to suboptimal learning outcomes. To overcome these challenges, learning tools that support 3D visualization are necessary. Such tools enable learners to observe and practice the spatial aspects of hand positions, shapes, and movements more effectively, improving their comprehension and accuracy [17,18].
In this study, we propose a 3D Korean Sign Language Learning System utilizing pseudo-hologram technology to address these limitations. The term “pseudo-hologram” in this study refers to a Pepper’s Ghost-style projection, which creates the illusion of a floating 3D image using a transparent reflective surface. Pseudo-holograms are widely used in fields such as advertising, entertainment, education, and medical simulations, providing users with a 3D visual experience at a relatively low cost. This pseudo-holographic approach offers a distinct advantage over prior AR/VR-based sign language tools by providing an immersive 3D experience without requiring head-mounted displays, thereby enhancing comfort and accessibility in educational settings. Therefore, we adopted this technology to provide cost-effective 3D visual effects suitable for educational applications. Hand gesture recognition plays a crucial role in the proposed sign language learning system. To achieve this, we designed a model based on RNNs. In sign language, both the right and left hands move independently while interacting to convey specific meanings. To effectively capture these interactions, we represent the movements of both hands as a graph using an adjacency matrix of their joints and extract features through Diffusion Convolutional Recurrent Neural Networks (DCRNNs) [19]. The independent features of each hand, along with the interaction features between both hands, are combined and extracted as integrated features using the ProbSparse Attention mechanism [20]. These features are subsequently processed through an encoder and fully connected layers (FCLs) to interpret hand gestures.
The proposed system is implemented using a server–client architecture to facilitate efficient updates of the sign language gesture recognition model. The client captures the user’s hand gesture data, provides a user interface for training support, and displays the pseudo-hologram. Meanwhile, the server processes the gesture data sent from the client and executes the recognition and classification model. This architecture enhances system scalability and usability, as updates to the recognition model can be made on the server side without requiring modifications to the client application.
Recent research has also shown that sign language learning can enhance perceptual processing abilities [21,22]. For instance, Karabüklü et al. [23] reported that intermediate learners of American Sign Language demonstrated significantly improved temporal resolution in visual attention, suggesting broader cognitive benefits of the visual sign language experience. This supports the educational value of immersive and 3D learning tools such as the system proposed in this study.
The remainder of this paper is organized as follows. Section 2 presents a review of related research, discussing the characteristics, advantages, and limitations of existing studies. Section 3 outlines the design and implementation of the proposed sign language learning system, including details on dataset composition, preprocessing methods, and the learning and evaluation processes. Section 4 provides the experimental results, including user evaluations and survey analyses. Section 5 discusses the findings and their implications. Finally, Section 6 presents the conclusions of this study and suggests directions for future research.

2. Related Work

2.1. Sign Language Recognition

With advancements in information technology, various studies have explored the integration of technologies such as Recurrent Neural Networks (RNNs) and Leap Motion sensors into sign language learning [22,24]. Among these, Walizad et al. [25] proposed a sign language recognition system that combines Convolutional Neural Networks (CNNs) with computer vision techniques. The study captured hand gestures corresponding to 10 American Sign Language (ASL) signs using a webcam. These images were subsequently processed, trained, and classified using CNNs, achieving accurate recognition.
Mistry et al. [26] proposed a sign language translation system utilizing the Intel RealSense camera. This study focused on recognizing 26 letters of the ASL alphabet and used Support Vector Machines (SVMs) and multilayer perceptrons for classification. However, the system was limited to recognizing static signs only, which reduced its practical applicability for dynamic gestures. Chong et al. [27] proposed a method using the LMC and machine learning techniques. Their system aimed to recognize 26 letters and 10 numbers in ASL, distinguishing between static and dynamic gestures. Avola et al. [11] proposed a method for recognizing both sign language and semaphore gestures using the LMC and RNNs. The study captured hand joint angles from sign language gestures using the LMC, and these angles were used to train an RNN for gesture recognition.
Jiang et al. [28] introduced a Skeleton-Aware Multi-modal Sign Language Recognition (SLR) framework, leveraging Skeleton-Based Action Recognition techniques. Their study proposed a Sign Language Graph Convolutional Network to model dynamic interactions between joints and a Separable Spatial-Temporal Convolutional Network to extract skeletal features more effectively. The framework was further enhanced by incorporating RGB and depth modalities, resulting in the highest performance in the 2021 CVPR SLR Challenge. Rastgoo et al. [29] conducted a comprehensive survey of recent studies on deep learning-based sign language recognition. The study highlighted advancements in model accuracy, discussed various classification methods, and analyzed the strengths and limitations of existing approaches. It also provided recommendations for future research directions. Papastratis et al. [30] reviewed the latest technologies used in sign language systems. Their study analyzed the advantages and limitations of various techniques for sign language capturing, recognition, translation, and representation, providing valuable insights for developing future applications.
Recent advances in attention mechanisms such as FlashAttention [31,32] and Hyena [33] have demonstrated significant improvements in scalability and efficiency for long-sequence modeling. While our system currently employs the ProbSparse Attention mechanism, future iterations may explore these modern approaches to enhance real-time recognition and computational performance.

2.2. Sign Language Applications

Setiawan proposed a mobile application for sign language learning targeted at individuals with hearing impairments [34]. The application includes features such as a sign language dictionary and instructional videos. However, it lacks gesture recognition capabilities, limiting its ability to provide feedback on users’ performance of correct gestures. Novaliendry et al. [18] developed an interactive virtual reality (VR)-based sign language learning application for individuals with hearing impairments. The tool was tested with hearing-impaired children and received positive feedback, demonstrating the potential of VR technology in enhancing sign language education. However, similar to previous studies, it lacked a sign language recognition feature, which limited the system’s ability to provide real-time feedback, resulting in a one-sided learning experience. This aligns with broader findings in the literature, which highlight that while VR technologies offer immersive educational experiences, they often face challenges such as physical discomfort, accessibility limitations, and user adaptation difficulties—especially for learners with special needs [35].
Lee et al. [15] introduced an ASL learning application that utilized the LMC and RNNs. The study compared the performance of the LMC with previously used devices, such as Kinect and motion gloves, and highlighted the advantages of using LMC for gesture recognition. Additionally, they developed an interactive “whack-a-mole” style game to demonstrate the feasibility of educational applications. However, the study was limited to recognizing single-hand gestures, with the prototype focusing only on right-hand samples. Schioppo et al. [36] proposed a sign language learning application that combined the LMC with a VR headset to enhance user immersion in the learning experience.

2.3. Comparison with Prior Works

Most existing studies have primarily focused on improving sign language recognition accuracy. However, studies aimed at educational applications often lacked comprehensive user evaluations or practical experimentation beyond accuracy assessments. Furthermore, many prior systems did not incorporate real-time sign language recognition, limiting the feedback available to learners during practice. To address these limitations, this study proposes an integrated sign language learning system combining Leap Motion-based gesture tracking, pseudo-holographic 3D visualization, and RNN-based recognition. By incorporating interactive and game-based educational content, the system aims to enhance user engagement, immersion, and learning effectiveness. The studies summarized in Table 1 were selected based on the following criteria: (1) the study addressed sign language recognition or learning; (2) it employed gesture input devices such as Leap Motion or RealSense; and (3) it implemented machine learning-based recognition models. Studies involving user-centered evaluations or educational applications were prioritized to ensure meaningful comparison with our proposed system.

3. System Design and Implementation

This section provides an overview of the system developed in this study. First, it explains the design and implementation process of the sign language learning system. Next, it describes the structure and preprocessing of the data in detail. Finally, it outlines the design and implementation of the sign language recognition model.

3.1. System Overview

Figure 1 illustrates the overall structure of the proposed sign language learning system. The system consists of two main components: the client and the server. The client provides components that facilitate both the learning and testing processes of sign language gestures. When a user performs hand gestures using the LMC during the testing process, the gesture data is transmitted to the server via socket communication. The server receives the data from the client, calculates the distance between the two hands, converts the data into an array format suitable for graph construction, and segments the data based on sequence length during preprocessing. The preprocessed data is then fed into the sign language recognition model, which generates prediction results. These results are transmitted back to the client, where they are compared with the correct answers, and the final scores are aggregated.

3.2. Data Set and Preprocessing

The training dataset was constructed using eight commonly used sentences derived from the standardized Korean Sign Language Dictionary, ensuring that the gestures reflect official KSL expressions. Data collection was conducted using an LMC, which captured detailed hand movement data for KSL-specific gesture recognition. The collected data comprises a total of 134 elements, including four quaternion elements and three positional elements for each palm, as well as four quaternion elements for each segment of the finger joints. Figure 2 represents the standard hand structure provided by the LMC. The data collection process was conducted at a frame rate of 60 frames per second to ensure the smooth capture of sign language motions. Each gesture was repeated multiple times during recording, resulting in a total dataset duration of approximately 1000 s. Data samples were excluded if both hands were not simultaneously detected. Table 2 presents the total number of frames collected for each sign.
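For illustration, the 134-element frame described above can be assembled as follows. The exact element ordering is an assumption, but the arithmetic matches the stated total: 2 palms × (4 quaternion + 3 position) = 14 values, plus 2 hands × 5 fingers × 3 joint segments × 4 quaternion values = 120 values.

```python
import numpy as np

def frame_vector(palm_quat_L, palm_pos_L, palm_quat_R, palm_pos_R,
                 finger_quats_L, finger_quats_R):
    """Assemble one 134-element frame (illustrative layout, not the authors'
    exact ordering):
      2 palms x (4 quaternion + 3 position)      = 14
      2 hands x 5 fingers x 3 segments x 4 quats = 120
                                          total  = 134
    finger_quats_* are expected as (5, 3, 4) arrays."""
    parts = [
        np.asarray(palm_quat_L).ravel(), np.asarray(palm_pos_L).ravel(),
        np.asarray(palm_quat_R).ravel(), np.asarray(palm_pos_R).ravel(),
        np.asarray(finger_quats_L).ravel(), np.asarray(finger_quats_R).ravel(),
    ]
    vec = np.concatenate(parts)
    assert vec.shape == (134,)  # sanity check against the stated element count
    return vec
```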
During preprocessing, a graph consisting of nodes and edges is constructed as input for the Diffusion Convolutional Recurrent Neural Network (DCRNN). DCRNN requires input in the form of a graph structure representing connectivity among nodes, along with associated feature vectors for each node. Therefore, we construct an adjacency matrix representing the graph structure and a feature matrix containing the values for each node. The graph is represented as $G = (V, E)$ and defined as follows:
  • V is the set of nodes representing the hand joints.
  • E is the set of edges representing the connections between the joints.
Each node $v_i \in V$ represents a specific joint in the hand (excluding the fingertips), and each edge $e_{ij} \in E$ represents a physical connection between joints $v_i$ and $v_j$. For time step $t$, the node feature vector is defined as $x_i^t \in \mathbb{R}^4$, where $x_i^t = (q_{i,x}^t, q_{i,y}^t, q_{i,z}^t, q_{i,w}^t)$ are the quaternion components in the order $(x, y, z, w)$ of joint $v_i$ (with $(q_{i,x}^t, q_{i,y}^t, q_{i,z}^t)$ the vector part and $q_{i,w}^t$ the scalar part). Thus, the input feature matrix for time step $t$ is
$$X_t = \begin{bmatrix} x_1^t \\ \vdots \\ x_n^t \end{bmatrix} \in \mathbb{R}^{n \times 4},$$
where $n = |V|$ is the total number of joints across both hands (excluding fingertips).
The adjacency matrix A defines the connectivity of the graph, where:
$$A_{ij} = \begin{cases} 1 & \text{if there is an edge between } v_i \text{ and } v_j \\ 0 & \text{otherwise} \end{cases}$$
A unified graph is constructed for both hands by incorporating the distance between the two palms as a feature to create edges connecting the hands. Figure 3 is an abstract representation of the DCRNN input graph structure, which includes both hands’ joint positions and an additional node representing the distance between two hands. Let the positions of the palm joints of the two hands be $P_1 = (x_1, y_1, z_1)$ and $P_2 = (x_2, y_2, z_2)$.
To represent the spatial relationship between the two hands, we calculate the Euclidean distance between their palm positions. The Leap Motion SDK provides normalized 3D coordinate values (via the InteractionBox), which are used in this computation to ensure consistency across different users and hand sizes. The normalized distance is computed using the following equation:
$$d(P_1, P_2) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}$$
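A minimal sketch of this preprocessing step is shown below, assuming 21 joint nodes per hand plus one bridging node carrying the palm-to-palm distance (43 nodes in total, matching the 43 × 43 adjacency matrix in Table 3). The specific joint indexing and bridging edges are illustrative assumptions rather than the authors' exact construction.

```python
import numpy as np

NUM_JOINTS = 43  # 21 assumed joints per hand + 1 distance node (see Table 3)

def build_graph(left_quats, right_quats, palm_left, palm_right, hand_edges):
    """Build the adjacency matrix A and feature matrix X_t fed to the DCRNN.
    left_quats / right_quats: (21, 4) per-joint quaternions for one frame;
    palm_left / palm_right: normalized (x, y, z) palm positions;
    hand_edges: list of (i, j) joint connections within one hand."""
    # Euclidean distance between the two palms (normalized coordinates).
    d = float(np.linalg.norm(np.asarray(palm_right) - np.asarray(palm_left)))

    A = np.zeros((NUM_JOINTS, NUM_JOINTS))
    for i, j in hand_edges:                       # left-hand skeleton
        A[i, j] = A[j, i] = 1
    for i, j in hand_edges:                       # right-hand skeleton (offset 21)
        A[i + 21, j + 21] = A[j + 21, i + 21] = 1
    # Bridging node (index 42) links the two palms (palm index 0 assumed).
    A[42, 0] = A[0, 42] = 1
    A[42, 21] = A[21, 42] = 1

    X = np.zeros((NUM_JOINTS, 4))                 # feature matrix X_t (43 x 4)
    X[:21] = left_quats
    X[21:42] = right_quats
    X[42, 0] = d                                  # distance node carries d(P1, P2)
    return A, X
```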

3.3. Model Design

In this study, DCRNN was utilized to effectively model the spatiotemporal dependencies of sign language gestures [37,38]. By representing hand joints as a graph structure, it overcomes the limitations of traditional RNNs, which struggle with capturing complex spatial relationships in sequential data. This allows for more precise learning of sign language motion patterns. Additionally, ProbSparse Attention was applied to enhance computational efficiency by selectively focusing on the most informative features in long gesture sequences [39]. The combination of DCRNN and ProbSparse Attention improves both recognition accuracy and real-time performance, making the system more scalable and efficient for sign language recognition.
Although more recent architectures such as Transformers have shown strong performance in sequential data processing, we selected the DCRNN-based approach to prioritize lightweight inference and responsiveness, which are essential for educational systems. Since the main goal of this study is to support learning through effective feedback rather than to maximize recognition performance, this model was deemed sufficient and practical for our use case. The model is designed to capture complex movements involved in sign language gestures through three key modules, each focusing on specific aspects of hand movements and interactions. The model simultaneously considers both temporal and spatial information to further improve the accuracy of sign language recognition. Table 3 presents the detailed components of each module, while Figure 4 illustrates the overall structure of the model.
The model follows a hybrid architecture that combines sequential attention-based modules and graph-based recurrent modeling. Specifically, attention mechanisms are applied individually to each hand’s feature sequence to extract temporal dependencies, while the DCRNN module models the spatial and temporal structure of the combined hand features as a graph. The outputs from all attention components are concatenated and passed through fully connected layers for final classification.
The first module, DCRNN, captures temporal patterns from graph-structured sequential data, leveraging diffusion convolution to propagate information across hand joints. This enables the effective modeling of spatial relationships, essential for the accurate recognition of sign gestures. Additionally, an adjacency matrix representing the hand’s graph structure is provided as input in the preprocessing step to ensure the model effectively learns the structural relationships of both hands. Each joint is represented by a unit quaternion $(x, y, z, w)$, where $(x, y, z)$ is the vector part and $w$ is the scalar part, encoding the 3D orientation of the hand in space. This allows the model to incorporate both positional and rotational information for more accurate gesture recognition. For each time step, the DCRNN processes the graph $G$ by applying a diffusion process over the adjacency matrix $A$ to capture spatial dependencies among the joints:
$$H^{(t+1)} = f(H^{(t)}, A, X_t)$$
where $H^{(t)}$ is the hidden state at time $t$, and $f$ represents the diffusion convolution operation.
In this setup, the graph effectively models the temporal and spatial relationships between hand joints, allowing the DCRNN to learn meaningful patterns for sign language recognition.
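To make the recurrence above concrete, the following is a minimal sketch of a diffusion-convolution GRU cell, assuming a random-walk-normalized adjacency matrix and omitting the bidirectional diffusion of the original DCRNN. It is an illustration of the technique, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DiffusionConv(nn.Module):
    """K-step diffusion convolution over a fixed adjacency matrix."""
    def __init__(self, in_dim, out_dim, K=2):
        super().__init__()
        self.K = K
        self.weight = nn.Parameter(torch.randn(K + 1, in_dim, out_dim) * 0.01)

    def forward(self, X, A):
        # X: (num_nodes, in_dim), A: (num_nodes, num_nodes) adjacency matrix.
        # Random-walk normalization: P = D^{-1} A.
        P = A / A.sum(dim=1, keepdim=True).clamp(min=1e-6)
        out = X @ self.weight[0]
        Xk = X
        for k in range(1, self.K + 1):
            Xk = P @ Xk                       # propagate features k hops
            out = out + Xk @ self.weight[k]
        return out

class DCGRUCell(nn.Module):
    """GRU cell whose gates use diffusion convolution instead of dense layers."""
    def __init__(self, in_dim, hidden_dim, K=2):
        super().__init__()
        self.gates = DiffusionConv(in_dim + hidden_dim, 2 * hidden_dim, K)
        self.cand = DiffusionConv(in_dim + hidden_dim, hidden_dim, K)

    def forward(self, x_t, h_t, A):
        zr = torch.sigmoid(self.gates(torch.cat([x_t, h_t], dim=-1), A))
        z, r = zr.chunk(2, dim=-1)
        h_tilde = torch.tanh(self.cand(torch.cat([x_t, r * h_t], dim=-1), A))
        return (1 - z) * h_t + z * h_tilde    # H^{(t+1)} = f(H^{(t)}, A, X_t)
```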
The second module incorporates three ProbSparse Attention mechanisms, selectively focusing on significant information within long gesture sequences, which reduces computational complexity and improves recognition accuracy. Each attention module captures long-range temporal dependencies within the hand gesture sequences by computing weighted interactions across all time steps. This complements the DCRNN’s local temporal modeling, allowing the model to focus on both global and local motion patterns effectively. Unlike conventional attention mechanisms, which distribute attention evenly across all sequence positions, the ProbSparse Attention mechanism selectively computes significant query–key scores. This enables the model to focus on critical time steps and specific hand movements within lengthy sequences. Prior to applying attention, the feature vectors for both the left and right hands are processed through a linear layer to align with the output dimensions of the DCRNN for each hand. This process ensures that the feature vectors of both hands are represented in the same dimensional space, with normalized features, making them well-suited for learning through ProbSparse Attention.
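As an illustration of this selective scoring, the sketch below shows a simplified (non-subsampled) version of ProbSparse self-attention: every query is scored, only the top-u queries by the max-minus-mean sparsity measurement receive full softmax attention, and the remaining queries fall back to the mean of the values. The Informer implementation additionally subsamples keys for efficiency, which is omitted here for clarity.

```python
import torch

def probsparse_attention(Q, K, V, top_u):
    """Simplified ProbSparse self-attention (Informer-style), without the
    key-subsampling trick. Q, K, V: (batch, seq_len, d_model)."""
    d = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d ** 0.5            # (B, L, L)
    # Sparsity measurement: max score minus mean score for each query.
    M = scores.max(dim=-1).values - scores.mean(dim=-1)     # (B, L)
    top_idx = M.topk(top_u, dim=-1).indices                 # (B, u)
    # Lazy queries default to the mean of the values.
    out = V.mean(dim=1, keepdim=True).expand_as(V).clone()
    # Active (top-u) queries get full softmax attention.
    b_idx = torch.arange(Q.size(0)).unsqueeze(-1)
    active_scores = scores[b_idx, top_idx]                  # (B, u, L)
    out[b_idx, top_idx] = torch.softmax(active_scores, dim=-1) @ V
    return out
```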
$$F = \mathrm{concat}(f_{\mathrm{left}}, f_{\mathrm{right}}, f_{\mathrm{both}})$$
The three feature vectors generated by the ProbSparse Attention mechanisms—for the left hand, right hand, and both hands—are concatenated to form the resulting vector F, as shown in Equation (4), with a size of 516 × 1 . This vector is then passed through the third module, a fully connected layer, where the softmax function is applied to produce the final output, identifying the predicted class of the sign language gesture.
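A minimal sketch of this final stage is given below, assuming each attention branch outputs a 172-dimensional vector as listed in Table 3, so that the concatenated vector has 516 elements and the fully connected layer maps it to the 8 gesture classes.

```python
import torch
import torch.nn as nn

class SignClassifierHead(nn.Module):
    """Concatenate the three attention outputs (left, right, both hands) and
    classify with a fully connected layer followed by softmax (Equation (4))."""
    def __init__(self, feat_dim=172, num_classes=8):
        super().__init__()
        self.fc = nn.Linear(3 * feat_dim, num_classes)      # 516 -> 8

    def forward(self, f_left, f_right, f_both):
        F = torch.cat([f_left, f_right, f_both], dim=-1)    # (batch, 516)
        return torch.softmax(self.fc(F), dim=-1)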

3.4. System Hardware Setup and User Interaction Flowchart

Figure 5 illustrates the hardware setup of the proposed system. A pseudo-hologram display serves as the output device, providing a 3D visualization of sign language gestures. The LMC is used as the input device to capture the learner’s hand movements. As shown in Figure 5, the pseudo-hologram display is constructed using transparent acrylic, measuring 28.6 cm in width and 20.2 cm in height, with a 24-inch display positioned underneath to project the video content. The vertical distance from the display surface to the tip of the reflective pyramid (i.e., the apex of the acrylic structure) is approximately 15 cm.
Figure 6 presents the user interaction flow for the sign language learning system. Upon starting the system, the main menu offers options to navigate to the learning mode, testing mode, or to exit the application. Selecting the learning mode directs the user to the corresponding screen. The initial learning screen displays a “Start Learning” button along with an option to return to the main menu. Upon selecting the “Start Learning” button, the application begins teaching the registered sign language gestures through animated visualizations. The learning process ends either upon completion or when the stop button is pressed.
Similarly, selecting the testing mode takes the user to the testing screen, which features a “Start Test” button and a return option to the main menu. Upon selecting the “Start Test” button, the application randomly selects a registered sign language gesture for testing. During the test, a word is displayed on the screen, prompting the user to perform the corresponding gesture. The LMC captures the user’s hand movements and collects the angle data, which is transmitted to the server in JSON format. The server processes the data to recognize the performed gesture and sends the result back to the client for display. The client displays the result on the screen. Once the test is completed, the final score is displayed, marking the end of the test session.
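The client-to-server exchange can be sketched as follows: the client serializes one gesture sequence as JSON, sends it over a TCP socket, and receives the predicted label. The host address, port, length-prefixed framing, and JSON field names are illustrative assumptions, not the system's actual protocol.

```python
import json
import socket

HOST, PORT = "127.0.0.1", 9000  # hypothetical server address

def _recv_exact(sock, n):
    """Read exactly n bytes from the socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed before full message received")
        buf += chunk
    return buf

def send_gesture(frames):
    """frames: list of per-frame feature lists captured from the LMC."""
    payload = json.dumps({"frames": frames}).encode("utf-8")
    with socket.create_connection((HOST, PORT)) as sock:
        sock.sendall(len(payload).to_bytes(4, "big") + payload)   # length-prefixed
        size = int.from_bytes(_recv_exact(sock, 4), "big")
        result = json.loads(_recv_exact(sock, size).decode("utf-8"))
    # Field names are assumed for illustration.
    return result["predicted_class"], result.get("confidence")
```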

4. Experiments

This section presents the operational outcomes of the proposed sign language learning system, providing detailed explanations of its functionality. Additionally, it analyzes the sign language recognition performance and discusses the results of a usability evaluation conducted to assess the system’s effectiveness and user experience.

4.1. System Demonstration

Figure 7 illustrates the display device rendering four camera views (left, right, front, and back) to create a pseudo-holographic visualization. This three-dimensional representation provides learners with a more intuitive understanding of sign language gestures.
As shown in Figure 5, the LMC is positioned beneath the pseudo-hologram display, enabling users to navigate menu options through hand gesture recognition. Figure 8 presents the screen displayed after selecting the “Study” menu, followed by the learning session. The interface includes a “Study” menu and a “Help” menu, with the learning process initiated upon selecting the “Study” menu. During the session, sign language animations are displayed alongside their meanings. Each word in the sign language is repeated three times through animation, and users can exit the session at any point using the “Exit” button.
Figure 9 shows the menu screen prior to the test and scenes during the testing process. At the start of the test, a word and a countdown timer are displayed, requiring the user to perform the corresponding sign language gesture within the given time limit. The user’s sign language gestures are shown in real time as a pseudo-hologram, allowing them to immediately verify their performance. Once the time limit expires, the results of the performed gesture are displayed, and the next word is presented. After the test is completed, an overall summary of the results is shown, and the system returns to the menu screen displayed before the test began.

4.2. Evaluation of Sign Recognition Performance

In this study, the model was trained using sequence data collected through the LMC. The cross-entropy loss function was employed to calculate the difference between the predicted and actual labels, and the Adam optimizer [40] was used to update the model weights. The learning rate was set to 0.001. To ensure balanced learning and evaluation, a k-fold cross-validation method was applied. This method partitions the dataset into multiple subsets, using different subsets for training and validation in each iteration to prevent overfitting and ensure robust performance. In this study, k was set to 5.
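A minimal sketch of this training procedure (5-fold cross-validation, cross-entropy loss, and Adam with a learning rate of 0.001) is shown below; `model_fn`, `sequences`, and `labels` are placeholders for the actual recognition model factory and tensor-formatted dataset.

```python
import torch
import torch.nn as nn
from sklearn.model_selection import KFold
from torch.utils.data import DataLoader, TensorDataset

def cross_validate(model_fn, sequences, labels, epochs=10, batch_size=64):
    """sequences, labels: torch tensors; model_fn: callable returning a fresh model."""
    kfold = KFold(n_splits=5, shuffle=True, random_state=0)
    for fold, (train_idx, val_idx) in enumerate(kfold.split(sequences)):
        model = model_fn()
        optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
        criterion = nn.CrossEntropyLoss()
        train_loader = DataLoader(
            TensorDataset(sequences[train_idx], labels[train_idx]),
            batch_size=batch_size, shuffle=True)
        for _ in range(epochs):
            for x, y in train_loader:
                optimizer.zero_grad()
                loss = criterion(model(x), y)
                loss.backward()
                optimizer.step()
        # Validation accuracy for this fold.
        with torch.no_grad():
            preds = model(sequences[val_idx]).argmax(dim=-1)
            acc = (preds == labels[val_idx]).float().mean().item()
        print(f"fold {fold}: val accuracy = {acc:.3f}")
```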
The model was trained for a total of 50 iterations, with each iteration consisting of 10 epochs. To determine the optimal batch size, the model was trained with different batch sizes, and the batch size that achieved the highest accuracy and lowest loss was selected. Figure 10 shows the loss and accuracy values for various batch sizes.
A comparison of the batch sizes revealed that a batch size of 64 led to a more significant reduction in loss as the epochs progressed. Consequently, batch size 64 was chosen as the final configuration. Figure 11 illustrates the trends in loss and accuracy during both training and validation with this batch size.
As shown in Figure 10 and Figure 11, both the training and validation loss gradually decreased and showed signs of stabilization over time. This trend indicates that the model effectively learned spatiotemporal features without overfitting, achieving convergence as training progressed.
The testing process involved collecting 20 samples for each sign language gesture using the LMC. Performance evaluation was carried out using a confusion matrix. Figure 12 presents the confusion matrices from the test results: the left side shows results from the proposed DCRNN model, while the right side shows those from the LSTM model. In both matrices, the horizontal axis represents the predicted labels and the vertical axis represents the actual labels.
To assess the effectiveness of the proposed model, we compared the recognition performance of the DCRNN-based approach with a baseline LSTM model. Both models were trained and evaluated on the same dataset, which consisted of 160 test samples (20 samples for each of 8 gestures). Table 4 summarizes the precision, recall, and F1-score for each model.
The LSTM model correctly predicted 151 out of 160 test samples, achieving a precision, recall, and F1-score of 94.38%. Its relatively lower performance may be attributed to its limited capacity to model spatial relationships between hand joints. In contrast, the DCRNN architecture incorporates a graph-based structure that captures both spatial and temporal dependencies, allowing for more accurate recognition of subtle variations in gesture execution.
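The metrics in Table 4 can be reproduced from the predictions as sketched below, assuming `y_true` and `y_pred` hold the 160 ground-truth and predicted class indices for one model; with a balanced test set, micro-averaged precision, recall, and F1 all equal the overall accuracy.

```python
from sklearn.metrics import confusion_matrix, precision_recall_fscore_support

def evaluate(y_true, y_pred):
    """Return the 8x8 confusion matrix (as in Figure 12) and micro-averaged
    precision, recall, and F1-score (as in Table 4)."""
    cm = confusion_matrix(y_true, y_pred)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="micro", zero_division=0)
    return cm, precision, recall, f1
```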
To verify the system’s real-time performance, we measured its runtime frame rate. As shown in Figure 13, the system consistently maintained approximately 60 frames per second (FPS) during execution, demonstrating its capability to operate in real time without perceptible latency or lag.

4.3. Effectiveness and Usability Study

A usability study was conducted to assess the learning effectiveness and user experience of the proposed system. The study involved 20 participants with advanced IT knowledge and experience, but no prior experience with sign language. The participants had an average age of 24.3 years, ranging from 21 to 26 years, with a standard deviation of 1.3 years. Participants used the system to learn sign language, completed a test, and provided feedback through a survey on learning effectiveness and user experience. The study was not randomized but conducted in a controlled indoor environment to minimize external variability and ensure consistency across all participants. This controlled setup allowed for stable tracking conditions and a uniform learning experience.
To ensure consistent gesture recognition and minimize the impact of Leap Motion’s known limitations—such as occlusion and limited sensing range—all experiments were conducted in a controlled indoor environment with stable lighting conditions. Participants were instructed to stay within the optimal tracking zone of the sensor. As a result, the tracking performance remained stable throughout the evaluation, with no notable disruptions affecting recognition accuracy or user experience.
The survey questions are as follows.
Learning Effectiveness
  • Q1. Did you feel that using this system was effective for learning sign language?
  • Q2. Did the 3D visualization help you understand the correct position and angle of the sign language gestures?
  • Q3. Do you think this system is more efficient compared to learning methods using 2D videos or images?
  • Q4. Do you feel that you can reproduce the sign language gestures you learned through this system in real life?
  • Q5. Has your ability to remember and reproduce sign language gestures improved after using this system?
User Experience
  • Q6. How intuitive did you find the 3D visualization for observing and understanding sign language?
  • Q7. How convenient was it to learn or test gestures using the LMC?
  • Q8. Did the animations provide sufficient information for understanding the continuity of the sign language gestures?
  • Q9. Did you find the system’s interface design user-friendly?
  • Q10. To what extent did the 3D visualization technology contribute to enhancing your motivation during learning?
Participants rated each question on a 5-point Likert scale, and the summarized results are presented in Table 5 and Table 6. For the comparative survey on learning methods (Q3), participants were trained using three different methods (our system, 2D videos, and images) and then completed a survey to compare them. The survey results on learning effectiveness indicated that participants perceived the system as effective for learning sign language gestures. Participants particularly highlighted the usefulness of 3D visualization in accurately understanding the positions and angles of gestures (Q2: Mean 4.05 ± 0.74). Participants expressed confidence in reproducing the learned gestures in real-life scenarios (Q4: Mean 4.10 ± 0.62) and noted an improvement in their ability to remember and replicate the gestures after using the system (Q5: Mean 3.85 ± 0.57). However, compared to traditional learning methods using 2D videos or images, the system’s perceived efficiency received relatively lower scores (Q3: Mean 3.80 ± 0.68), indicating the need for further improvements to highlight its unique advantages.
In terms of user experience, participants found the 3D visualization intuitive for observing and understanding sign language gestures (Q6: Mean 4.05 ± 0.7). They also found that the animations provided sufficient information to comprehend the continuity of gestures (Q8: Mean 4.05 ± 0.80). Notably, the 3D visualization technology was recognized as a key factor in boosting user motivation during learning sessions (Q10: Mean 4.80 ± 0.40). However, the convenience of using the LMC for learning and testing gestures received lower ratings (Q7: Mean 2.80 ± 0.67), and the system’s interface design was noted to have room for improvement in user-friendliness (Q9: Mean 3.90 ± 0.53).
These findings demonstrate that the proposed system is effective for sign language learning and that a 3D visualization-based learning environment has a positive impact on user motivation and learning outcomes. Nevertheless, improvements in the usability of the LMC and the interface design are required to enhance the overall user experience. Furthermore, additional studies are needed to better highlight the system’s distinct advantages over 2D-based learning methods. These results confirm the strengths of the proposed system in terms of learning effectiveness and user experience, while suggesting directions for future improvements. Three-dimensional visualization in sign language education was found to have a significant impact on both learning effectiveness and user experience, as evidenced by the results from questions Q2, Q3, Q6, and Q10 (p < 0.001).
Additionally, to examine whether learning effectiveness (Q1–Q5) and user experience (Q6–Q10) are related, a Pearson’s correlation analysis was conducted. The results showed a weak positive correlation (r = 0.235, p = 0.319), indicating no statistically significant relationship. This suggests that higher learning effectiveness does not necessarily lead to a better user experience, and vice versa. Various factors, such as individual preferences, prior experience, and familiarity with 3D visualization, may have influenced these results. To improve the system, future research should enhance Leap Motion’s hand-tracking accuracy, refine UI/UX design, and introduce user-friendly feedback mechanisms. Expanding the study with a larger, more diverse sample and applying multiple regression analysis could further clarify key factors affecting learning effectiveness and user experience.
To examine the perceived impact of 3D visualization on learning, we analyzed four survey items (Q2, Q3, Q6, Q10) that directly assessed its clarity, intuitiveness, motivational value, and comparative efficiency over 2D-based methods. The average score across these items was 4.18 (SD = 0.35), significantly higher than the neutral midpoint (3.0). A one-sample t-test confirmed the statistical significance of this difference ( t ( 19 ) = 14.82 , p < 0.001 ). These findings suggest that participants regarded the 3D visualization as a critical factor in enhancing their comprehension, engagement, and overall learning experience. This result quantitatively supports the qualitative feedback observed in other parts of the survey, indicating that 3D visualization is not just a supplementary element, but a core contributor to the system’s educational effectiveness.
To evaluate the internal consistency of the questionnaire, we calculated Cronbach’s alpha for each subscale. The Learning Effectiveness items (Q1–Q5) yielded a value of α = 0.87 , and the User Experience items (Q6–Q10) showed α = 0.85 . The overall reliability across all 10 items was α = 0.89 , indicating high internal consistency for both individual subscales and the entire questionnaire.
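The survey statistics reported above (the Pearson correlation between subscales, the one-sample t-test on the 3D-visualization items, and Cronbach's alpha) can be computed as sketched below, assuming the raw responses are available as a 20 × 10 array with columns Q1–Q10 in order.

```python
import numpy as np
from scipy import stats

def cronbach_alpha(items):
    """items: (n_participants, n_items) array of Likert scores."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

def analyze(responses):
    """responses: (20, 10) array with columns Q1..Q10."""
    effectiveness = responses[:, :5].mean(axis=1)    # Q1-Q5 per participant
    experience = responses[:, 5:].mean(axis=1)       # Q6-Q10 per participant
    r, p_corr = stats.pearsonr(effectiveness, experience)
    # One-sample t-test of the 3D-visualization items (Q2, Q3, Q6, Q10)
    # against the neutral midpoint of 3.0.
    viz = responses[:, [1, 2, 5, 9]].mean(axis=1)
    t, p_t = stats.ttest_1samp(viz, 3.0)
    return {
        "pearson_r": r, "pearson_p": p_corr,
        "viz_t": t, "viz_p": p_t,
        "alpha_effectiveness": cronbach_alpha(responses[:, :5]),
        "alpha_experience": cronbach_alpha(responses[:, 5:]),
        "alpha_all": cronbach_alpha(responses),
    }
```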
Finally, since the participants’ age range was narrow (21–26 years), no significant differences in responses were observed across different ages. Future studies should include a more diverse age group to better examine the potential impact of age on learning effectiveness and user experience.

5. Discussion

The usability study results confirm that the proposed 3D Korean Sign Language (KSL) learning system effectively supports gesture acquisition, with participants reporting notable benefits of 3D visualization in terms of clarity, motivation, and accurate gesture reproduction. These findings are consistent with prior research indicating that immersive visualization can enhance spatial understanding and retention in sign language education. Compared to traditional 2D-based methods, participants regarded the 3D visualization not as an auxiliary feature, but as a central component of the learning process. This suggests that the spatial depth and perspective provided by pseudo-holographic rendering play a critical role in conveying subtle gesture details that might otherwise be overlooked.
Despite these strengths, the lower ratings for Leap Motion convenience (Q7) and interface design (Q9) indicate the need for further refinement. Limitations in hand-tracking robustness, such as occasional occlusions and restricted sensing ranges, may have influenced these evaluations. Similarly, some participants noted that the interface could be made more intuitive and responsive to better support learning flow. Interestingly, correlation analysis revealed no statistically significant relationship between learning effectiveness and user experience, implying that improvements in one domain do not automatically translate to the other. This highlights the necessity of addressing educational content quality and system usability as separate, equally important design goals.
From a pedagogical perspective, the integration of real-time gesture recognition with pseudo-hologram-based visualization demonstrates the potential of combining emerging interaction technologies with language learning frameworks. The controlled experimental setup ensured stable tracking and consistent learning conditions, but also limited the ecological validity of the study. Future evaluations in more varied real-world environments, including classrooms and remote learning contexts, could provide deeper insights into the system’s adaptability.
Overall, the results underscore the promise of the proposed system as an innovative tool for KSL education, while also pointing toward clear directions for future enhancement. These include improving recognition accuracy, refining the interface for better usability, and expanding the dataset to incorporate a wider range of KSL vocabulary, including regional and dialectal variations. Additionally, supporting fingerspelling for letters and numbers could make the system more comprehensive for practical communication scenarios.

6. Conclusions and Future Works

We proposed a 3D Korean Sign Language learning system using pseudo-hologram technology to provide an immersive and intuitive learning environment. The system integrates real-time gesture recognition with 3D visualization, enabling learners to better understand spatial gesture details and maintain engagement during practice.
A controlled user study with 20 participants confirmed the educational benefits of the approach, particularly in clarity, motivation, and gesture reproduction. While overall feedback was positive, the evaluation also revealed usability limitations in tracking performance and interface design, which will be addressed in future iterations.
Future work will aim to improve recognition robustness, explore lightweight models for real-time deployment, and expand the dataset to include diverse vocabulary, fingerspelling, and regional or dialectal variations of Korean Sign Language. Additionally, we plan to investigate transform-based preprocessing methods—such as polynomial-based orthogonal matrix generation [41] and Tchebichef transform coding [42]—to enhance feature extraction efficiency and reduce computational complexity, further supporting real-world deployment.

Author Contributions

Software, N.K. and H.C.; Writing—review & editing, S.L.; Project administration, C.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP)–Innovative Human Resource Development for Local Intellectualization program grant funded by the Korea government (MSIT) (IITP-2025-RS-2023-00260267).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Shukla, A.; Harper, M.; Pedersen, E.; Goman, A.; Suen, J.J.; Price, C.; Applebaum, J.; Hoyer, M.; Lin, F.R.; Reed, N.S. Hearing loss, loneliness, and social isolation: A systematic review. Otolaryngol. Head Neck Surg. 2020, 162, 622–633.
  2. Cheok, M.J.; Omar, Z.; Jaward, M.H. A review of hand gesture and sign language recognition techniques. Int. J. Mach. Learn. Cybern. 2019, 10, 131–153.
  3. Senghas, R.J.; Monaghan, L. Signs of their times: Deaf communities and the culture of language. Annu. Rev. Anthropol. 2002, 31, 69–97.
  4. Washabaugh, W. Sign language in its social context. Annu. Rev. Anthropol. 1981, 10, 237–252.
  5. Deuchar, M. British Sign Language; Routledge: Oxfordshire, UK, 2013.
  6. Chuan, C.H.; Regina, E.; Guardino, C. American sign language recognition using leap motion sensor. In Proceedings of the 2014 13th International Conference on Machine Learning and Applications, Detroit, MI, USA, 3–5 December 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 541–544.
  7. Mohandes, M.; Aliyu, S.; Deriche, M. Arabic sign language recognition using the leap motion controller. In Proceedings of the 2014 IEEE 23rd International Symposium on Industrial Electronics (ISIE), Istanbul, Turkey, 1–4 June 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 960–965.
  8. Xue, Y.; Gao, S.; Sun, H.; Qin, W. A Chinese sign language recognition system using leap motion. In Proceedings of the 2017 International Conference on Virtual Reality and Visualization (ICVRV), Zhengzhou, China, 21–22 October 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 180–185.
  9. Demircioğlu, B.; Bülbül, G.; Köse, H. Turkish sign language recognition with leap motion. In Proceedings of the 2016 24th Signal Processing and Communication Application Conference (SIU), Zonguldak, Turkey, 16–19 May 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 589–592.
  10. Bird, J.J.; Ekárt, A.; Faria, D.R. British sign language recognition via late fusion of computer vision and leap motion with transfer learning to american sign language. Sensors 2020, 20, 5151.
  11. Avola, D.; Bernardi, M.; Cinque, L.; Foresti, G.L.; Massaroni, C. Exploiting recurrent neural networks and leap motion controller for the recognition of sign language and semaphoric hand gestures. IEEE Trans. Multimed. 2018, 21, 234–245.
  12. Samaan, G.H.; Wadie, A.R.; Attia, A.K.; Asaad, A.M.; Kamel, A.E.; Slim, S.O.; Abdallah, M.S.; Cho, Y.I. Mediapipe’s landmarks with rnn for dynamic sign language recognition. Electronics 2022, 11, 3228.
  13. Sherstinsky, A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306.
  14. Rezk, N.M.; Purnaprajna, M.; Nordström, T.; Ul-Abdin, Z. Recurrent neural networks: An embedded computing perspective. IEEE Access 2020, 8, 57967–57996.
  15. Lee, C.K.; Ng, K.K.; Chen, C.H.; Lau, H.C.; Chung, S.Y.; Tsoi, T. American sign language recognition and training method with recurrent neural network. Expert Syst. Appl. 2021, 167, 114403.
  16. Uddin, M.Z.; Boletsis, C.; Rudshavn, P. Real-time Norwegian sign language recognition using MediaPipe and LSTM. Multimodal Technol. Interact. 2025, 9, 23.
  17. Wang, J.; Ivrissimtzis, I.; Li, Z.; Shi, L. The Impact of 2D and 3D Gamified VR on Learning American Sign Language. arXiv 2024, arXiv:2405.08908.
  18. Novaliendry, D.; Budayawan, K.; Auvi, R.; Fajri, B.R.; Huda, Y. Design of Sign Language Learning Media Based on Virtual Reality. Int. J. Online Biomed. Eng. 2023, 19, 111–126.
  19. Li, Y.; Yu, R.; Shahabi, C.; Liu, Y. Diffusion convolutional recurrent neural network: Data-driven traffic forecasting. arXiv 2017, arXiv:1707.01926.
  20. Wang, X.; Sun, S.; Xie, L.; Ma, L. Efficient conformer with prob-sparse attention mechanism for end-to-end speech recognition. arXiv 2021, arXiv:2106.09236.
  21. Rheiner, J.; Kerger, D.; Drüppel, M. From pixels to letters: A high-accuracy CPU-real-time American Sign Language detection pipeline. Mach. Learn. Appl. 2025, 20, 100650.
  22. Zhang, Y.; Jiang, X. Recent Advances on Deep Learning for Sign Language Recognition. Comput. Model. Eng. Sci. (CMES) 2024, 139, 2399.
  23. Karabüklü, S.; Wood, S.; Bradley, C.; Wilbur, R.B.; Malaia, E.A. Effect of sign language learning on temporal resolution of visual attention. J. Vis. 2025, 25, 3.
  24. Najib, F.M. Sign language interpretation using machine learning and artificial intelligence. Neural Comput. Appl. 2025, 37, 841–857.
  25. Walizad, M.E.; Hurroo, M. Sign language recognition system using convolutional neural network and computer vision. Int. J. Eng. Res. Technol. 2020, 9, 59–64.
  26. Mistry, J.; Inden, B. An approach to sign language translation using the intel realsense camera. In Proceedings of the 2018 10th Computer Science and Electronic Engineering (CEEC), Colchester, UK, 19–21 September 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 219–224.
  27. Chong, T.W.; Lee, B.G. American sign language recognition using leap motion controller with machine learning approach. Sensors 2018, 18, 3554.
  28. Jiang, S.; Sun, B.; Wang, L.; Bai, Y.; Li, K.; Fu, Y. Skeleton aware multi-modal sign language recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 3413–3423.
  29. Rastgoo, R.; Kiani, K.; Escalera, S. Sign language recognition: A deep survey. Expert Syst. Appl. 2021, 164, 113794.
  30. Papastratis, I.; Chatzikonstantinou, C.; Konstantinidis, D.; Dimitropoulos, K.; Daras, P. Artificial intelligence technologies for sign language. Sensors 2021, 21, 5843.
  31. Dao, T. Flashattention-2: Faster attention with better parallelism and work partitioning. arXiv 2023, arXiv:2307.08691.
  32. Shah, J.; Bikshandi, G.; Zhang, Y.; Thakkar, V.; Ramani, P.; Dao, T. Flashattention-3: Fast and accurate attention with asynchrony and low-precision. Adv. Neural Inf. Process. Syst. 2024, 37, 68658–68685.
  33. Poli, M.; Massaroli, S.; Nguyen, E.; Fu, D.Y.; Dao, T.; Baccus, S.; Bengio, Y.; Ermon, S.; Ré, C. Hyena hierarchy: Towards larger convolutional language models. In Proceedings of the International Conference on Machine Learning, PMLR, Honolulu, HI, USA, 23–29 July 2023; pp. 28043–28078.
  34. Setiawan, V.; Thendyliputra, R.; Santami, S.A.; Hansen, D.; Warnars, H.L.H.S.; Ramdhan, A.; Doucet, A. An Interactive Sign Language Based Mobile Application for Deaf People. In Proceedings of the 2023 7th International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, India, 11–13 April 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1635–1641.
  35. Voultsiou, E.; Moussiades, L. A systematic review of AI, VR, and LLM applications in special education: Opportunities, challenges, and future directions. Educ. Inf. Technol. 2025, 1–41.
  36. Schioppo, J.; Meyer, Z.; Fabiano, D.; Canavan, S. Sign language recognition: Learning american sign language in a virtual environment. In Proceedings of the Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, UK, 4–9 May 2019; pp. 1–6.
  37. Jin, G.; Liang, Y.; Fang, Y.; Shao, Z.; Huang, J.; Zhang, J.; Zheng, Y. Spatio-temporal graph neural networks for predictive learning in urban computing: A survey. IEEE Trans. Knowl. Data Eng. 2023, 36, 5388–5408.
  38. Jin, M.; Koh, H.Y.; Wen, Q.; Zambon, D.; Alippi, C.; Webb, G.I.; King, I.; Pan, S. A survey on graph neural networks for time series: Forecasting, classification, imputation, and anomaly detection. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 10466–10485.
  39. Zhou, H.; Zhang, S.; Peng, J.; Zhang, S.; Li, J.; Xiong, H.; Zhang, W. Informer: Beyond efficient transformer for long sequence time-series forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 11–15 October 2021; Volume 35, pp. 11106–11115.
  40. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
  41. Chan, K.H.; Ke, W.; Im, S.K. A general method for generating discrete orthogonal matrices. IEEE Access 2021, 9, 120380–120391.
  42. Im, S.K.; Pearmain, A. Unequal error protection with the H.264 flexible macroblock ordering. In Proceedings of the Visual Communications and Image Processing 2005, Beijing, China, 12–15 July 2005; SPIE: Bellingham, WA, USA, 2005; Volume 5960, pp. 1033–1040.
Figure 1. System configuration.
Figure 2. Standard hand structure from LMC.
Figure 3. Abstract representation of the graph structure used as input for DCRNN.
Figure 4. Model configuration for sign language recognition.
Figure 5. Hardware setup.
Figure 6. User interaction flow for the sign language learning system.
Figure 7. Pseudo-hologram visualization using four-directional camera views (left, right, front, and back).
Figure 8. Interactive menu selection and learning process (‘안녕하세요’ in the right image means ‘Hello’).
Figure 9. Test menu, and real-time pseudo-hologram feedback during sign language gesture testing (‘만나다’ in the right image means ‘to meet’).
Figure 10. Comparison of training loss and accuracy across different batch sizes.
Figure 11. Loss and accuracy of the model trained with a batch size of 64.
Figure 12. Confusion matrices of the sign language recognition results. (Left) DCRNN. (Right) LSTM.
Figure 13. Measured frame rate over 60 s, with an average of 60.01 FPS, demonstrating stable real-time performance.
Table 1. Comparison of studies on sign language.
Study (Ref.) | Focus | Input Device | Recognition Methods | Advantages
Mistry [26] | Sign Language Recognition | Intel RealSense | Support Vector Machine (SVM), Multi-Layer Perceptron (MLP) | Achieving high accuracy in recognizing 26 gestures (alphabet letters)
Chong [27] | Sign Language Recognition | Leap Motion Controller (LMC) | Support Vector Machine (SVM), Deep Neural Network (DNN) | Capable of recognizing both static and dynamic sign language
Avola [11] | Sign Language Recognition | Leap Motion Controller (LMC) | Recurrent Neural Network (RNN) | Collecting finger angles with LMC and achieving high accuracy
Lee [15] | Sign Language Learning | Leap Motion Controller (LMC) | Recurrent Neural Network (RNN) | Comparing LMC with other devices highlights its advantages for sign language learning
Schioppo [36] | VR-based Sign Language Learning | Leap Motion Controller (LMC) | Random Forest (RF) | Developing a VR-based sign language learning application using LMC and a VR headset to enhance immersion and learning
Our Study | Hologram-based Sign Language Learning | Leap Motion Controller (LMC) | Diffusion Convolutional Recurrent Neural Network (DCRNN), ProbSparse Attention | Development of a 3D pseudo-hologram sign language learning system and usability evaluation
Table 2. Number of frames per sign language class.
Sign Language Class | Number of Frames
Thank you | 59,523
Meet | 59,522
Love | 59,441
No | 59,466
It hurts | 59,436
Hello | 59,559
Congratulations | 59,560
It’s cold | 59,573
Table 3. Model details.
Module | Model Setting
DCRNN | Input size (feature matrix): 43 × 4; Input size (adjacency matrix): 43 × 43; Layers: 2 diffusion GRU layers; Output size: 172 × 1
ProbSparse Attention | Dimension of key vectors: 172 × 1; Number of attention heads: 4; Number of top keys: 20
Output | Architecture: fully connected layers; Activation function: softmax; Input size: 516 × 1; Output size: 8 × 1 (number of classes)
Table 4. Comparison of recognition performance between DCRNN and LSTM.
Model | Precision | Recall | F1-Score
LSTM | 0.9438 | 0.9438 | 0.9438
DCRNN + Attention | 1.0000 | 1.0000 | 1.0000
Table 5. Response scores for learning effectiveness.
Question | Mean | Std | p-Value
Q1 | 4.25 | 0.62 | 4.30 × 10⁻⁸
Q2 | 4.05 | 0.74 | 6.07 × 10⁻⁶
Q3 | 3.80 | 0.68 | 5.80 × 10⁻⁵
Q4 | 4.10 | 0.62 | 3.07 × 10⁻⁷
Q5 | 3.85 | 0.57 | 3.33 × 10⁻⁶
Table 6. Response scores for user experience.
Question | Mean | Std | p-Value
Q6 | 4.05 | 0.73 | 6.07 × 10⁻⁶
Q7 | 2.80 | 0.67 | 2.14 × 10⁻¹
Q8 | 4.05 | 0.80 | 1.75 × 10⁻⁵
Q9 | 3.90 | 0.53 | 6.54 × 10⁻⁷
Q10 | 4.80 | 0.40 | 4.53 × 10⁻¹⁴