1. Introduction
A Human–Machine Interface (HMI) provides a medium through which users communicate with computing systems, supporting the transmission of information and execution of commands. As computing systems become increasingly embedded in everyday environments, HMIs play a pivotal role in interpreting human intent and transforming it into precise machine control, thus determining the overall effectiveness of interactive technologies. Recent research has further expanded the concept of HMI to emphasize human–machine collaboration and mutual adaptation, highlighting the cognitive and behavioral mechanisms underlying user interaction with assistive systems [1]. This conceptual expansion has been reflected in the technological evolution of HMIs across different application domains.
Early HMI developments were primarily centered on motor control applications, in which bioelectrical or mechanical signals were converted into control commands for assistive devices such as robotic arms, prosthetic hands, and powered wheelchairs. This work showed that coordinated muscle synergies could be utilized to achieve stable control of multi-DOF systems [2]. These systems demonstrated the feasibility of translating physiological activity into reliable control outputs, marking an important milestone in the evolution of assistive HMI design [3,4,5,6]. Driven by these advances, digital and communication-oriented applications emerged, extending conventional control paradigms into more interactive and accessible environments. Virtual cursor control, on-screen joystick interfaces, and speller systems were developed to enable text input and menu selection using simplified input mechanisms, thereby providing alternative communication pathways for individuals with severe motor impairments. These developments marked a clear transition toward digital communication interfaces, where information exchange replaced direct physical control as the primary mode of interaction [7,8,9].
Among various input systems, the keyboard and mouse remain the most common means of user interaction. These devices achieve high efficiency by reliably translating hand movements into digital commands with both speed and accuracy. For users with unrestricted hand movement, they provide substantial convenience, as their operation is intuitive and requires no specialized training. However, because keyboards and mice were designed for users with free hand movement, they can be difficult to use for those with limited hand movement [10]. To overcome these limitations, research continues on methods that allow users to input information or express intent without using hand movements. Consequently, interface technologies utilizing bio-signals are gaining attention as a new alternative.
Bio-signal-based interface technology interprets a user’s intent by receiving biological signals as input and controls devices or systems based on this interpretation [11]. Researchers have diversified their approaches toward bio-signal-based interaction, incorporating physiological and cognitive signals as intuitive communication channels. In this context, modalities such as electroencephalography (EEG), electrooculography (EOG), and electrocardiography (ECG) have emerged as complementary input channels [12,13,14].
The electrooculogram (EOG) is a signal generated by eye movements, representing corneal-retinal potential changes at levels of approximately 50 to 3500 μV [15]. The electroencephalogram (EEG) is an electrical signal generated by the cerebral cortex, typically ranging from approximately 0.5 to 100 μV [16]. The electromyogram (EMG) is an electrical signal generated during muscle contraction and relaxation, with a signal amplitude ranging from 50 μV up to 20 mV [17].
Among these, EMG has the advantage of a larger signal amplitude than other bio-signals and can be easily acquired from various body parts. Furthermore, EMG reflects muscle contraction intensity and patterns, enabling more accurate communication of user intent with simple movements. These characteristics make EMG a promising alternative to conventional input devices such as keyboards and mice, especially for individuals with limited hand mobility [18,19]. In particular, users with congenital limb deficiencies, amputations due to accidents, or impaired motor function due to disease can input characters using only simple muscle contractions. Because such users often lack the fine motor control required for conventional input devices, there is a growing need for systems that enable reliable communication through minimal muscle activity. To reduce the burden of complex hand movements, such systems often rely on simplified directional commands: by converting muscle contraction signals into digital inputs for four directions (up, down, left, right), characters can be selected without complex hand movements. Consequently, an EMG-based speller system with an up, down, left, right input structure can serve as a practical and highly accessible alternative for users with limited hand or arm movement.
To further enhance the practicality of such electromyography-based speller systems, it is essential to consider not only improvements in input and operation methods but also the selection of an efficient keyboard layout suited to the user’s conditions. Over the past decade, various EMG-based speller systems have been developed to explore muscle activity as an alternative communication channel. Early studies primarily focused on simple command recognition or single-letter selection tasks, demonstrating the feasibility of translating muscle contractions into discrete control inputs [20]. Subsequent research expanded these designs by integrating EMG with EEG or SSVEP signals to increase classification accuracy and target diversity [21]. However, most of these systems were optimized for English or alphanumeric input and relied on multi-channel setups or complex classifiers, limiting their accessibility and real-world usability.
Notably, few studies have examined language-specific optimization, and efficient Korean text input for individuals unable to use their upper limbs remains largely unexplored despite the language’s unique syllabic composition structure. Moreover, previous EMG-based speller studies have primarily utilized upper-limb or facial muscle signals, while no research has yet explored the feasibility of using lower-limb EMG, such as signals from the thigh or calf, for Korean text input. Prior work has therefore neither fully investigated the potential of muscle groups beyond the upper extremities nor optimized speller architectures for languages with compositional structures like Hangul. To address these limitations, the present study investigates a low-channel, direction-based EMG speller employing the Cheonjiin (pronounced Chun-jee-in) keyboard layout, which enables efficient Hangul composition through minimal muscle activity in the lower limbs.
In Korea, text input is commonly performed using two main keyboard layouts, the QWERTY keyboard and the Cheonjiin keyboard. The widely used QWERTY keyboard layout has the advantage of enabling fast input based on the premise of free hand movement, as all consonants and vowels are assigned to individual keys. Although both layouts are optimized for users capable of precise finger control, individuals with limited motor function require an alternative input method that allows keyboard operation through a small number of simple EMG signals rather than fine hand movements. When the operation is restricted to limited directional inputs like up, down, left, and right, the QWERTY layout’s large number of keys and complex structure can lead to frequent key transitions during input, potentially making it inefficient. Therefore, in such limited input environments, it is necessary to identify a keyboard layout that minimizes unnecessary key movements and enables efficient character input through simple directional commands.
The Cheonjiin keyboard layout provides a promising solution for Hangul text entry under restricted input conditions. In Hangul, the Korean script systematically designed under King Sejong to reflect the phonetic structure of the language, each syllable is composed by combining basic consonant and vowel elements. The Cheonjiin keyboard, designed to efficiently implement this combinational principle, allows the input of all Korean consonants and vowels using only twelve keys, ten of which are dedicated to consonant and vowel composition, thereby reducing key movement and improving spatial efficiency. Unlike the QWERTY layout, where consonants and vowels are placed on separate keys that require frequent finger movements, the Cheonjiin layout enables users to compose both elements within a compact area through sequential directional inputs. For instance, when typing a syllable that includes the farthest consonant and vowel positions on a QWERTY keyboard, users may need to move across as many as eleven key intervals, whereas in the Cheonjiin layout, the same combination can be completed within only four directional inputs. This substantial reduction in key transitions shortens both physical and temporal distances during text entry, allowing users to generate characters through simple and logical directional sequences and thereby improving overall input efficiency.
This structural characteristic makes it suitable for electromyography-based speller systems, which select characters solely through up, down, left, and right inputs via muscle contraction. It enables high input efficiency even with limited input means. Meanwhile, realizing this input efficiency requires not only keyboard layout design but also signal processing and classification methods capable of effectively distinguishing electromyography signals.
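The eleven-versus-four comparison above can be illustrated with a small grid model. The following is a minimal sketch only: the (row, column) key positions are assumptions made for illustration (a 4x3 Cheonjiin pad with vowels on the top row, and the Korean two-set arrangement on QWERTY key positions with stagger ignored), and counting moves as Manhattan distance is likewise an assumption rather than the counting method used in this paper.

```python
from itertools import product

# Assumed (row, col) positions; real keyboards are staggered, which is ignored here.
# Assumed Cheonjiin 4x3 pad: three vowel keys on the top row, seven consonant keys below.
CHEONJIIN_VOWELS = [(0, 0), (0, 1), (0, 2)]
CHEONJIIN_CONSONANTS = [(1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2), (3, 1)]

# Assumed Korean two-set layout on QWERTY positions: consonants left, vowels right.
QWERTY_CONSONANTS = (
    [(0, c) for c in range(5)] + [(1, c) for c in range(5)] + [(2, c) for c in range(4)]
)
QWERTY_VOWELS = (
    [(0, c) for c in range(5, 10)] + [(1, c) for c in range(5, 9)] + [(2, c) for c in range(4, 7)]
)


def moves(a, b):
    """Number of up/down/left/right steps between two keys (Manhattan distance)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def worst_case(consonants, vowels):
    """Largest number of directional steps between any consonant key and any vowel key."""
    return max(moves(c, v) for c, v in product(consonants, vowels))


print("QWERTY worst case:", worst_case(QWERTY_CONSONANTS, QWERTY_VOWELS))
print("Cheonjiin worst case:", worst_case(CHEONJIIN_CONSONANTS, CHEONJIIN_VOWELS))
```

Under these assumed positions the worst cases come out as 11 and 4 steps, matching the figures quoted in the text, although the actual layouts and counting rules may differ in detail.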
Recent research on user interfaces utilizing electromyography (EMG) signals has primarily focused on deep learning-based classification models and on complex preprocessing steps involving various feature combinations and normalization techniques. Examples include studies that classify hand gestures using convolutional neural network (CNN)-based models [22] and studies that employ long short-term memory (LSTM)-based neural networks to classify EMG signals into multiple grip gestures across different force levels in amputee subjects [23]. While these approaches offer high classification accuracy, they consume significant computational resources [24]. In contrast, our study proposes a user interface capable of real-time input classification even in low-specification environments. This is achieved by using only simple time-domain features, such as Root Mean Square (RMS) and Slope Sign Changes (SSC), without complex neural network structures or high-dimensional feature combinations.
Specifically, RMS, SSC, and peak amplitude were selected because they are physiologically interpretable and computationally efficient, and because they have proven robust in representing lower-limb muscle activation patterns.
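For readers unfamiliar with these features, the sketch below shows one common way to compute RMS, SSC, and peak amplitude over a single window of samples. It uses only the Python standard library; the function names, the window representation (a list of floats), and the SSC noise threshold are assumptions for illustration and are not taken from the authors' implementation.

```python
import math


def rms(window):
    """Root mean square of the samples in one analysis window."""
    return math.sqrt(sum(x * x for x in window) / len(window))


def slope_sign_changes(window, threshold=0.0):
    """Count samples where the slope changes sign (a local peak or valley).

    A change is counted only if at least one of the two neighbouring
    differences exceeds `threshold` (assumed noise guard).
    """
    count = 0
    for prev, cur, nxt in zip(window, window[1:], window[2:]):
        d1 = cur - prev
        d2 = cur - nxt
        if d1 * d2 > 0 and (abs(d1) > threshold or abs(d2) > threshold):
            count += 1
    return count


def peak_amplitude(window):
    """Largest absolute sample value in the window."""
    return max(abs(x) for x in window)


def extract_features(window):
    """Bundle the three time-domain features for one channel window."""
    return {
        "rms": rms(window),
        "ssc": slope_sign_changes(window),
        "peak": peak_amplitude(window),
    }


example = [0.0, 1.0, 0.0, -1.0, 0.0]  # toy window, not real EMG
print(extract_features(example))
```

Each feature needs a single pass over the window, which is consistent with the low computational cost the text emphasizes.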
Our study introduces an EMG-based Cheonjiin speller system developed for individuals with restricted hand mobility. The interface adopts a four-directional control scheme (up, down, left, and right) using surface electromyography (sEMG) signals recorded from the rectus femoris muscle of the thigh and the gastrocnemius muscle of the calf. Electrode placement and corresponding movement directions were determined according to anatomical considerations, and input commands were classified in real time by extracting time-domain features such as root mean square (RMS) and slope sign changes (SSC). Because the Cheonjiin keyboard layout supports directional control with a minimal number of EMG signals, it allows Korean text entry with fewer keystrokes than conventional methods. Compared to existing QWERTY-based input methods, the proposed system offers higher accessibility and practicality for users with limited physical input capacity, thereby extending the applicability of EMG-based speller systems. Furthermore, through the adoption of a low-resource, real-time processing pipeline, the system minimizes computational load while maintaining high classification reliability. This simplicity makes the interface easily adaptable to portable or embedded assistive platforms.
This paper describes the design and implementation process of the proposed system and quantitatively evaluates its input recognition performance and actual Hangul input functionality through two experiments. The first experiment focuses on input action recognition accuracy, while the second centers on actual character input functionality. Results were analyzed based on metrics such as confusion matrix, accuracy, precision, recall, and F1-score. The proposed speller system can serve as an alternative input method for users who find conventional input devices difficult to use due to physical limitations. It also holds potential for future expansion into various assistive technology application environments, such as rehabilitation assistive devices and smart interfaces.
3. Results
3.1. Feature Analysis
As illustrated in Figure 8, the temporal patterns of the three extracted features (RMS, SSC, and maximum amplitude) clearly represent muscle activation and relaxation phases across channels during continuous directional input sequences. RMS exhibits distinct amplitude peaks corresponding to active contractions, SSC captures fine-grained variations in firing activity, and the maximum amplitude highlights transient bursts of high-intensity contraction. These complementary patterns enable robust detection of active channels through simultaneous thresholding of all three features. When visualized separately for Experiment 1 and Experiment 2, the distributions of these features showed consistent separability among the target commands (e.g., UP, DOWN, SELECT), confirming stable real-time classification during sequential text-entry tasks.
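The simultaneous thresholding described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the channel names, threshold values, and feature dictionaries are hypothetical, and the mapping from active channels to specific commands is deliberately left out because it is not described in this section.

```python
def is_active(features, thresholds):
    """A channel counts as active only if every feature exceeds its threshold."""
    return all(features[name] > thresholds[name] for name in thresholds)


def active_channels(features_by_channel, thresholds_by_channel):
    """Return the names of channels whose RMS, SSC, and peak all pass their thresholds."""
    return [
        channel
        for channel, features in features_by_channel.items()
        if is_active(features, thresholds_by_channel[channel])
    ]


# Hypothetical per-channel thresholds and one window of hypothetical feature values.
thresholds = {
    "rectus_femoris": {"rms": 0.1, "ssc": 20, "peak": 0.5},
    "gastrocnemius": {"rms": 0.1, "ssc": 20, "peak": 0.5},
}
window_features = {
    "rectus_femoris": {"rms": 0.4, "ssc": 30, "peak": 1.2},
    "gastrocnemius": {"rms": 0.05, "ssc": 28, "peak": 0.2},
}

print(active_channels(window_features, thresholds))
```

Requiring all three features to pass makes the detector more conservative than any single-feature threshold, since a brief spike in one feature alone does not trigger an input.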
3.2. Experimental Results
Two experiments were conducted to quantitatively evaluate the performance of the EMG-based Cheonjiin keyboard system. In Experiment 1, participants were instructed to perform each of the five commands—up, down, left, right, and select—individually, allowing assessment of recognition accuracy for single inputs. In Experiment 2, participants were tasked with performing multiple commands consecutively to simulate actual character input, thereby enabling evaluation of the system’s stability and consistency under real-world usage conditions.
Experiment 1 was conducted with three subjects, each of whom repeatedly performed the five input actions: ‘up’, ‘down’, ‘left’, ‘right’, and ‘select’. According to the confusion matrix derived from the 300 collected data points, the proposed system demonstrated high classification accuracy across most classes. As shown in Figure 9a, the system achieved perfect classification for the ‘up’ class, while the ‘left’, ‘right’, and ‘down’ classes also demonstrated high accuracy with only a few misclassifications. In contrast, the ‘select’ class exhibited the lowest accuracy and was frequently confused with the ‘left’ and ‘up’ classes.
Based on these classification results, the overall average accuracy was 90.0%, precision was 90.7%, recall was 90.0%, and the F1-score was 89.8%. Notably, the ‘up’, ‘left’, and ‘right’ inputs demonstrated high recognition performance, with precision and recall values close to or above 90%. In contrast, the ‘select’ input showed the lowest recall at 71.7%, despite a relatively high precision of 93.5%, indicating that many ‘select’ actions were misclassified as other directional inputs. Analysis of the information transfer rate (ITR) showed that the average ITR for the three subjects was 99.5 bits/min, suggesting that the proposed system ensured sufficient speed for rapid recognition and processing of input actions.
The results of this experiment demonstrate that the system maintains high overall recognition performance despite differences in signal magnitude, duration, and variation patterns among subjects. Furthermore, it was confirmed that adequate classification performance can be achieved using only simple feature extraction algorithms based on SSC and RMS.
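The reported accuracy, precision, recall, and F1-score can all be derived from a confusion matrix. The sketch below assumes macro-averaging over classes, which this section does not state explicitly, and uses a small hypothetical three-class matrix rather than the data in Figure 9.

```python
def classification_metrics(confusion):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    precisions, recalls, f1_scores = [], [], []
    for k in range(n):
        true_positive = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        precision = true_positive / predicted_k if predicted_k else 0.0
        recall = true_positive / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


# Hypothetical 3-class confusion matrix for illustration only.
toy_confusion = [
    [10, 0, 0],
    [1, 8, 1],
    [0, 2, 8],
]
print(classification_metrics(toy_confusion))
```

A class with high precision but low recall, as reported for ‘select’, corresponds to a row with many off-diagonal entries and a column with few.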
Experiment 2 involved collecting a total of 1704 data points by having three subjects repeatedly perform each input action. The multi-class classification performance of the proposed system was evaluated through a confusion matrix, as illustrated in Figure 9b. The results showed that most instances of the ‘up’, ‘down’, ‘left’, and ‘right’ classes were correctly classified. Misclassifications for these classes were relatively few, with errors mainly distributed across adjacent directional inputs. The ‘select’ class presented relatively low accuracy compared to the other commands, with 504 of 641 instances correctly recognized, and some degree of confusion occurring with the ‘down’, ‘up’, and ‘left’ classes.
The overall average accuracy was 88.2%, indicating that the system maintained practical classification performance under online conditions. Class-specific quantitative metrics recorded an average precision of 87.3%, average recall of 90.6%, and an average F1-score of 88.3%. The ‘up’, ‘left’, and ‘right’ inputs showed consistently high recognition performance. Conversely, the ‘down’ input demonstrated reduced precision of 71.0%, and the ‘select’ input exhibited relatively low recall of 75.8% despite a high precision of 98.4%. These results suggest that some ‘down’ and ‘select’ actions were confused with directional inputs of similar activation patterns. Analysis of the information transfer rate revealed an average of 92.87 bits/min across the three subjects, indicating that despite its simple feature-based algorithm, the proposed system sustained stable and efficient communication in real-world environments.
These results demonstrate that the proposed system maintains stable classification performance overall, even under conditions where users perform multiple actions consecutively as if typing characters. Notably, even when subjects repeatedly performed the same action multiple times, we confirmed that simple feature extraction algorithms based on SSC and RMS alone could effectively distinguish between signals with similar characteristics across classes.
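Information transfer rate is commonly computed with the Wolpaw formula, which combines the number of possible commands, the classification accuracy, and the time per selection. The sketch below implements that standard formula as an assumption; this section does not state which ITR definition or selection time the authors used, so the output is not expected to reproduce the reported values.

```python
import math


def bits_per_selection(n_classes, accuracy):
    """Wolpaw ITR in bits per selection for n_classes equiprobable commands."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    bits = math.log2(n_classes)
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_classes - 1))
    return bits


def itr_bits_per_minute(n_classes, accuracy, seconds_per_selection):
    """Scale bits per selection by the number of selections made per minute."""
    return bits_per_selection(n_classes, accuracy) * 60.0 / seconds_per_selection


# Five commands (up, down, left, right, select); the selection time is a placeholder.
print(itr_bits_per_minute(n_classes=5, accuracy=0.90, seconds_per_selection=1.0))
```

Because the rate scales inversely with the time per selection, the same accuracy can yield very different bits/min figures depending on how that interval is defined.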
To comprehensively evaluate the input recognition performance and user applicability of the proposed system, the results of Experiment 1 and Experiment 2 were compared. In Experiment 1, conducted with a limited number of inputs and in a controlled environment, the system demonstrated high classification performance, achieving an overall average accuracy of 90.0%, precision of 90.7%, recall of 90.0%, F1-score of 89.8%, and information transfer rate of 99.5 bits/min. Notably, the ‘up’ input was perfectly classified, and the ‘left’ and ‘right’ inputs showed precision and recall close to or above 90%, confirming that the system operates with high reliability for basic directional inputs.
In Experiment 2, even as the number of inputs increased and experiments were conducted under more diverse conditions, the overall average accuracy remained at 88.65%, precision at 86.45%, recall at 91.15%, F1-score at 87.8%, and information transfer rate at 92.87 bits/min. This is a positive result demonstrating that the system can reliably maintain practical performance levels even as the usage environment expands.
Additionally, to assess the subjects’ subjective cognitive workload, the NASA-TLX questionnaire was administered after the experiment concluded. The results are presented in Figure 10. This questionnaire is structured to evaluate six items: Mental Demand, Physical Demand, Temporal Demand, Performance, Effort, and Frustration. Across participants, the overall workload scores ranged from 13.7 to 26.3, corresponding to a medium perceived workload level. As shown in Figure 10a, most individual item scores remained relatively low, generally between 10 and 30 points, with the exception of Physical Demand and Temporal Demand, which reached higher values for certain subjects. The pie chart in Figure 10b illustrates the relative weighting of each item, showing that Physical Demand accounted for the largest proportion at 26.7%, followed by Performance at 26.4% and Effort at 24.4%. Temporal Demand showed the lowest contribution, and both Mental Demand and Frustration were also minor components of the overall workload. These results confirm that while the proposed input system may impose some degree of physical burden, particularly in terms of muscular effort, it does not generate excessive cognitive load, time pressure, or emotional stress. Overall, participants perceived the workload as moderate and manageable, suggesting that the system can be adopted without substantial difficulty.
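The overall workload score and the relative item weights can be illustrated with the standard weighted NASA-TLX procedure, in which each item's 0 to 100 rating is weighted by how often it was chosen in 15 pairwise comparisons. The sketch below assumes that procedure was used, and the ratings and tallies are hypothetical values for a single participant, not data from this study.

```python
TLX_ITEMS = ("mental", "physical", "temporal", "performance", "effort", "frustration")


def weighted_tlx(ratings, tallies):
    """Return (overall workload score, per-item relative weights).

    ratings: item -> rating on a 0-100 scale.
    tallies: item -> number of times the item was chosen in the 15 pairwise comparisons.
    """
    total = sum(tallies[item] for item in TLX_ITEMS)
    if total != 15:
        raise ValueError("pairwise-comparison tallies must sum to 15")
    overall = sum(ratings[item] * tallies[item] for item in TLX_ITEMS) / total
    weights = {item: tallies[item] / total for item in TLX_ITEMS}
    return overall, weights


# Hypothetical single-participant data for illustration only.
ratings = {"mental": 10, "physical": 40, "temporal": 15,
           "performance": 20, "effort": 25, "frustration": 10}
tallies = {"mental": 1, "physical": 4, "temporal": 0,
           "performance": 4, "effort": 4, "frustration": 2}

overall, weights = weighted_tlx(ratings, tallies)
print(round(overall, 1), {item: round(w, 3) for item, w in weights.items()})
```

An item chosen in 4 of the 15 comparisons receives a weight of 4/15, or about 26.7%, which is the kind of proportion a pie chart of item weights would display.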
4. Discussion
The proposed EMG-based Cheonjiin speller demonstrated stable and accurate performance across both experimental conditions. The system achieved overall accuracy near 89% and maintained comparable precision and recall values between simple and continuous input tasks, confirming its robustness during real-time operation. The average information transfer rate (ITR) of 96.19 bits/min indicates that the system enables interactive communication while operating with only two EMG channels. This level of performance is comparable to previously reported EMG-based interfaces (40–90 bits/min) [24] and approaches that of mid-level SSVEP-based BCI spellers (80–100 bits/min) [22,23].
These findings suggest that the proposed system can achieve competitive throughput while maintaining low computational cost and real-time responsiveness, as evidenced by the mean command interval of approximately 2.2 s. Moreover, the results reflect consistent user–machine coordination and minimal interaction conflict, underscoring the system’s practical applicability even under low-resource hardware conditions [1].
The results show that stable classification of up/down, left/right, and selection inputs is achievable using only simple time-domain features based on RMS and SSC, without complex deep learning models. They also indicate that the selected time-domain features effectively capture the physiological characteristics of muscle activation relevant to directional control. In particular, RMS exhibited clear increases corresponding to contraction intensity, indicating its role as a direct measure of muscle activation strength. SSC, in contrast, revealed more dynamic fluctuations, reflecting transient variations in muscle fiber recruitment and firing behavior during directional transitions. These complementary patterns were especially evident between the rectus femoris and gastrocnemius muscles, whose activation dynamics differed across up, down, left, and right commands. Together, RMS and SSC enabled consistent differentiation of movement directions while maintaining low computational complexity. This confirms that the chosen features not only simplify the real-time processing pipeline but also retain sufficient discriminative power for reliable classification in practical EMG-based communication systems.
Compared to recent EMG-based speller systems that utilize multi-channel setups or complex frequency-domain features [21,36], the proposed system demonstrates that comparable accuracy can be achieved with only two channels and simple time-domain descriptors. This finding underscores the potential of the system as a minimal yet effective approach for real-time text input, particularly in low-resource or embedded environments.
While previous studies typically employed deep learning-based models for EMG signal analysis and classification [37,38,39], such approaches often require large datasets, high computational resources, and extended training times, which limit their feasibility for real-time or embedded applications [40]. In contrast, the present study effectively distinguished similar EMG signals using simple time-domain features such as RMS and SSC, demonstrating that reliable classification can be achieved with minimal computational cost. This lightweight feature-based approach aligns with findings from prior research on efficient EMG-based control paradigms [41,42], confirming its suitability for portable assistive systems.
Furthermore, this system provides a practical alternative input method for users with limited hand mobility. By adopting the Cheonjiin layout, which has fewer keys and shorter key travel distances compared to the QWERTY layout, Korean input is possible using only directional control operations. Minimizing input actions is expected to reduce user burden. Electrode placement and input actions were set based on anatomical structures, maintaining high consistency and accuracy despite differences in signal strength and variation patterns between subjects.
Despite the overall high accuracy observed in both experiments, some misclassifications were identified—particularly in the “select” command, which was occasionally recognized as “up” or “left.” This phenomenon is likely attributed to the challenge of isolating specific lower-limb muscles, as some participants unintentionally activated adjacent muscles during contraction. Even when participants intentionally attempted to activate the muscles corresponding to “select,” the signal from one muscle (e.g., the rectus femoris) could be detected earlier or more strongly than its counterpart (e.g., the gastrocnemius), resulting in an unbalanced feature pattern. Additionally, occasional confusion in directional intent was observed when participants rapidly switched between directional and selection commands, indicating a possible timing mismatch between muscle activations and the system’s recognition window.
To mitigate such errors, future implementations could increase the duration of each voluntary contraction or introduce adaptive timing calibration to better align the recognition window with the user’s activation dynamics. Moreover, as users become more accustomed to the directional control scheme through repeated practice, such misclassifications are expected to decrease due to improved motor consistency and proprioceptive control. Therefore, incorporating adaptive calibration and user-specific training sessions may further enhance system reliability.
The present research should be further expanded to broaden its scope and practical applicability. First, the study involved a limited number of healthy participants, which may constrain the generalizability of the results to individuals with motor impairments. Expanding the participant group to include individuals with motor impairments would enhance the generalizability of the findings and provide a more comprehensive evaluation of the system’s clinical applicability. However, recruiting patients with movement disorders within a limited timeframe poses practical challenges, including the need for additional institutional review board (IRB) approval and ethical considerations related to patient safety and consent.
Future studies should investigate how different muscle contraction patterns vary across demographic and physiological factors. For instance, differences between young and middle-aged adults, male and female participants (influenced by muscle mass and fat distribution), and individuals with varying body mass index (BMI) or activity levels could significantly affect EMG amplitude and stability. Comparative analysis among healthy participants, those with mild motor impairments, and those with severe disabilities would provide deeper insights into system adaptability. In addition, systematic evaluation of contraction parameters—such as contraction intensity (e.g., weak 10–20% vs. strong 50–70%), contraction duration (short tap vs. sustained hold), fatigue accumulation across repeated tasks, and sensitivity differences between the rectus femoris and gastrocnemius—would help optimize classifier robustness and improve usability across diverse user groups.
Second, the experimental setup was conducted in a controlled environment, whereas real-world conditions may introduce additional noise or signal variability. Third, while the current study employed static feature parameters, adaptive or context-aware algorithms could further improve robustness under changing physiological states or electrode conditions. Addressing these limitations in future studies will be critical for achieving practical deployment in daily assistive contexts.
Another important consideration for future development is the effect of muscle fatigue on EMG signal stability and classification performance. Since the proposed system relies on voluntary contractions of the rectus femoris and gastrocnemius muscles, prolonged or repeated activation may lead to muscle fatigue, potentially reducing signal amplitude and altering spectral characteristics. This issue is particularly relevant for lower-limb muscles, which may fatigue faster under repetitive activation compared to smaller upper-limb groups. Prior research has demonstrated that fatigue induces a shift in the median frequency of EMG signals, which is closely related to changes in the zero-crossing rate and slope sign change (SSC) features [43]. Because SSC was one of the key features used in our study, such variability could influence the reliability of long-term classification. Therefore, future studies should investigate adaptive feature selection or fatigue-compensation mechanisms to ensure stable performance during extended use sessions.
From a software architecture perspective, the current implementation also presents opportunities for further development. First, the entire signal acquisition, feature extraction (RMS, SSC, and peak amplitude), and classification pipeline was implemented in Python, which, while highly productive and convenient for rapid prototyping, carries inherent overhead. For example, the interpreted nature of Python, the Global Interpreter Lock (GIL), and non-deterministic garbage collection can introduce latency or jitter, especially under strict real-time constraints [44]. Second, while our latency (~2.17 s) is acceptable for the targeted interface scenario, it remains relatively coarse compared to hard real-time control systems, where latencies of a few milliseconds or hundreds of microseconds are required [45].
In future work, we plan to explore a hybrid architecture in which latency-critical components (e.g., real-time signal preprocessing and event triggering) are implemented in a compiled language such as C or C++ and invoked from Python only for higher-level tasks. This would reduce end-to-end latency and improve determinism under wearable or embedded deployment.
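Before moving components to a compiled language, it can help to measure how much time and jitter the Python processing step actually contributes. The sketch below times a per-window processing function with `time.perf_counter`; the `process` callable and the synthetic windows are placeholders, and the measurement covers only computation time, not the end-to-end command interval of roughly 2.2 s reported above.

```python
import statistics
import time


def measure_latency(process, windows):
    """Time `process` on each window and summarize mean latency, jitter, and worst case."""
    durations = []
    for window in windows:
        start = time.perf_counter()
        process(window)
        durations.append(time.perf_counter() - start)
    return {
        "mean_s": statistics.mean(durations),
        "jitter_s": statistics.pstdev(durations),  # spread of per-window latency
        "worst_s": max(durations),
    }


def placeholder_process(window):
    """Stand-in for feature extraction: a single pass over the samples."""
    return sum(x * x for x in window) / len(window)


# Synthetic windows of 200 samples each; real EMG windows would replace these.
synthetic_windows = [[0.1] * 200 for _ in range(100)]
print(measure_latency(placeholder_process, synthetic_windows))
```

If the worst-case value is far above the mean, interpreter pauses such as garbage collection are a likely contributor, which is the kind of non-determinism a compiled preprocessing stage would aim to reduce.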
While the system performs well with simple time-domain features, future research could explore advanced feature extraction or hybrid modeling techniques to further improve classification performance. Approaches such as wavelet transform-based time–frequency analysis or combined statistical–spectral features have shown promise in capturing both transient and steady-state components of EMG signals [29,46]. Integrating such hybrid representations with lightweight classifiers could enhance accuracy while maintaining real-time feasibility. Keeping the classifier lightweight would also limit model size and power consumption, which is important for future applications in portable assistive devices or low-specification environments.
Future technical refinements hold potential to further enhance the system’s practicality and accessibility. For instance, integrating adaptive filtering algorithms capable of real-time noise estimation [47], or implementing calibration-free signal normalization schemes [48], could improve robustness across sessions. Hardware improvements, such as wireless sensor integration and flexible electrode materials [49], may also increase portability and user comfort. Furthermore, by utilizing diverse sensory channels like vision, hearing, and touch to enable users to intuitively perceive the system’s responses, it is anticipated that the system could be expanded into a comprehensive Augmentative and Alternative Communication (AAC) system accessible to users with visual or auditory impairments.
Ultimately, our study demonstrates the potential for electromyography (EMG)-based interaction technology to substantially enhance information accessibility and freedom of expression for users with physical limitations. It can serve as foundational data for future user-centered interface design.