Mouse Data Defence Technology Using Machine Learning in Image-Based User Authentication: Based on the WM_INPUT Message

Jung, Wontae; Kim, Jinwook; Lee, Kyungroul

doi:10.3390/electronics15010016

by Wontae Jung ¹, Jinwook Kim ² and Kyungroul Lee ³,*

¹ Consulting Business Division Pentest Team, A3 Security Co., Ltd., Seoul 07281, Republic of Korea

² Interdisciplinary Program of Information & Protection, Mokpo National University, Muan 58554, Republic of Korea

³ School of Computer Science and Engineering, Information Security Major, Mokpo National University, Muan 58554, Republic of Korea

* Author to whom correspondence should be addressed.

Electronics 2026, 15(1), 16; https://doi.org/10.3390/electronics15010016 (registering DOI)

Submission received: 12 October 2025 / Revised: 11 December 2025 / Accepted: 12 December 2025 / Published: 19 December 2025

(This article belongs to the Special Issue Emerging Technologies for Network Security and Anomaly Detection)

Abstract

In personal computers, data is input through devices such as keyboards and mice, and various services are received from the internet. To provide these online services, secure user authentication methods are essential. Knowledge-based authentication methods, such as PINs or passwords, have been widely implemented in most services due to their ease of implementation. However, security threats such as brute-force attacks, phishing attacks, and keyboard data attacks that intercept sensitive user information have emerged. To counter these security threats, image-based authentication methods using mouse input were introduced. However, vulnerabilities arose when functions like GetCursorPos() or WM_INPUT messages were used, allowing mouse input data to be intercepted, thereby undermining image-based authentication. To defend against these attacks, counter-defence methods were developed to generate fake mouse data, protecting actual mouse data. With the advent of these defence methods, there has been a demand for attack methods to classify fake and real mouse data. Recently, machine learning-based methods have been employed on the attacker’s side to classify real mouse data, effectively distinguishing fake from real mouse data and compromising the security of image-based authentication methods. Therefore, this paper proposes a defence technology to safely protect mouse data from theft attacks using machine learning, specifically leveraging Generative Adversarial Networks (GANs). To achieve the goal of this defence technology, the distribution of fake mouse data generated using GANs was analyzed, verifying the feasibility of mouse defence methods. In summary, a system incorporating the defence technology was constructed, and a dataset containing both fake and real mouse data was created. Based on the constructed environment, the performance of the mouse data defence technology was evaluated. 
The results showed that the proposed technology reduced attack performance by up to 37% on the dataset for which existing machine learning-based attack methods performed best. This study concludes that the proposed mouse data defence technology effectively addresses vulnerabilities and security threats related to user authentication information in various services relying on image-based authentication methods.

Keywords:

image-based authentication technology; mouse data; personal data protection; machine learning

1. Introduction

With the advent of the digital age, internet use has increased, and online services have become essential in daily life for everyone from young children to the elderly [1]. Providing these online services requires user authentication. As a result, knowledge-based authentication methods, such as PINs or passwords; possession-based authentication methods, such as security cards or smart cards; and biometric authentication methods, such as fingerprint or retina scans, have emerged [2]. Among these, knowledge-based methods like PINs or passwords have been widely adopted in most online services due to their ease of use. However, they are exposed to security threats such as brute-force attacks, phishing attacks, and keyboard data attacks that intercept sensitive user information [3]. To counter these security threats, image-based authentication methods using mouse input were introduced.

Despite these efforts, vulnerabilities emerged when functions like GetCursorPos() or WM_INPUT messages were used, enabling the interception of mouse input data and thus compromising the security of image-based authentication methods [4,5,6]. In response to these attacks, countermeasures were developed to generate fake mouse data to protect actual mouse data. These methods use functions like SetCursorPos() or WM_INPUT messages to generate fake mouse data, thereby protecting the real mouse data [7,8,9]. In other words, the defence program generates arbitrary fake mouse data and uses functions like SetCursorPos() or WM_INPUT messages to send this data to the attack program. As a result, even if the attack program intercepts mouse data using GetCursorPos() or WM_INPUT messages, it cannot distinguish between fake and real mouse data. This effectively protects the real mouse data, improving the security of image-based authentication methods.

As these defence methods were developed, there was a demand for attack techniques capable of classifying fake and real mouse data. Recently, machine learning-based attack methods have emerged that can classify real mouse data, effectively distinguishing between fake and real mouse data, which compromises the security of image-based authentication methods [10]. Therefore, this paper proposes a defence technology to protect mouse data from theft attacks using machine learning, specifically leveraging Generative Adversarial Networks (GANs). By generating fake mouse data that closely resembles real mouse data using GANs, the proposed defence technology prevents attackers from classifying fake and real mouse data, even when using machine learning-based methods, and thus protects the mouse data.

The contributions of this study are as follows:

This paper analyzes existing mouse data attack methods using machine learning, based on prior research and datasets used in these attacks, and proposes a GAN-based technology to protect mouse data. The proposed technology reduces the success rate of mouse data attacks, thereby ensuring the safer protection of user authentication information.
The paper uses CTGAN (Conditional Tabular GAN), a type of GAN, to generate 2D data and analyzes the protection method for mouse data. This approach is novel and unique in the context of mouse data protection.
In prior research, the maximum success rate of machine learning-based mouse data attacks was 99%. The defence technology proposed in this paper reduced the attack success rate by up to 37%, leaving the attacker with a success rate of about 63%. Because authentication with mouse data depends on a sequence of consecutive coordinates, an attacker who cannot capture consecutive coordinates correctly cannot authenticate, which enhances security.
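The last contribution's point about consecutive coordinates can be illustrated with a rough calculation. The sketch below assumes, purely for illustration, that each coordinate is classified independently and that the attacker needs a run of `points` coordinates all classified correctly; neither the independence assumption nor the run length of 5 comes from the paper.

```python
def all_correct_probability(per_point_rate: float, points: int) -> float:
    """Probability that `points` coordinates are all classified correctly.

    Assumes independent classification of each coordinate, which is a
    simplification made for this illustration only.
    """
    if not 0.0 <= per_point_rate <= 1.0:
        raise ValueError("per_point_rate must be between 0 and 1")
    if points < 0:
        raise ValueError("points must be non-negative")
    return per_point_rate ** points


# Hypothetical run length of 5 consecutive coordinates.
for rate in (0.99, 0.63):
    print(rate, round(all_correct_probability(rate, 5), 3))
```

Under these assumptions, a moderate drop in per-coordinate accuracy compounds quickly over a sequence, which is the intuition behind the claim above.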

The structure of this paper is as follows. Section 2 introduces related studies and research motivations, discussing existing mouse data attack and defence methods. Section 3 describes the proposed mouse data defence technology using GANs and outlines the experimental configuration. Section 4 analyzes the experimental results. Section 5 presents a discussion of the results, and Section 6 concludes the paper.

2. Related Research and Research Motivation

The mouse input device interacts with the user by moving the cursor on the screen according to the movement of the mouse device. To address security threats in password-based user authentication methods, where keyboard data leakage undermines security, image-based authentication methods using mouse data have emerged. These image-based methods must protect both the images displayed on the screen and the mouse data used for clicks on those images. However, similar to password-based methods, image-based authentication methods have also faced security threats due to the leakage of image and mouse data. As a result, various attack and defence methods have been researched in this context. The goal of this paper is to propose a defence technology to protect mouse data, which is fundamentally required to be secured in image-based authentication methods.

Recently, several attack and defence methods related to mouse data have been studied. A representative attack intercepts mouse data using the WM_INPUT message: the attack program registers to receive the WM_INPUT message provided by the Windows operating system, so that its registered handler is called whenever mouse data is entered, allowing the attacker to intercept the data. The handler collects relative coordinates; once the attacker has obtained the initial absolute coordinates on the screen, the subsequent relative coordinates allow the attacker to track the user’s mouse cursor trajectory [11]. In response to this attack, defence methods have been developed in which the defence program generates fake mouse data and delivers it to the attack program, together with the real mouse data, via the WM_INPUT message. Experiments showed that this prevents the attack program from separating the real mouse data from the mixture of fake and real data it collects.
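The trajectory tracking described above amounts to accumulating relative movements onto a known starting position. The following is a minimal sketch of that idea only, not the attack code from [11]: the function name and sample values are hypothetical, and it ignores details such as screen-edge clamping and any pointer acceleration applied by the operating system.

```python
from typing import Iterable, List, Tuple

Point = Tuple[int, int]


def reconstruct_trajectory(start: Point, deltas: Iterable[Point]) -> List[Point]:
    """Accumulate relative movements onto a known absolute starting position."""
    x, y = start
    path = [(x, y)]
    for dx, dy in deltas:
        x += dx
        y += dy
        path.append((x, y))
    return path


# Hypothetical values: cursor first observed at (100, 200), then three relative moves.
trajectory = reconstruct_trajectory((100, 200), [(5, -2), (3, 0), (-1, 4)])
print(trajectory)
```

When fake relative movements are mixed into the stream, this accumulation drifts away from the true cursor position, which is why the attacker must first separate real from fake data.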

In [12], an authentication system based on mouse movement data was proposed to assess the effectiveness of user authentication systems. By analyzing and extracting features from various mouse actions, such as movement, clicks, and drag-and-drop, the system distinguishes between normal and abnormal behavior, thereby improving security.

In [13], another study focused on preventing insider threats by authenticating users based on their mouse usage patterns. To extract these patterns, mouse data was trained using deep learning models like Convolutional Neural Networks (CNN), and the models were evaluated using performance metrics such as accuracy, FAR (False Acceptance Rate), and FRR (False Rejection Rate). The results demonstrated that users were authenticated with a very low error rate, which is expected to reduce serious security threats, such as data leakage from insiders. In [14], data was collected from 20 participants based on mouse click streams, and 87 features were extracted to train a CNN model. The results showed a continuous user authentication accuracy of 98.8%, which is expected to detect anomalous behavior.

In prior research on mouse data attack methods [15], a method was proposed to classify fake and real mouse data using machine learning-based models in a scenario where defence methods utilizing the WM_INPUT message were applied. To evaluate the proposed method, an attack system was built to intercept mouse data, and a dataset was created by collecting the mouse data from this system. The method was tested through various configurations based on different datasets, features, and generation cycles to enhance classification performance. In this study, the classification models for mouse data attacks included KNN, Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, and MLP. The results demonstrated that machine learning-based models could classify real mouse data with over 99% accuracy, meaning that even with defence methods in place, the attacker could still successfully intercept user authentication data. Detailed performance evaluation revealed that the highest-performing datasets were those with a 500 ms generation cycle (datasets 4-1 to 4-7), while the lowest-performing datasets were those with a 50 ms generation cycle (datasets 1-1 to 1-7). Based on these results, this paper proposes a defence technology focused on datasets with both the highest and lowest performance in the previous research.

As described above, recent studies have employed machine learning-based methods to extract features from mouse data and apply them to enhance security, such as in image-based authentication. Therefore, these authentication methods that rely on mouse data require effective protection to ensure that users’ mouse cursor trajectories are not exposed to attackers. The need to secure mouse data against such security threats serves as the motivation for this paper.

3. Proposed Technology and Dataset

Prior research proposed, and verified through experiments, an attack method that uses machine learning models to classify fake and real mouse data effectively and thereby steal user authentication information. To prevent this theft, this paper proposes a technology that securely protects mouse data against attacks based on machine learning classification. Specifically, it uses GANs to generate fake mouse data that resembles real mouse data.

3.1. Pre-Validation of the Proposed Technology

To pre-validate the defence capability of the proposed technology for protecting mouse data, we compare the distribution of the fake mouse data generated by the GAN-based method with the distribution of real mouse data. The GAN-based method uses the CTGAN model [16,17] to generate fake mouse data similar to real mouse data, and the distribution of the X coordinates for both fake and real mouse datasets is shown in Figure 1.

For the pre-validation, the data ratio used was 1:1, with real mouse data represented in blue and fake mouse data in red. The results show that the fake mouse data closely resembles the distribution of real mouse data. Therefore, by utilizing the fake mouse data generated by the GAN-based method, it is expected that the attacker’s machine learning-based classification method may not be capable of distinguishing between real and fake mouse data.
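The comparison above is visual. One way such a comparison could be quantified is the two-sample Kolmogorov-Smirnov statistic, the largest gap between the empirical cumulative distributions of the two samples. This is an illustrative addition rather than a measure used in the paper, and the function name is an assumption.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest gap between the empirical CDFs of two samples.

    Returns 0.0 for identical samples and 1.0 for samples that do not overlap.
    """
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    sa, sb = sorted(a), sorted(b)
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking those is enough.
    for v in sa + sb:
        cdf_a = bisect_right(sa, v) / len(sa)
        cdf_b = bisect_right(sb, v) / len(sb)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap
```

Applied to the X coordinates of the real and generated datasets, a value near 0 would be consistent with the close resemblance reported above, while a value near 1 would indicate easily separable distributions.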

This paper proposes a defence technology to protect mouse data from the perspective of the defender. Based on this proposed technology, fake mouse data similar to real mouse data is generated using a GAN-based method. Finally, the defence performance of the mouse data is evaluated by applying the attacker’s machine learning-based classification method from previous research to datasets containing both real and fake mouse data.

3.2. Structure of the Proposed Defence Technology

The proposed defence technology generates fake mouse coordinates, G1, G2, …, Gn, similar to real mouse coordinates, A1, A2, …, An, based on the collection of real mouse data input from the mouse device. An attacker then collects both real and fake mouse coordinates, A1, G1, G2, A2, …, An, Gn, and attempts to classify the datasets using a machine learning-based method to steal user authentication information. However, because the generated fake coordinates closely resemble real mouse data, they interfere with the classification process, reducing the attack’s effectiveness. The goal of the defence technology is to hinder the classification of the two datasets, ultimately decreasing the attack’s performance. This defence technology is represented in Figure 2.
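The mixed stream that the attacker observes can be sketched as follows. This is a simplified illustration under stated assumptions: the function name, the random interleaving, and the label convention (1 = real, 0 = fake) are hypothetical. In the proposed technology the fake coordinates come from the GAN-based generator; here they are simply passed in as a list.

```python
import random
from typing import List, Tuple

Point = Tuple[int, int]
Labelled = Tuple[Point, int]  # (coordinate, label): 1 = real, 0 = fake


def mix_streams(real: List[Point], fake: List[Point], seed: int = 0) -> List[Labelled]:
    """Merge real and fake coordinates into one stream, keeping each source's own order."""
    rng = random.Random(seed)
    # Decide which source each slot in the merged stream draws from.
    slots = [1] * len(real) + [0] * len(fake)
    rng.shuffle(slots)
    real_iter, fake_iter = iter(real), iter(fake)
    return [(next(real_iter) if s else next(fake_iter), s) for s in slots]


# Hypothetical coordinates A1, A2 (real) and G1..G3 (fake).
mixed = mix_streams([(10, 10), (12, 11)], [(11, 10), (13, 12), (12, 12)])
print(mixed)
```

An attacker sees only the coordinates, not the labels; the labels are kept here because a labelled dataset is what a classifier is trained and evaluated on.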

3.3. Dataset Configuration of Experiments

In this section, we describe the dataset composition used to evaluate the performance of the proposed defence technology against mouse data attacks, based on the generation of fake mouse data. To generate the fake mouse data, we utilize datasets from a previous study on mouse data attack methods. Specifically, we select the dataset with the lowest attack performance, which has a generation period of 50 ms (datasets 1-1 to 1-7), and the dataset with the highest attack performance, which has a generation period of 500 ms (datasets 4-1 to 4-7). This selection ensures that the defence technology is evaluated under the same conditions as the attack method. Accordingly, we reconstruct the datasets by generating fake mouse data using a GAN-based method, based on real mouse data input by users. The reconstructed datasets include datasets 1-1 to 1-7 and datasets 4-1 to 4-7, with each dataset containing 20,000 data points. The overall dataset composition for the experiments is presented in Table 1.

In the constructed datasets, the real mouse coordinates refer to actual mouse data input via a mouse device, while the fake mouse coordinates represent mouse data artificially generated using GANs. The numbers of real mouse data points in datasets 1-1 to 1-7 are 16,003, 14,009, 12,005, 10,282, 8011, 6000, and 4004, respectively, while in datasets 4-1 to 4-7, they are 16,042, 14,080, 12,021, 10,009, 8319, 6068, and 4049, respectively.

Similarly, the numbers of fake mouse data points in datasets 1-1 to 1-7 are 3997, 5991, 7995, 9718, 11,989, 14,000, and 15,996, respectively, while in datasets 4-1 to 4-7, they are 3958, 5920, 7979, 9991, 11,681, 13,932, and 15,951, respectively. The ratios in the table represent the proportion of real to fake mouse data, progressing in steps from 8:2 in the first dataset of each series to 2:8 in the last.
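The counts listed above can be checked against the stated totals and ratios. The short script below uses only the figures given in the text; the helper name `nominal_ratio` and the rounding to the nearest tenth are assumptions made for this check.

```python
TOTAL = 20_000

# Real and fake counts as listed above for datasets 1-1..1-7 and 4-1..4-7.
REAL = {
    "1": [16003, 14009, 12005, 10282, 8011, 6000, 4004],
    "4": [16042, 14080, 12021, 10009, 8319, 6068, 4049],
}
FAKE = {
    "1": [3997, 5991, 7995, 9718, 11989, 14000, 15996],
    "4": [3958, 5920, 7979, 9991, 11681, 13932, 15951],
}


def nominal_ratio(real: int, fake: int) -> str:
    """Round the real share to the nearest tenth and express it as 'real:fake'."""
    real_tenths = round(10 * real / (real + fake))
    return f"{real_tenths}:{10 - real_tenths}"


for series in ("1", "4"):
    for i, (r, f) in enumerate(zip(REAL[series], FAKE[series]), start=1):
        assert r + f == TOTAL  # every dataset holds 20,000 points
        print(f"dataset {series}-{i}: {r} real, {f} fake -> {nominal_ratio(r, f)}")
```

Every dataset sums to 20,000 points, and the rounded ratios step from 8:2 down to 2:8 in both series, although the actual counts deviate slightly from the nominal split (for example, dataset 1-4 contains 10,282 real points rather than 10,000).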

To ensure that the experimental results are derived under the same conditions as the previous study on mouse data attacks, we adopt identical features and hyperparameters. The selected features include the collection time, X coordinate, and Y coordinate from dataset 1-1. Moreover, since machine learning models are sensitive to hyperparameters, which significantly affect performance [18], we use the same hyperparameters as in the attack method to maintain consistency in experimental conditions. The previous study determined the optimal hyperparameters for mouse data attacks based on dataset 1-1, and the optimal hyperparameters and performance evaluation results are presented in Table 2.

4. Experimental Results

To evaluate the performance of the proposed mouse data defence technology, we assessed the performance based on the constructed datasets (1-1 to 1-7 and 4-1 to 4-7) with varying feature compositions. As in the previous study on mouse data attacks, the first experiment presents the performance evaluation results for the dataset defined by the features of mouse event occurrence time, X-coordinate, and Y-coordinate. The second experiment presents the performance evaluation results for the dataset defined by the features of mouse event occurrence time, X-coordinate distance, and Y-coordinate distance. Finally, the third experiment presents the performance evaluation results for the dataset defined by the features of mouse event occurrence time, X-coordinate, Y-coordinate, X-coordinate distance, and Y-coordinate distance. By comparing and analyzing the comprehensive performance evaluation results based on various features and datasets, we identify the best feature set with the highest defence performance.
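The three feature compositions can be sketched as follows. The paper does not define "coordinate distance" in this section, so the sketch assumes it is the absolute difference between consecutive coordinates, with the first event assigned a distance of 0; the function name and tuple layout are likewise hypothetical.

```python
from typing import Dict, List, Tuple

Event = Tuple[float, int, int]  # (occurrence time, x, y)


def build_feature_sets(events: List[Event]) -> Dict[str, List[tuple]]:
    """Derive the three feature layouts from a time-ordered list of mouse events."""
    sets: Dict[str, List[tuple]] = {"first": [], "second": [], "third": []}
    if not events:
        return sets
    prev_x, prev_y = events[0][1], events[0][2]
    for t, x, y in events:
        # Assumed definition of "distance": absolute change since the previous event.
        dx, dy = abs(x - prev_x), abs(y - prev_y)
        sets["first"].append((t, x, y))           # time, X, Y
        sets["second"].append((t, dx, dy))        # time, X distance, Y distance
        sets["third"].append((t, x, y, dx, dy))   # all five features
        prev_x, prev_y = x, y
    return sets


# Hypothetical events sampled 50 ms apart.
features = build_feature_sets([(0.00, 10, 10), (0.05, 13, 6), (0.10, 13, 9)])
print(features["third"])
```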

The machine learning models used for performance evaluation were K-Nearest Neighbors (KNN) [19], Logistic Regression [20], Decision Tree [21], Random Forest [22], Gradient Boosting [23], and Multilayer Perceptron (MLP) [24], which were also used in the previous study on mouse data attacks.

4.1. First Experiment Results of the Mouse Data Defence Technology

The first experiment analyzes the performance in terms of cross-validation, accuracy, precision, recall, F1-score, and AUC for the datasets 1-1 to 1-7 and 4-1 to 4-7, with the features of mouse event occurrence time, X-coordinate, and Y-coordinate [25]. To compare the attack and defence performance of mouse data, the results for each dataset are shown in Figure 3 and Figure 4, respectively.
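For reference, four of the metrics listed above follow directly from the confusion matrix. The sketch below is illustrative only: the function name and the convention that label 1 denotes real mouse data are assumptions, and cross-validation and AUC are left out for brevity.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1-score for binary labels (1 = positive)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels: three real points, two fake points.
metrics = classification_metrics([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
print(metrics)
```

From the defender's perspective, lower values of these metrics for the attacker's classifier indicate stronger protection of the real mouse data.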

In the first experiment, the performance evaluation results of datasets 1-1 through 1-7 (with a generation interval of 50 ms) and datasets 4-1 through 4-7 (with a generation interval of 500 ms) were analyzed using cross-validation, accuracy, precision, recall, F1-score, and AUC. For the 50 ms generation interval, most datasets showed degraded performance compared to the original attack performance, indicating that the defence mechanism was effective. Among these, dataset 1-2 exhibited the highest defence performance, and the model with the best performance was MLP. In contrast, dataset 1-3 showed the lowest defence performance, with logistic regression being the model that performed the worst.

Similarly, for the 500 ms generation interval, most datasets also showed degraded performance compared to the original attack, confirming the effectiveness of the defence. In this setting, dataset 4-2 demonstrated the highest defence performance, with logistic regression achieving the best performance. Conversely, dataset 4-6 exhibited the lowest defence performance, and the worst-performing model was the gradient boosting model.

To summarize the performance evaluation results of the first experiment, all datasets generally exhibited lower performance than the original attack, confirming that the defence was effective. For the 50 ms generation interval, dataset 1-2 had the highest defence performance, and dataset 1-3 had the lowest. The best-performing model was the multilayer perceptron, while the worst-performing model was logistic regression. For the 500 ms generation interval, dataset 4-2 achieved the highest defence performance, and dataset 4-6 the lowest. The best-performing model was logistic regression, and the worst-performing was the gradient boosting model. All performance evaluation results from the first experiment are summarized in Table 3.

4.2. Second Experiment Results of the Mouse Data Defence Technology

In the second experiment, the performance of the full dataset—defined by features such as timestamp, distance between X coordinates, and distance between Y coordinates—was evaluated using cross-validation, accuracy, precision, recall, F1-score, and AUC. The results are presented in Figure 5 and Figure 6.

In the second experiment, the performance evaluation results—based on cross-validation, accuracy, precision, recall, F1-score, and AUC—for all datasets were analyzed. For datasets 1-1 through 1-7 with a generation interval of 50 ms, most performance metrics were degraded compared to the original attack results, indicating the potential effectiveness of the defence technology. In this setting, dataset 1-6 exhibited the highest defence performance, and the best-performing model was MLP. Conversely, dataset 1-3 showed the lowest defence performance, with gradient boosting being the worst-performing model.

Similarly, for datasets 4-1 through 4-7 with a generation interval of 500 ms, most performance metrics also showed degradation compared to the original attack performance. In this case, dataset 4-2 achieved the highest defence performance, and the best-performing model was logistic regression. On the other hand, dataset 4-4 showed the lowest defence performance, and gradient boosting was again the worst-performing model.

In summary, the results of the second experiment indicate that defence was effective in most cases, as performance generally decreased across all datasets compared to the original attack. For the 50 ms generation interval, dataset 1-6 showed the highest defence performance, and dataset 1-3 the lowest. The MLP model achieved the best performance, while gradient boosting showed the weakest. For the 500 ms generation interval, dataset 4-2 had the highest defence performance, and dataset 4-4 the lowest. The best-performing model was logistic regression, while the worst-performing model was again gradient boosting. The detailed performance evaluation results of the second experiment are summarized in Table 4.

4.3. Third Experiment Results of the Mouse Data Defence Technology

The third experiment evaluated the performance of the full dataset—defined by features such as timestamp, X coordinate, Y coordinate, distance between X coordinates, and distance between Y coordinates—using cross-validation, accuracy, precision, recall, F1-score, and AUC. The results are presented in Figure 7 and Figure 8.

In the third experiment, the performance of all datasets was evaluated using cross-validation, accuracy, precision, recall, F1-score, and AUC. For the datasets 1-1 through 1-7 with a generation interval of 50 ms, most evaluation results showed degraded performance compared to the original attack, indicating that the proposed defence was effective. In this setting, dataset 1-2 exhibited the highest defence performance, and the best-performing model was MLP. Conversely, dataset 1-7 showed the lowest defence performance, with gradient boosting being the worst-performing model.

For the datasets 4-1 through 4-7 with a generation interval of 500 ms, a similar trend was observed—most performance metrics decreased compared to the original attack. In this case, dataset 4-2 showed the highest defence performance, and the best-performing model was logistic regression. On the other hand, dataset 4-6 had the lowest defence performance, with gradient boosting once again being the worst-performing model.

In summary, the results of the third experiment confirmed that most datasets exhibited lower performance than the original attack, suggesting that the defence was generally effective. For the 50 ms generation interval, dataset 1-2 had the highest defence performance and dataset 1-7 the lowest. The best-performing model was MLP, while the worst-performing was gradient boosting. For the 500 ms generation interval, dataset 4-2 showed the highest defence performance and dataset 4-6 the lowest. The best-performing model was logistic regression, and the worst-performing model was again gradient boosting. The complete performance evaluation results of the third experiment are summarized in Table 5.

4.4. Overall Performance Evaluation of the Proposed Mouse Data Defence Technology According to Feature Sets and Datasets

To summarize the results of all experiments based on the characteristics of the proposed mouse data defence technology and the corresponding datasets, performance was analyzed across three experiments using datasets 1-1 through 1-7 and datasets 4-1 through 4-7. The first experiment used datasets defined by features such as timestamp, X coordinate, and Y coordinate. The second experiment used features including timestamp, distance between X coordinates, and distance between Y coordinates. The third experiment used a combined feature set consisting of timestamp, X coordinate, Y coordinate, and the distances between X and Y coordinates.

Analysis of the performance evaluation results across the three experiments showed that datasets 1-1 through 1-7 (with a generation interval of 50 ms) generally exhibited lower performance, while datasets 4-1 through 4-7 (with a generation interval of 500 ms) achieved higher performance. Among the three experiments, the third—utilizing the combined feature set—yielded the highest overall performance.

Compared with previous studies on mouse data attack methods using machine learning, the proposed defence technology in this study was shown to degrade the performance of most attacks effectively. The best-performing generation interval was 500 ms, and the most effective feature combination was the one used in the third experiment (timestamp, X coordinate, Y coordinate, distance between X coordinates, and distance between Y coordinates).

In terms of dataset-specific performance based on the third experiment (the most effective setting), the dataset with the highest performance under the 50 ms generation interval was dataset 1-7, while dataset 1-5 showed the lowest performance. For the 500 ms generation interval, dataset 4-7 had the highest performance, whereas dataset 4-2 had the lowest.

In the model-wise performance evaluation, the best-performing model was the gradient boosting model, while the logistic regression model showed the lowest performance. The overall performance evaluation results of the proposed mouse data defence technology, according to the feature sets and datasets, are summarized in Table 6.

This section compared and evaluated the performance of the proposed mouse data defence technology based on different feature sets and datasets. Summarizing the overall performance evaluation results, the dataset that achieved the highest performance was Dataset 4-7, among the datasets with a generation interval of 500 ms. The gradient boosting model yielded the best performance in this setting. In contrast, Dataset 4-2 showed the lowest performance among the 500 ms datasets, with logistic regression being the worst-performing model.

Among the datasets with a 50 ms generation interval, the highest performance was observed in Dataset 1-7, again when using the gradient boosting model, while the lowest performance was found in Dataset 1-5, with logistic regression showing the weakest results.

In conclusion, the proposed mouse data defence technology generally resulted in performance degradation of the attack models presented in previous studies. Therefore, it can be concluded that the proposed technology effectively protects mouse data from such attacks.

5. Discussion

5.1. Evaluation Results Based on Performance Changes

In this section, to demonstrate the effectiveness of the proposed defence technology, we compared and evaluated the performance of the mouse data defence technology against the mouse data attack method from previous research by analyzing their performance change rates. For this comparison, we selected two datasets from previous studies: the one with the lowest attack performance (generation interval of 50 ms) and the one with the highest attack performance (generation interval of 500 ms).

Among the datasets with a generation interval of 50 ms, Dataset 1-5 showed the lowest defence performance, while Dataset 1-7 showed the highest. For the 500 ms datasets, Dataset 4-2 had the lowest performance, whereas Dataset 4-7 had the highest.

To ensure a consistent comparison of performance change rates, we used the same datasets as those in the previous attack study. The performance changes for both the attack and defence methods were computed and are summarized in Table 7 and Table 8, respectively.
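The text does not state whether a performance change is expressed in percentage points or as a relative change, so the sketch below shows both conventions side by side. The function names and the sample values (an attack metric of 0.90 falling to 0.60 under the defence) are hypothetical and are not taken from the paper's tables.

```python
def change_in_points(attack: float, defended: float) -> float:
    """Change in percentage points between two metrics given as fractions."""
    return round((defended - attack) * 100, 2)


def relative_change(attack: float, defended: float) -> float:
    """Change relative to the original attack metric, in percent."""
    if attack == 0:
        raise ValueError("attack metric must be non-zero for a relative change")
    return round((defended - attack) / attack * 100, 2)


# Hypothetical metric values for illustration only.
attack_accuracy, defended_accuracy = 0.90, 0.60
print(change_in_points(attack_accuracy, defended_accuracy))
print(relative_change(attack_accuracy, defended_accuracy))
```

The two conventions give different figures for the same pair of measurements, so the reported reductions should be read with the convention of the corresponding table in mind.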

The comparison of performance change rates was conducted based on machine learning models, datasets, and evaluation metrics including accuracy, precision, recall, F1-score, and AUC. The purpose was to compare the performance of the proposed mouse data defence technology with the attack performance from previous research.

For the dataset with the lowest performance at a generation interval of 50 ms (Dataset 1-5), most models showed performance degradation compared to the original attack performance. Among them, the MLP model exhibited the largest drops across all metrics: accuracy decreased by 37%, precision by 46%, recall by 55%, F1-score by 51%, and AUC by 30%. Thus, the MLP was identified as the model with the most significant performance decline for Dataset 1-5.

For the best-performing dataset at 50 ms (Dataset 1-7), performance was still generally degraded. The MLP model again showed the largest decreases: accuracy decreased by 19%, precision by 40%, recall by 89%, F1-score by 81%, and AUC by 19%. Therefore, in the 50 ms interval, the MLP was consistently the most affected model in terms of performance degradation.

In the case of the dataset with the lowest performance at a generation interval of 500 ms (Dataset 4-2), the logistic regression model exhibited the greatest decreases, with accuracy dropping by 26%, precision by 27%, recall remaining unchanged, F1-score decreasing by 16%, and AUC dropping by 42%.

For the best-performing 500 ms dataset (Dataset 4-7), the logistic regression model again showed the largest performance drops: accuracy decreased by 20%, precision by 58%, recall by 97%, F1-score by 94%, and AUC by 31%.

By comparing the results of the machine learning-based mouse data attack from prior work and the proposed GAN-based defence technology, it was confirmed that the defence method effectively reduced model performance, thereby protecting mouse data. The summary of performance change rates for the models with the highest degradation is presented in Table 9.

Table 9 shows that the most affected model depends on the generation interval. At 50 ms, the MLP model showed the greatest degradation in both Dataset 1-5 and Dataset 1-7, with recall falling by 55% and 89% and F1-score by 51% and 81%, respectively. At 500 ms, the logistic regression model showed the greatest degradation in both Dataset 4-2 and Dataset 4-7: in Dataset 4-2 its AUC fell by 42% while recall remained unchanged, and in Dataset 4-7 its recall and F1-score fell by 97% and 94%, respectively.

In conclusion, the MLP model was most affected at a generation interval of 50 ms, whereas the logistic regression model was most affected at 500 ms, confirming the effectiveness of the proposed defence in reducing attack performance.

5.2. Overall Performance Change Rate Analysis by Model for the Proposed Mouse Data Defence Technology

To analyze performance based on machine learning models, the overall performance change rates for each model are visualized in Figure 9.

As shown in Figure 9, the mouse data defence technology proposed in this study significantly reduced the performance of the machine learning-based mouse data attack models from previous research. In particular, for Datasets 1-5 and 1-7 (50 ms generation interval), the MLP model exhibited the greatest performance degradation. For Datasets 4-2 and 4-7 (500 ms generation interval), the logistic regression model showed the most substantial drop in performance.

Table 10 reports the t-statistic, 95% confidence interval, and one-tailed p-value used to validate the performance differences across models and datasets. Table 11 presents a statistical divergence analysis between genuine and GAN-generated synthetic mouse data, using the Kullback-Leibler (KL) divergence in both directions and the Jensen-Shannon (JS) divergence.
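For readers unfamiliar with the divergence measures in Table 11, the following minimal sketch computes both KL directions and the JS divergence between two one-dimensional samples. It is not the authors' procedure: the histogram binning, the small smoothing constant, the use of natural logarithms, and the Gaussian stand-in data are all assumptions made for illustration.

```python
import math
import random


def histogram(samples, lo, hi, bins=50, eps=1e-9):
    """Normalised histogram over [lo, hi]; eps keeps every bin strictly positive."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for s in samples:
        idx = min(int((s - lo) / width), bins - 1)  # the maximum value falls in the last bin
        counts[idx] += 1
    probs = [c / len(samples) + eps for c in counts]
    norm = sum(probs)
    return [p / norm for p in probs]


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats; not symmetric."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def js(p, q):
    """Jensen-Shannon divergence in nats; symmetric and bounded by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Stand-in data only: two overlapping Gaussians, NOT the article's mouse data.
rng = random.Random(0)
genuine_x = [rng.gauss(960, 300) for _ in range(5000)]
synthetic_x = [rng.gauss(1000, 250) for _ in range(5000)]

lo = min(genuine_x + synthetic_x)
hi = max(genuine_x + synthetic_x)
p = histogram(genuine_x, lo, hi)
q = histogram(synthetic_x, lo, hi)

print(f"KL(genuine || GAN) = {kl(p, q):.4f}")
print(f"KL(GAN || genuine) = {kl(q, p):.4f}")
print(f"JS divergence      = {js(p, q):.4f}")
```

The KL divergence is asymmetric, which is why Table 11 lists both directions, whereas the JS divergence gives a single symmetric value that is zero only when the two histograms are identical.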

These results show that even if an attacker collects both real and fake mouse data and applies machine learning models to classify real mouse movements, classification accuracy can drop by up to 37% (MLP, Dataset 1-5), significantly impairing the attack’s effectiveness. Furthermore, because users move the mouse frequently during image-based authentication and each movement generates a large volume of mouse data, the injected fake data can induce misclassification throughout the session. This makes it difficult for attackers to extract enough valid mouse data to reconstruct the authentication input, ultimately preventing them from capturing the user’s password.

5.3. Discussion of the Use and Utilization of Synthetic Data

In this article, we employ GAN models to generate synthetic mouse data that closely resembles genuine trajectories. The synthetic data, when mixed with genuine data, serves as a defensive mechanism to confuse an adversary who attempts to steal mouse trajectories using machine learning classification models. The rationale is to deliberately inject synthetic data into genuine mouse streams so as to decrease the success rate of classification attacks targeting genuine movement patterns. On the defender side, a filtering module checks each coordinate that is about to be rendered against the coordinates the defender has injected, before the mouse cursor position is updated on the user’s screen or interface; if the two coincide, the cursor is not moved. Consequently, the synthetic data affects only the data collected by the attacker and has no effect on the user interface. Furthermore, by controlling the generation frequency, injection ratio, and trajectory diversity of the synthetic data, the defender can make the attacker’s model training substantially more complex. This strategy is applicable to continuous authentication and behavior-based authentication technologies, and it improves the security of mouse data and user authentication without modifying the user’s genuine mouse trajectories.
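The filtering step described above can be sketched as follows. This is a simplified illustration of the logic only, not the authors' implementation: the class name, method names, and event tuples are hypothetical, and a real deployment would sit in the Windows input path rather than in Python.

```python
from collections import Counter


class SyntheticFilter:
    """Hypothetical filter that suppresses cursor updates for defender-injected coordinates."""

    def __init__(self):
        # Multiset of injected (x, y) coordinates that have not yet been matched.
        self._pending = Counter()

    def register_injected(self, x, y):
        """Record a synthetic coordinate at the moment the defender injects it."""
        self._pending[(x, y)] += 1

    def should_move_cursor(self, x, y):
        """Return False if (x, y) matches a pending synthetic coordinate, True otherwise."""
        if self._pending[(x, y)] > 0:
            self._pending[(x, y)] -= 1  # consume exactly one matching injection
            return False
        return True


# Illustrative event stream: the attacker observes all four events,
# but only the genuine ones reach the cursor.
flt = SyntheticFilter()
stream = [("real", 100, 200), ("fake", 640, 480), ("real", 101, 203), ("fake", 12, 900)]
rendered = []
for kind, x, y in stream:
    if kind == "fake":
        flt.register_injected(x, y)
    if flt.should_move_cursor(x, y):
        rendered.append((x, y))

print(rendered)
```

One limitation of this simple coordinate match is that a genuine event which happens to coincide with a pending synthetic coordinate would also be suppressed; a production filter would need an additional way to tell the two apart.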

The synthetic data is filtered before the mouse cursor is rendered on the screen or interface during user input. From the user’s perspective, therefore, cursor accuracy, responsiveness, and usability are not affected. From the system’s perspective, the defender controls the number of synthetic data points and the generation period, and thereby the complexity of the classification task faced by the attacker, so the overhead of generating and filtering synthetic mouse data can be tuned to a level that does not noticeably impact the system. The injected synthetic mouse data prevents the attacker from reliably acquiring genuine mouse data, thereby degrading attack performance metrics such as success rate. Our experimental results verify that the classification performance for genuine mouse data, including accuracy, decreases in the presence of injected synthetic mouse data. Accordingly, GAN-based synthetic mouse data increases the difficulty, cost, and time required for the attacker, and in this way protects the genuine mouse data.

6. Conclusions

In this paper, we proposed a mouse data defence technology based on generative adversarial networks (GANs) to degrade the performance of prior machine learning-based mouse data attack techniques in image-based authentication systems. Previous research has shown that attackers can classify real and fake mouse data with over 99% accuracy using such techniques. In response, our proposed defence effectively reduces classification accuracy by up to 37% and consistently lowers the overall attack performance.

According to the performance analysis, the MLP model at a 50 ms generation interval and the logistic regression model at a 500 ms generation interval experienced the most significant degradation in attack performance. Most models showed notable performance drops when exposed to adversarial mouse data. Given the high volume of mouse data generated during user interaction and the presence of adversarial noise designed to induce misclassification, attackers face increased difficulty in obtaining meaningful mouse data for authentication purposes.

The results of this study suggest that the proposed technology can enhance the security of mouse data in various fields that rely on image-based authentication, such as finance, enterprise security, and cloud services. By mitigating known vulnerabilities and threats associated with mouse-based inputs, this approach contributes to strengthening the robustness of user authentication systems. In future work, we plan to develop techniques for generating even more realistic adversarial mouse data to further improve the effectiveness of the proposed defence technology.

Author Contributions

Conceptualization, W.J. and K.L.; methodology, W.J. and K.L.; software, W.J.; validation, W.J., J.K. and K.L.; data curation, W.J. and J.K.; writing—original draft preparation, W.J. and K.L.; writing—review and editing, J.K. and K.L.; supervision, K.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Research Fund of Mokpo National University in 2023.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

Author Wontae Jung was employed by the company A3 Security Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

NIA. Survey on the Internet Usage. Available online: https://www.nia.or.kr/site/nia_kor/ex/bbs/View.do?cbIdx=99870&bcIdx=26523&parentSeq=26523 (accessed on 12 October 2024).
Baig, A.F.; Eskeland, S. Security, Privacy, and Usability in Continuous Authentication: A Survey. Sensors 2021, 21, 5967.
Kavya, C.; Suganya, R. Survey on keystroke logging attacks. Int. J. Creat. Res. Thoughts (IJCRT) 2021, 9, 503–508. Available online: https://www.ijcrt.org/papers/IJCRT2104074.pdf (accessed on 12 October 2024).
MSDN. WM_INPUT Message. Available online: https://learn.microsoft.com/en-us/windows/win32/inputdev/wm-input (accessed on 12 October 2024).
MSDN. GetCursorPos Function (winuser.h). Available online: https://learn.microsoft.com/ko-kr/windows/win32/api/winuser/nf-winuser-getcursorpos (accessed on 12 October 2024).
Quang, D.; Martini, B.; Raymond, C.K. The role of the adversary model in applied security research. Comput. Secur. 2019, 81, 156–181.
Oh, I.; Lee, K.; Yim, K. A Protection Technique for Screen Image-based Authentication Protocols Utilizing the SetCursorPos function. In Proceedings of the 18th International Workshop on Information Security Applications (WISA), Jeju, Republic of Korea, 24–26 August 2017; pp. 236–245.
MSDN. SetCursorPos Function (winuser.h). Available online: https://learn.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-setcursorpos (accessed on 12 October 2024).
Chong, P.; Elovici, Y.; Binder, A. User Authentication Based on Mouse Dynamics Using Deep Neural Networks: A Comprehensive Study. IEEE Trans. Inf. Forensics Secur. 2019, 15, 1086–1101.
Lee, K.; Lee, S. Improved Practical Vulnerability Analysis of Mouse Data According to Offensive Security based on Machine Learning in Image-Based User Authentication. Entropy 2020, 22, 355.
Pan, X.; Ling, Z.; Pingley, A.; Yu, W.; Zhang, N.; Fu, X. How privacy leaks from bluetooth mouse? In Proceedings of the 2012 ACM Conference on Computer and Communications Security, Raleigh, NC, USA, 16–18 October 2012; pp. 1013–1015.
Antal, M.; Egyed-Zsigmond, E. Intrusion Detection Using Mouse Dynamics. IET Biometrics 2019, 8, 285–294.
Hu, T.; Niu, W.; Zhang, X.; Liu, X.; Lu, J.; Liu, Y. An Insider Threat Detection Approach Based on Mouse Dynamics and Deep Learning. Secur. Commun. Netw. 2019, 2019, 3898951.
Almalki, S.; Assery, N.; Roy, K. An Empirical Evaluation of Online Continuous Authentication and Anomaly Detection Using Mouse Clickstream Data Analysis. Appl. Sci. 2021, 11, 6083.
Jung, W.; Hong, S.; Lee, K. Mouse Data Attack Technique Using Machine Learning in Image-Based User Authentication: Based on a Defense Technique Using the WM_INPUT Message. Electronics 2024, 13, 710.
Alabdulwahab, S.; Kim, Y.; Seo, A.; Son, Y. Generating Synthetic Dataset for ML-Based IDS Using CTGAN and Feature Selection to Protect Smart IoT Environments. Appl. Sci. 2023, 13, 10951.
Espinosa, E.; Figueira, A. On the Quality of Synthetic Generated Tabular Data. Mathematics 2023, 11, 3278.
Elgeldawi, E.; Sayed, A.; Galal, A.; Zaki, A.M. Hyperparameter Tuning for Machine Learning Algorithms Used for Arabic Sentiment Analysis. Informatics 2021, 8, 79.
Jin, C.; Luo, Y.; Wu, C.; Song, Y.; Li, D. Exploring the Pedestrian Route Choice Behaviors by Machine Learning Models. Int. J. Geo-Inf. (ISPRS) 2024, 13, 146.
Senapaty, M.; Ray, A.; Padhy, N. A Decision Support System for Crop Recommendation Using Machine Learning Classification Algorithms. Agriculture 2024, 14, 1256.
Strelcenia, E.; Prakoonwit, S. Effective Feature Engineering and Classification of Breast Cancer Diagnosis: A Comparative Study. BioMedInformatics 2023, 3, 616–631.
Vergni, L.; Todisco, F. A Random Forest Machine Learning Approach for the Identification and Quantification of Erosive Events. Water 2023, 15, 2225.
Elvanidi, A.; Katsoulas, N. Performance of Gradient Boosting Learning Algorithm for Crop Stress Identification in Greenhouse Cultivation. Biol. Life Sci. Forum 2022, 16, 25.
Vargas, J.; Oviedo, A.; Ortega, N.; Orozco, E.; Gómez, A.; Londoño, J.M. Machine-Learning-Based Predictive Models for Compressive Strength, Flexural Strength, and Slump of Concrete. Appl. Sci. 2024, 14, 4426.
Kiliç, K.; Sallan, J.M. Study of Delay Prediction in the US Airport Network. Aerospace 2023, 10, 342.

Figure 1. Distribution of Real Mouse Data and Fake Mouse Data Generated by GANs (X Coordinate).

Figure 2. Overall architecture of the proposed technology using GANs. The red arrows indicate real mouse data, and the blue arrows indicate fake mouse data.

Figure 3. Performance Evaluation Results of the First Experiment for the Proposed Mouse Data Defence Technology (Dataset 1).

Figure 4. Performance Evaluation Results of the First Experiment for the Proposed Mouse Data Defence Technology (Dataset 4).

Figure 5. Performance Evaluation Results of the Second Experiment for the Proposed Mouse Data Defence Technology (Dataset 1).

Figure 6. Performance Evaluation Results of the Second Experiment for the Proposed Mouse Data Defence Technology (Dataset 4).

Figure 7. Performance Evaluation Results of the Third Experiment for the Proposed Mouse Data Defence Technology (Dataset 1).

Figure 8. Performance Evaluation Results of the Third Experiment for the Proposed Mouse Data Defence Technology (Dataset 4).

Figure 9. Overall Performance Change Rates by Machine Learning Model for the Mouse Data Defence Technology.

Table 1. Composition of Reconstructed Datasets Using GANs.

Dataset	Generation Period	Total Data	Real Mouse Data	Fake Mouse Data	Ratio (Real:Fake)
1-1	50 ms	20,000	16,003	3997	8:2
1-2			14,009	5991	7:3
1-3			12,005	7995	6:4
1-4			10,282	9718	5:5
1-5			8011	11,989	4:6
1-6			6000	14,000	3:7
1-7			4004	15,996	2:8
4-1	500 ms	20,000	16,042	3958	8:2
4-2			14,080	5920	7:3
4-3			12,021	7979	6:4
4-4			10,009	9991	5:5
4-5			8319	11,681	4:6
4-6			6068	13,932	3:7
4-7			4049	15,951	2:8

Table 2. Optimal hyperparameters derived for each machine learning model in the previous study (Dataset 1-1).

Model	Hyperparameters	Training Score	Validation Score	Test Score
KNN (K-Nearest Neighbors)	n_neighbors = 1	1	0.99	0.99
Logistic Regression	C = 1000, penalty = L2	0.84	0.83	0.84
Decision Tree	max_depth = 13	0.99	0.99	0.99
Random Forest	n_estimators = 10	1	0.99	0.99
Gradient Boosting	max_depth = 15, learning_rate = 0.1	1	0.99	0.99
MLP (Multilayer Perceptron)	max_iter = 100, alpha = 1 × 10⁻⁵	0.98	0.98	0.98

Table 3. Summary of the Performance Evaluation Results of the First Experiment for the Proposed Mouse Data Defence Technology.

Performance	Generation Interval	Dataset	Model
Highest Performance	50 ms	1-2	MLP
Highest Performance	500 ms	4-2	Logistic Regression
Lowest Performance	50 ms	1-3	Logistic Regression
Lowest Performance	500 ms	4-6	Gradient Boosting

Table 4. Summary of the Performance Evaluation Results of the Second Experiment for the Proposed Mouse Data Defence Technology.

Performance	Generation Interval	Dataset	Model
Highest Performance	50 ms	1-6	MLP
Highest Performance	500 ms	4-2	Logistic Regression
Lowest Performance	50 ms	1-3	Gradient Boosting
Lowest Performance	500 ms	4-4	Gradient Boosting

Table 5. Summary of the Performance Evaluation Results of the Third Experiment for the Proposed Mouse Data Defence Technology.

Performance	Generation Interval	Dataset	Model
Highest Performance	50 ms	1-6	MLP
Highest Performance	500 ms	4-2	Logistic Regression
Lowest Performance	50 ms	1-7	Gradient Boosting
Lowest Performance	500 ms	4-6	Gradient Boosting

Table 6. Summary of Key Features and Overall Performance Evaluation Across Datasets for the Proposed Mouse Data Defence Technique.

Performance	Generation Interval	Experiment	Features	Dataset	Model
Highest Performance	50 ms	Third Experiment	Timestamp, X coordinate, Y coordinate, X distance, Y distance	1-7	Gradient Boosting
Highest Performance	500 ms	Third Experiment	Timestamp, X coordinate, Y coordinate, X distance, Y distance	4-7	Gradient Boosting
Lowest Performance	50 ms	Third Experiment	Timestamp, X coordinate, Y coordinate, X distance, Y distance	1-5	Logistic Regression
Lowest Performance	500 ms	Third Experiment	Timestamp, X coordinate, Y coordinate, X distance, Y distance	4-2	Logistic Regression

Table 7. Comparison of Performance Change Rates Between the Previous Mouse Data Attack Method and the Proposed Defence Technology (Dataset 1, 50 ms).

Dataset/ Interval	Model	Data	AC	±	P	±	R	±	F	±	AU	±
1-5/ 50 ms	KNN	D	0.998	−6%	0.996	−12%	0.999	−2%	0.998	−7%	0.998	−6%
	KNN	GAN	0.936	−6%	0.872	−12%	0.982	−2%	0.924	−7%	0.943	−6%
	Logistic Regression	D	0.692	−13%	0.641	-	0.51	-	0.568	-	0.758	−30%
	Logistic Regression	GAN	0.601	−13%	NaN	-	0	-	NaN	-	0.532	−30%
	Decision Tree	D	0.998	−9%	0.996	−12%	0.999	−9%	0.998	−11%	0.998	−9%
	Decision Tree	GAN	0.912	−9%	0.874	−12%	0.911	−9%	0.892	−11%	0.912	−9%
	Random Forest	D	0.998	−6%	0.996	−8%	0.999	−7%	0.998	−8%	0.999	−2%
	Random Forest	GAN	0.938	−6%	0.917	−8%	0.93	−7%	0.923	−8%	0.979	−2%
	Gradient Boosting	D	0.999	−4%	0.996	−7%	1	−4%	0.998	−5%	0.999	−1%
	Gradient Boosting	GAN	0.957	−4%	0.93	−7%	0.964	−4%	0.947	−5%	0.989	−1%
	MLP	D	0.998	−37%	0.996	−46%	0.999	−55%	0.998	−51%	0.998	−30%
	MLP	GAN	0.629	−37%	0.542	−46%	0.452	−55%	0.493	−51%	0.697	−30%
1-7/ 50 ms	KNN	D	0.999	−3%	0.998	−13%	0.999	−2%	0.998	−8%	0.999	−3%
	KNN	GAN	0.966	−3%	0.864	−13%	0.98	−2%	0.918	−8%	0.971	−3%
	Logistic Regression	D	0.806	-	NaN	-	0	-	NaN	-	0.597	−2%
	Logistic Regression	GAN	0.805	-	NaN	-	0	-	NaN	-	0.586	−2%
	Decision Tree	D	1	−4%	1	−13%	1	−10%	1	−11%	1	−6%
	Decision Tree	GAN	0.956	−4%	0.873	−13%	0.905	−10%	0.888	−11%	0.936	−6%
	Random Forest	D	0.999	−3%	1	−8%	0.996	−7%	0.998	−7%	0.999	−1%
	Random Forest	GAN	0.971	−3%	0.919	−8%	0.931	−7%	0.925	−7%	0.992	−1%
	Gradient Boosting	D	1	−2%	0.999	−6%	1	−3%	0.999	−5%	1	-
	Gradient Boosting	GAN	0.98	−2%	0.935	−6%	0.966	−3%	0.95	−5%	0.995	-
	MLP	D	0.997	−19%	0.989	−40%	0.997	−89%	0.993	−81%	0.999	−19%
	MLP	GAN	0.812	−19%	0.592	−40%	0.112	−89%	0.188	−81%	0.808	−19%

D: Original Dataset, GAN: GAN-generated Dataset, AC: Accuracy, P: Precision, R: Recall, F: F1-score, AU: AUC, ±: Change Rate.
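The undefined (NaN) precision and F1-score entries for logistic regression in Table 7 coincide with a recall of 0. Under the standard metric definitions, this pattern arises when a classifier makes no positive predictions at all, so that precision becomes 0/0. The sketch below illustrates this with hypothetical confusion counts that are not taken from the article; the function name is likewise illustrative.

```python
import math


def precision_recall_f1(tp: int, fp: int, fn: int):
    """Precision, recall and F1 from confusion counts; NaN where a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else math.nan
    recall = tp / (tp + fn) if (tp + fn) > 0 else math.nan
    if math.isnan(precision) or math.isnan(recall) or (precision + recall) == 0:
        f1 = math.nan
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: the classifier never predicts the positive class,
# so there are no true positives and no false positives.
p, r, f = precision_recall_f1(tp=0, fp=0, fn=1600)
print(p, r, f)
```

In such a case accuracy can still look moderate, because every negative sample is classified correctly, which is why accuracy should be read together with recall and F1-score.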

Table 8. Comparison of Performance Change Rates Between the Previous Mouse Data Attack Method and the Proposed Defence Technology (Dataset 4, 500 ms).

Dataset/ Interval	Model	Data	AC	±	P	±	R	±	F	±	AU	±
4-2/ 500 ms	KNN	D	0.998	−8%	0.999	−9%	0.998	−1%	0.999	−6%	0.998	−12%
	KNN	GAN	0.919	−8%	0.905	−9%	0.987	−1%	0.944	−6%	0.874	−12%
	Logistic Regression	D	0.997	−26%	0.997	−27%	0.998	-	0.998	−16%	0.996	−42%
	Logistic Regression	GAN	0.734	−26%	0.724	−27%	0.999	-	0.84	−16%	0.581	−42%
	Decision Tree	D	0.999	−11%	0.999	−9%	0.999	−6%	0.999	−7%	0.997	−13%
	Decision Tree	GAN	0.89	−11%	0.908	−9%	0.941	−6%	0.925	−7%	0.867	−13%
	Random Forest	D	0.998	−9%	0.998	−10%	0.999	−2%	0.999	−6%	0.999	−5%
	Random Forest	GAN	0.906	−9%	0.898	−10%	0.975	−2%	0.935	−6%	0.948	−5%
	Gradient Boosting	D	0.998	−7%	0.999	−8%	0.999	−2%	0.999	−5%	0.999	−3%
	Gradient Boosting	GAN	0.926	−7%	0.919	−8%	0.98	−2%	0.949	−5%	0.97	−3%
	MLP	D	0.998	−22%	0.997	−23%	0.999	−2%	0.998	−14%	0.998	−24%
	MLP	GAN	0.783	−22%	0.77	−23%	0.98	−2%	0.863	−14%	0.758	−24%
4-7/ 500 ms	KNN	D	0.999	−3%	0.997	−13%	0.998	−3%	0.998	−8%	0.998	−3%
	KNN	GAN	0.965	−3%	0.87	−13%	0.969	−3%	0.917	−8%	0.966	−3%
	Logistic Regression	D	0.997	−20%	0.987	−58%	0.997	−97%	0.992	−94%	0.998	−31%
	Logistic Regression	GAN	0.799	−20%	0.41	−58%	0.034	−97%	0.063	−94%	0.693	−31%
	Decision Tree	D	0.999	−4%	0.999	−12%	0.998	−8%	0.999	−10%	0.998	−5%
	Decision Tree	GAN	0.958	−4%	0.879	−12%	0.916	−8%	0.897	−10%	0.946	−5%
	Random Forest	D	0.999	−3%	0.998	−7%	0.999	−6%	0.999	−7%	0.999	−1%
	Random Forest	GAN	0.973	−3%	0.926	−7%	0.941	−6%	0.933	−7%	0.993	−1%
	Gradient Boosting	D	0.999	−2%	0.998	−5%	0.999	−4%	0.999	−5%	0.999	-
	Gradient Boosting	GAN	0.981	−2%	0.946	−5%	0.958	−4%	0.952	−5%	0.996	-
	MLP	D	0.997	−14%	0.987	−26%	0.997	−52%	0.992	−42%	0.998	−9%
	MLP	GAN	0.862	−14%	0.734	−26%	0.478	−52%	0.579	−42%	0.904	−9%

D: Original Dataset, GAN: GAN-generated Dataset, AC: Accuracy, P: Precision, R: Recall, F: F1-score, AU: AUC, ±: Change Rate.

Table 9. Summary of Overall Performance Change Rates of the Proposed Mouse Data Defence Technology.

Dataset	Accuracy	Precision	Recall	F1-Score	AUC	Model
1-5	−37%	−46%	−55%	−51%	−30%	MLP
1-7	−19%	−40%	−89%	−81%	−19%	MLP
4-2	−26%	−27%	−	−16%	−42%	Logistic Regression
4-7	−20%	−58%	−97%	−94%	−31%	Logistic Regression

Table 10. Summary of Performance Comparison to Validate Performance Differences Across Models and Datasets.

50 ms
Feature	t-Statistic	Confidence Interval (95%)	p-Value (One-Tailed)
F1	3.978	0.049~0.226	5.278 × 10⁻³
F2	4.541	0.077~0.277	3.081 × 10⁻³
F3	2.907	0.011~0.184	1.675 × 10⁻²
500 ms
Feature	t-Statistic	Confidence Interval (95%)	p-Value (One-Tailed)
F1	3.825	0.043~0.217	6.154 × 10⁻³
F2	10.350	0.142~0.236	7.247 × 10⁻⁵
F3	2.984	0.015~0.203	1.532 × 10⁻²
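The article does not state which test produced Table 10. The reported t-statistics and confidence intervals are, however, numerically consistent with five degrees of freedom, which would match a paired comparison over the six models; this is an inference rather than something the source confirms. Under that assumption, a paired t-statistic and 95% confidence interval can be sketched as below. The input values are the Dataset 1-5 accuracies from Table 7, chosen only for illustration, so the output is not expected to reproduce any row of Table 10.

```python
import math
import statistics


def paired_t(before, after, t_crit):
    """Paired t-statistic and two-sided confidence interval for mean(before - after)."""
    diffs = [b - a for b, a in zip(before, after)]
    mean = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))  # standard error of the mean difference
    return mean / se, (mean - t_crit * se, mean + t_crit * se)


# Accuracy per model for Dataset 1-5, copied from Table 7, in the order
# KNN, Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, MLP.
acc_original = [0.998, 0.692, 0.998, 0.998, 0.999, 0.998]
acc_defended = [0.936, 0.601, 0.912, 0.938, 0.957, 0.629]

T_CRIT_DF5 = 2.571  # two-sided 95% critical value of Student's t with 5 degrees of freedom
t_stat, (ci_low, ci_high) = paired_t(acc_original, acc_defended, T_CRIT_DF5)
print(f"t = {t_stat:.3f}, 95% CI = {ci_low:.3f} ~ {ci_high:.3f}")
```

A one-tailed p-value would additionally require the cumulative distribution function of Student's t, which is not part of the Python standard library and is therefore omitted here.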

Table 11. Statistical Divergence Analysis Between Genuine and GAN-Generated Synthetic Mouse Data (KL and JS Divergence).

50 ms
Feature	KL(Genuine || GAN)	KL(GAN || Genuine)	JS Divergence
TIME	1.008621	12.555358	0.242797
X	0.175021	1.415825	0.048420
Y	0.586104	2.276023	0.132121
DIFFPOSX	0.011696	0.045298	0.003040
DIFFPOSY	0.028258	0.068322	0.006867
500 ms
Feature	KL(Genuine || GAN)	KL(GAN || Genuine)	JS Divergence
TIME	0.804133	9.186281	0.214075
X	0.289566	1.075720	0.073465
Y	0.283905	1.347653	0.068019
DIFFPOSX	0.007561	0.026126	0.001935
DIFFPOSY	0.030696	0.069390	0.007558

