1. Introduction
The insider threat landscape is increasingly costly and complex. The 2024 Ponemon Institute Report [1] notes a 26% rise in incidents since 2022, with the average cost exceeding $15 million per event. Unlike external attacks, insider threats originate within an organization, making malicious activities inherently difficult to distinguish from legitimate behavior. Traditional rule-based security systems are often too rigid and generate excessive false positives in dynamic enterprise environments.
Current approaches to insider threat detection face three fundamental challenges: (1) the sequential nature of user behavior requires modeling temporal dependencies across long event sequences; (2) the personalized nature of normal behavior necessitates user-specific modeling rather than one-size-fits-all solutions; and (3) the sensitive nature of monitoring data demands privacy-preserving techniques to prevent misuse.
This research bridges these gaps by proposing a unified, generative, adaptive, and privacy-preserving modeling framework. The core of our approach is a unified pipeline that applies consistent principles of personalization, concept drift adaptation, and privacy-preserving latent design. Generative AI methods have recently shown strong potential in cybersecurity applications, with surveys highlighting their effectiveness in modeling complex threat patterns and generating realistic behavioral distributions for detection tasks [2]. However, many existing learning-based approaches lack a unified mechanism to handle new users (cold-start), adapt to evolving behaviors (concept drift), and protect user privacy simultaneously. We utilize VAEs and Transformer-based autoencoders to learn a probabilistic model of each user’s typical behavior sequence. Potential threats are flagged as low-probability events under this model.
AI-driven security systems are increasingly necessary in enterprise and IoT environments, where large-scale behavioral logs require automated and intelligent analysis [3]. This further motivates the need for adaptive and privacy-preserving generative approaches such as ours.
This work contributes a unified generative and adaptive framework that integrates personalization, concept drift adaptation, and latent-space privacy preservation in a single system, a combination not addressed in prior studies. Unlike existing methods that focus purely on accuracy or sequence modeling, our approach simultaneously ensures (i) user-specific behavioral baselines, (ii) continuous adaptation to evolving activity patterns, and (iii) resilience against formal privacy attacks such as membership inference and reconstruction. This integration bridges a key gap in insider threat research, where detection performance and privacy protections are often treated separately.
Figure 1 presents a high-level overview of the proposed framework, showing the entire pipeline from raw data to anomaly alerts, including the two model paths and the adaptive feedback loop.
2. Related Work
2.1. Evolution of Insider Threat Detection
The landscape of insider threat detection has evolved through several generations of approaches. Early systems relied on expert-defined rules and static thresholds (e.g., number of file downloads after hours) [4]. While interpretable, these methods are brittle, generate excessive false positives, and fail to adapt to novel attack patterns [5].
This led to the adoption of machine learning techniques. Supervised learning methods, such as Support Vector Machines (SVMs) and Random Forests, face the critical challenge of requiring large, labeled datasets of insider attacks, which are notoriously scarce and imbalanced [6]. Consequently, unsupervised anomaly detection algorithms like Isolation Forest (IF) [7] and One-Class SVM (OC-SVM) [8] became prominent. However, these models typically treat data points as independent and identically distributed (i.i.d.), failing to capture the temporal dependencies inherent in user behavior.
Deep learning models advanced the field by handling sequential data. Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks model short-term temporal patterns in user events [9], while deterministic Autoencoders (AEs) learn to reconstruct normal data, identifying anomalies via high reconstruction error [10]. A key limitation of standard AEs is their tendency to overfit and their lack of a probabilistic understanding of normality. Recent work by Saminathan et al. [11] demonstrated effective insider threat detection using ANN autoencoders, highlighting the continued evolution of autoencoder-based approaches.
This leads naturally to generative models, which learn the underlying data distribution. Variational Autoencoders (VAEs) [12] introduce a stochastic latent space, providing a principled anomaly measure via the evidence lower bound (ELBO). More recently, Transformer architectures [13], with their self-attention mechanisms, have demonstrated superior capability in capturing long-range dependencies in sequences, making them a powerful candidate for modeling complex user behavior sequences [14].
This observation is consistent with broader surveys on deep learning–based anomaly detection, which highlight the effectiveness of representation learning and sequence modeling techniques across security and network monitoring domains [15].
Recent reviews show that deep learning architectures, particularly autoencoders and recurrent models, significantly outperform traditional anomaly detection methods in capturing complex behavioral deviations [14]. This aligns with the shift toward generative models capable of learning richer representations of user behavior.
A recent comprehensive review highlights that insider threats remain particularly challenging due to behavioral variability, scarcity of labeled malicious samples, and the subtle nature of deviations from normal activity [16]. These findings emphasize the need for personalized and adaptive behavioral modeling approaches, which our framework aims to address.
Table 1 provides a high-level conceptual comparison to contextualize the generative and adaptive properties of our framework, prior to presenting empirical evidence in Section 4.
2.2. Privacy-Preserving Machine Learning
The pursuit of effective insider threat detection must balance analytical utility with stringent data privacy. A 2025 survey by Padariya et al. [17] provides a systematic taxonomy of privacy threats and defense mechanisms for generative AI, highlighting that models like GANs and VAEs are vulnerable to attacks even when only synthetic data is published.
Privacy threats include Membership Inference Attacks (MIA), where an adversary determines if a specific data sample was in the training set, as well as attribute inference and model inversion attacks [17]. Defense mechanisms include Differential Privacy (DP), which offers rigorous mathematical guarantees by adding calibrated noise during training [18]. There is also a growing focus on creating synthetic data that balances utility with strong privacy, moving beyond traditional anonymization techniques like k-anonymity, which struggle with high-dimensional data [19].
This evolving landscape underscores the necessity of frameworks like ours, which integrate generative models with adaptive learning while incorporating representation-level privacy and DP principles to mitigate these identified risks. Complementary to DP, other Privacy-Enhancing Technologies (PETs) like Federated Learning (FL) are being developed to operationalize privacy by design in AI systems. Our work addresses these identified risks by integrating a privacy-preserving latent-space design and evaluating it directly against MIA and reconstruction attacks, providing empirical privacy assurances alongside detection performance [5].
2.3. Adaptive and Personalized Learning
The field is increasingly focusing on systems that adapt over time. Techniques such as privacy-preserving data reprogramming [20] demonstrate how data representations can be transformed to improve AI readiness while effectively preserving privacy through constraint-based optimization. This aligns with the broader trend of decoding-time personalization, a concept analogous to our user-specific fine-tuning and concept drift adaptation, which is essential for the long-term reliability of behavioral anomaly detection.
In summary, while existing methods advance detection capabilities, they often lack a unified approach to personalization, continuous adaptation, and privacy preservation. Most existing works focus either on detection accuracy or privacy protection. Only a limited number integrate both dimensions with adaptive learning for evolving user behavior. This gap motivates our integrated framework, which simultaneously addresses these challenges through generative sequence modeling with user-specific fine-tuning, drift adaptation, and latent-space privacy mechanisms.
3. Methodology
3.1. Data Representation & Preprocessing
Let S_u = (e_1, e_2, …, e_T) denote the sequence of events for user u. Each event e_t is a feature vector containing:
Categorical Features: Event type (e.g., log-on, file access, http), user role, resource ID. These are encoded using a trainable embedding layer (dim = 128).
Temporal Features: Hour of day, day of week. Encoded as cyclic sine/cosine pairs to preserve temporal continuity.
Volumetric Features: Session file count, bytes transferred. Normalized to zero mean and unit variance.
A sequence length of 100 events was chosen based on empirical validation and prior work on the CERT dataset, which shows that windows of 50–150 events capture sufficient temporal context for insider threat modeling.
3.2. Model Architectures
3.2.1. Variational Autoencoder (VAE)
Encoder: A two-layer Bidirectional LSTM (hidden dim = 128) processes the input sequence. The final hidden state h is projected to the parameters of the latent distribution q(z|x) = N(μ, diag(σ²)), where μ = W_μ h + b_μ and log σ² = W_σ h + b_σ.
Decoder : A two-layer LSTM (hidden dim = 128) reconstructs the sequence from the sampled latent vector z.
Training: The model is trained by minimizing the negative Evidence Lower Bound (ELBO):

L(x) = −E_{q(z|x)}[log p(x|z)] + β · KL(q(z|x) ‖ p(z)),

where p(z) = N(0, I) is a standard Gaussian prior and β is a weighting term set to 0.1.
Anomaly Score: the negative ELBO L(x). A higher score indicates a greater deviation from the learned normal distribution.
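To make the scoring concrete, the following is a minimal numerical sketch of the negative-ELBO anomaly score, assuming a unit-variance Gaussian reconstruction likelihood (so the reconstruction term reduces to squared error up to a constant) and the closed-form KL divergence of a diagonal Gaussian against the standard normal prior. The function name is illustrative, not part of a released implementation:

```python
import math

def neg_elbo(x, mu_z, logvar_z, recon, beta=0.1):
    """Negative ELBO for one sequence, used directly as the anomaly score.
    x, recon: flat feature lists; mu_z, logvar_z: latent Gaussian parameters.
    Reconstruction term: per-feature squared error (unit-variance Gaussian
    log-likelihood up to a constant). KL term: closed form against N(0, I)."""
    recon_err = sum((a - b) ** 2 for a, b in zip(x, recon))
    kl = -0.5 * sum(1 + lv - m ** 2 - math.exp(lv)
                    for m, lv in zip(mu_z, logvar_z))
    return recon_err + beta * kl
```

With a perfect reconstruction and a posterior equal to the prior, the score is zero; any reconstruction error or posterior deviation raises it.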
3.2.2. Transformer Autoencoder
Architecture: A standard Transformer encoder (4 layers, 8 attention heads, model dimension = 256, FFN dimension = 512) is used for both encoding and decoding. The final output of the encoder is used as the sequence representation.
Attention Mechanism: The model uses multi-head self-attention to capture global dependencies, with each head computing Attention(Q, K, V) = softmax(QKᵀ / √d_k) V.
Anomaly Score: Mean Squared Error (MSE) between the input and reconstructed sequence.
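For reference, the single-head scaled dot-product attention that underlies the multi-head mechanism can be sketched in plain Python (all names are ours, chosen for illustration):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(v - m) for v in xs]
    s = sum(es)
    return [e / s for e in es]

def scaled_dot_attention(Q, K, V):
    """Single-head scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V.
    Q, K, V are lists of d_k-dimensional vectors, one per sequence position."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, V))
                    for j in range(len(V[0]))])
    return out
```

When all keys are identical, the attention weights are uniform and each output row is the mean of the value vectors, which is a quick sanity check on the implementation.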
3.3. Integration of VAE and Transformer Architectures
Our framework is designed to support both VAE and Transformer Autoencoder architectures within a unified pipeline, allowing for comparative analysis and deployment flexibility based on specific use-case requirements. While the two models are trained separately, they share the same preprocessing, personalization, and adaptation mechanisms described in
Section 3.4.
The integration occurs at three levels:
Architectural Integration: Both models follow the same autoencoder paradigm—encoding sequences into latent representations and reconstructing them—but with different inductive biases (probabilistic vs. deterministic attention).
Pipeline Integration: The adaptive personalization procedure (Algorithm 1) applies identically to both architectures, enabling user-specific fine-tuning and concept drift adaptation regardless of the underlying model.
Decision Integration: In deployment, either model can be selected based on operational constraints: the VAE for efficiency and robustness to noise, or the Transformer AE for superior long-range dependency modeling.
This integrated approach allows organizations to choose the most appropriate architecture while maintaining consistent personalization, adaptation, and privacy-preserving properties across the framework.
3.4. Adaptive Detection Algorithm
The adaptive detection process is formalized in Algorithm 1, which outlines the steps for user-specific personalization, cold-start handling, and weekly concept drift adaptation.
| Algorithm 1 Adaptive Generative Insider Threat Detection |

Hyperparameters: personalization fine-tuning epochs E_p, fine-tuning learning rate η_p
Input: event logs L; pre-trained global model M_G (trained on a curated set of 500 users from the training data with no injected malicious activities)

for each user u do
    Extract event sequences S_u from L
    if u is a new user then
        M_u ← M_G    // cold-start: use global model
    else
        Load existing user model M_u
    end if
    Fine-tune M_u on recent data from S_u    // personalization: E_p epochs at learning rate η_p, using a sliding window of the most recent 10,000 events
    Compute anomaly score s(S_u) under M_u
    if s(S_u) > τ_u then    // τ_u is the 95th percentile score from the user's own training data; for new users, the global model's 95th percentile threshold τ_G is used initially
        Flag u for review
    end if
    if it is the end of the week then
        Update M_u by fine-tuning on the past week's data    // weekly concept drift adaptation
    end if
end for
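The control flow of Algorithm 1 can be sketched as follows. Here `fine_tune` and `score` stand in for the model-specific training and anomaly-scoring routines, and all names are illustrative rather than part of a released implementation:

```python
def detect(user_id, events, models, global_model, thresholds,
           fine_tune, score, window=10_000):
    """One pass of the adaptive detection loop (sketch of Algorithm 1).
    models/thresholds hold per-user state; new users fall back to the
    global model (cold start) and the global threshold."""
    model = models.get(user_id, global_model)    # cold-start fallback
    model = fine_tune(model, events[-window:])   # personalization window
    models[user_id] = model
    s = score(model, events)
    tau = thresholds.get(user_id, thresholds["global"])
    return s > tau                               # True => flag for review
```

Weekly drift adaptation would be an additional periodic call to `fine_tune` on the past week's data, omitted here for brevity.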
4. Experimental Setup & Results
4.1. Experimental Setup
Dataset and Preprocessing: We use the CERT Insider Threat Dataset v5.2 [20], a comprehensive, publicly available benchmark containing synthetic log data from 1000 users over 17 months. The dataset includes various insider threat scenarios, such as intellectual property theft and fraud.
Feature Engineering: We extracted sequences of 100 consecutive events for each user. Each event is represented by a feature vector containing:
Categorical: event_type (log-on, log-off, connect, disconnect, http, email, file), user_role, PC_ID. These were embedded into 128-dimensional vectors.
Temporal: hour and day_of_week, encoded as cyclic sine/cosine pairs.
Volumetric: session_file_count, bytes_downloaded, normalized using Standard Scaler.
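Sequence extraction from each user's chronologically ordered event stream can be sketched as a simple windowing step. We assume non-overlapping windows here, which the paper does not specify; the function name is illustrative:

```python
def extract_sequences(events, seq_len=100):
    """Cut an ordered event stream into non-overlapping windows of
    seq_len events; a trailing remainder shorter than seq_len is dropped."""
    return [events[i:i + seq_len]
            for i in range(0, len(events) - seq_len + 1, seq_len)]
```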
Data Splitting: We adopt a time-based split, using the first 12 months for training (normal behavior) and the remaining 5 months for testing, which includes the injected malicious activities.
Temporal ordering was strictly preserved in all splits to prevent data leakage. The validation set for hyperparameter tuning comprised the last 30 days of the training period. All experiments were repeated 5 times with different random seeds, and results are reported as mean ± standard deviation.
Baselines and Implementation: We compare our proposed VAE and Transformer Autoencoder (TF-AE) against a comprehensive set of baselines: a static rule-based detector, Isolation Forest (IF), One-Class SVM (OC-SVM), and an LSTM Autoencoder (LSTM-AE).
Hyperparameters and Implementation Details: All deep learning models were implemented in PyTorch and used the Adam optimizer. Key hyperparameters, determined via a grid search on a validation set, are summarized in Table 2. The anomaly threshold τ_u for each user-specific model was set to the 95th percentile of that user’s training anomaly scores. This heuristic is widely adopted in unsupervised anomaly detection to control the false positive rate, assuming the training set is predominantly normal. A sensitivity analysis showed the F1-score remained stable (±0.03) when varying this percentile from 90 to 99, indicating low sensitivity to this parameter choice.
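The per-user percentile threshold can be computed with a simple nearest-rank rule; a sketch with our naming, assuming the scores come from predominantly normal training data:

```python
import math

def percentile_threshold(scores, q=0.95):
    """Anomaly threshold as the q-th percentile of a user's training
    anomaly scores, using the nearest-rank method."""
    s = sorted(scores)
    idx = min(len(s) - 1, max(0, math.ceil(q * len(s)) - 1))
    return s[idx]
```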
The hyperparameters in Table 2 were chosen using a grid search over a held-out validation set (the final month of training data). The grid covered the VAE weighting term β, the sequence length, the learning rate, and the latent dimension. Hyperparameters were selected to maximize AUPRC on the validation set while mitigating overfitting, monitored through reconstruction loss on normal validation sequences.
4.2. Hardware and Software Environment
All experiments were conducted on a workstation with:
GPU: NVIDIA RTX 3090 (24 GB VRAM)
CPU: Intel Core i9-12900K
RAM: 64 GB DDR5
Framework: PyTorch 2.2, CUDA 12.1
OS: Ubuntu 22.04 LTS
Training and evaluation pipelines were parallelized for user-specific fine-tuning using PyTorch Lightning.
4.3. Quantitative Results
The overall detection performance of all evaluated methods is summarized in Table 3.
Key Findings:
Generative Models Outperform Baselines: Both the VAE and Transformer Autoencoder significantly outperform static rules, IF, OC-SVM, and the LSTM-AE baseline. The Transformer AE achieved the highest F1-Score (0.66) and AUPRC (0.59), demonstrating the effectiveness of self-attention in modeling long-range dependencies in user behavior sequences.
VAE as a Strong and Efficient Alternative: The VAE provides a strong balance between accuracy and efficiency, achieving an F1-Score of 0.61 with 40% fewer parameters and over 2× faster training time than the Transformer AE. Its probabilistic latent space improves robustness to noise and limited user-specific data.
Role of Personalization and Concept Drift Adaptation:
The ablation study shows that removing personalization results in a significant drop in AUPRC (from 0.59 to 0.50). Ignoring concept drift also reduces performance, confirming that user-specific behavioral baselines and periodic adaptation are essential for stable, long-term detection accuracy.
Figure 2 compares the evaluated methods in terms of precision–recall curves and F1-scores.
Impact of Personalization and Adaptation: An ablation study (Table 4) demonstrates the critical role of personalization. Using a single global model for all users resulted in a 15% relative drop in AUPRC for the Transformer AE, highlighting that user-specific behavioral baselines are essential for accuracy. The absence of concept drift adaptation also led to noticeable performance degradation.
Hyperparameter Sensitivity Analysis: To justify key design choices, we conducted sensitivity analyses on critical hyperparameters:
β-value for VAE: Several values of β were tested. A value of β = 0.1 achieved the best balance between reconstruction quality and latent regularization (F1 = 0.61), while higher values degraded reconstruction capability.
Sequence Length: Window lengths in the 50–150 event range were evaluated. A sequence length of 100 provided optimal context for both architectures, with shorter windows missing long-range dependencies and longer windows introducing noise.
Personalization Window: Several recent-event window sizes were tested for fine-tuning. A window of 10,000 events provided sufficient recent context without overfitting to temporary behavioral fluctuations.
Adaptation Frequency: Weekly retraining maintained stable performance (F1 = 0.66), while monthly adaptation resulted in an 8% performance degradation due to accumulated concept drift.
4.4. In-Depth Discussion of Model Performance
The superior performance of the Transformer Autoencoder can be attributed to its self-attention mechanism, which effectively captures long-range dependencies across a user’s event sequence. For instance, it can link an anomalous file download to an unusual log-on event that occurred several days prior, a pattern that recurrent models like LSTM often fail to retain. The VAE, while slightly less accurate, offers a compelling alternative. Its probabilistic formulation and more compact latent space make it inherently more robust to noise and overfitting. This is particularly beneficial during the personalization phase, where it can generalize effectively from a user’s limited recent data. As evidenced by the efficiency analysis (Table 5), the VAE’s lower computational cost makes it highly practical for large-scale deployments where models must be frequently updated for thousands of users.
4.5. Computational Efficiency Analysis
A key practical consideration is the trade-off between performance and resource consumption. As summarized in Table 5, we compared the training time and the number of parameters for the deep learning models. The VAE offers a significant efficiency advantage, training over 2× faster than the Transformer AE and using 40% fewer parameters, while still delivering competitive performance. This makes the VAE particularly suitable for environments with computational constraints or where models must be frequently retrained for many users.
4.6. Privacy and Interpretability Analysis
Quantitative Privacy: We measured the Mutual Information (MI) between the latent representation Z and a sensitive input feature (resource_ID). The VAE’s MI was 0.048 nats, significantly lower than the deterministic LSTM-AE’s 0.351 nats. This confirms that the stochastic latent space of the VAE effectively obfuscates raw data details, providing inherent privacy.
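The mutual-information measurement above can be approximated with a plug-in estimator over discretized values; the following sketch assumes both the latent codes and the sensitive feature have already been binned to discrete symbols (function name is ours):

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in nats) between two
    discrete sequences, e.g. binned latent codes and a sensitive
    categorical feature such as resource_ID."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum((c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in pxy.items())
```

Independent variables yield an estimate near zero nats, while a perfectly predictive latent code yields the entropy of the sensitive feature, matching the interpretation of the reported 0.048 vs. 0.351 nats.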
Privacy Attack Resilience Evaluation: To quantitatively validate our privacy claims beyond mutual information, we simulated two canonical privacy attacks against our trained VAE and Transformer AE models, following the taxonomy outlined by Padariya et al. [17]:
Membership Inference Attack (MIA): We trained a binary attack classifier (a 3-layer MLP) to distinguish between the latent representations of 10,000 training samples and 10,000 held-out test samples.
Reconstruction Attack: For a subset of 1000 sequences, we attempted to reconstruct the raw input features from the latent representation Z using a dedicated reconstruction attacker network (a 2-layer LSTM). We measured the success using Mean Absolute Error (MAE) between the original and attacker-reconstructed features.
The threat model assumes a black-box attacker with access to latent representations z but not to model parameters; the MIA classifier and the LSTM reconstruction network described above are trained entirely on these representations.
As shown in Table 6, both models demonstrate strong resilience against privacy attacks. The MIA accuracy for both models is close to 50% (random guessing), indicating that the latent representations do not leak membership information. Furthermore, the reconstruction MAE is significantly high, confirming that raw sensitive data cannot be accurately recovered from the model’s latent space. The VAE’s stochastic nature provides it with a marginal advantage in this regard.
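The MIA success reported here can be computed as balanced attack accuracy over equal-sized member and non-member sets; a sketch with our naming, where a prediction of 1 means "classified as a training member":

```python
def mia_accuracy(member_preds, nonmember_preds):
    """Balanced membership-inference attack accuracy: mean of the member
    true-positive rate and the non-member true-negative rate.
    A value near 0.5 corresponds to random guessing, i.e. no leakage."""
    tp = sum(member_preds) / len(member_preds)
    tn = (len(nonmember_preds) - sum(nonmember_preds)) / len(nonmember_preds)
    return 0.5 * (tp + tn)
```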
Qualitative Interpretability: Figure 3 shows the attention weights from the Transformer AE for an anomalous sequence. The model correctly identifies a critical pattern: a log-on event at an unusual hour, followed by a file download of a large document, and an email event with an attachment. This provides security analysts with a transparent, human-readable rationale for the alert, moving beyond a “black box” prediction.
While our latent-space design shows strong empirical privacy, it does not provide formal differential privacy guarantees. Future work could incorporate DP-SGD training or functional mechanisms for stronger theoretical protection against adaptive adversaries with full model access.
5. Limitations and Generalizability
Our evaluation relies on the CERT v5.2 dataset, which, while comprehensive, is synthetic and may not fully capture the complexity and noise of real enterprise logs. The simulated attacks follow predefined scenarios, potentially lacking the subtlety of real insider threats. Generalization to real-world environments requires validation on operational data with natural behavioral variations and considerations for false positive tolerance.
Additionally, maintaining separate models for thousands of users incurs storage and computational overhead. While the VAE offers efficiency advantages, enterprise-scale deployment would benefit from parameter-efficient fine-tuning techniques or model compression for resource-constrained environments.
6. Conclusions and Future Work
This paper introduced an adaptive and privacy-preserving generative framework for insider threat detection, leveraging the strengths of Variational Autoencoders and Transformer Autoencoders to model user behavioral sequences at scale. The integration of personalization, drift-aware learning, and latent-space privacy provides a comprehensive approach for detecting subtle and evolving insider threats. Empirical results on the CERT dataset demonstrate that the proposed models significantly outperform classical anomaly detection methods, offering higher accuracy, improved robustness, and lower false-positive rates. The privacy analysis further confirms that the framework mitigates the risk of sensitive data reconstruction and exposure, an essential requirement for real-world deployment in regulated environments.
This aligns with broader findings that highlight the importance of AI-driven and privacy-conscious security solutions in enterprise ecosystems [3].
Future work will explore (i) the integration of federated learning to eliminate the need for centralized log aggregation, (ii) the use of graph-based models to capture relational user behavior in coordinated insider attacks, and (iii) the incorporation of formal differential privacy guarantees to strengthen protection against adversarial inference. Overall, this work contributes an effective generative AI approach for advancing secure, privacy-aware, and adaptive insider threat detection in modern digital ecosystems.
Funding
This research received no external funding.
Data Availability Statement
The data used in this study are derived from the CERT Insider Threat Dataset v5.2, provided by the Software Engineering Institute (SEI), Carnegie Mellon University. The dataset is publicly available for research purposes subject to registration and approval via the SEI data repository. No new data were created in this study.
Acknowledgments
The project was funded by the KAU Endowment (WAQF) at King Abdulaziz University, Jeddah, Saudi Arabia. The authors therefore gratefully acknowledge WAQF and the Deanship of Scientific Research (DSR) for technical and financial support.
Conflicts of Interest
The author declares no conflicts of interest.
References
- Ponemon Institute. 2024 Cost of Insider Threats Global Report; Proofpoint & Ponemon Institute: Sunnyvale, CA, USA, 2024. Available online: https://www.proofpoint.com/us/resources/threat-reports/cost-of-insider-threats (accessed on 13 December 2025).
- Arifin, M.M.; Ahmed, M.S.; Ghosh, T.K.; Zhuang, J.; Yeh, J.-H. A Survey on the Application of Generative Adversarial Networks in Cybersecurity: Prospective, Direction and Open Research Scopes. arXiv 2024, arXiv:2407.08839. [Google Scholar] [CrossRef]
- Kariyawasam, K.; Khalil, I.; Dahal, K. AI-Driven Security Solutions for Enterprise and IoT Systems: A Systematic Review. Future Internet 2022, 14, 112. [Google Scholar] [CrossRef]
- Cappelli, R.S.; Moore, A.P.; Shimeall, T.J. Common Sense Guide to Mitigating Insider Threats; Carnegie Mellon University, Software Engineering Institute: Pittsburgh, PA, USA, 2012. [Google Scholar]
- Szarmach, J. Privacy-Enhancing and Privacy-Preserving Technologies in AI: Enabling Data Use and Operationalizing Privacy by Design and Default. White Paper, March 2025. Available online: https://www.aigl.blog/privacy-enhancing-and-privacy-preserving-technologies-in-ai/ (accessed on 13 December 2025).
- Bhuyan, M.H.; Bhattacharyya, D.K.; Kalita, J.K. Network Anomaly Detection: Methods, Systems and Tools. IEEE Commun. Surv. Tutor. 2014, 16, 303–336. [Google Scholar] [CrossRef]
- Liu, F.T.; Ting, K.M.; Zhou, Z.-H. Isolation Forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining (ICDM), Pisa, Italy, 15–19 December 2008; pp. 413–422. [Google Scholar] [CrossRef]
- Schölkopf, B.; Platt, J.; Shawe-Taylor, J.; Smola, A.; Williamson, R. Estimating the Support of a High-Dimensional Distribution. Neural Comput. 2001, 13, 1443–1471. [Google Scholar] [CrossRef]
- Liu, L.; De Vel, O.; Chen, C.; Zhang, J.; Xiang, Y. Anomaly-Based Insider Threat Detection Using Deep Autoencoders. In Proceedings of the 2018 IEEE International Conference on Data Mining Workshops (ICDMW) Singapore, 17–20 November 2018; pp. 39–48. [Google Scholar] [CrossRef]
- Zhou, C.; Paffenroth, R.C. Anomaly Detection with Robust Deep Autoencoders. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’17), Halifax, NS, Canada, 13–17 August 2017; pp. 665–674. [Google Scholar] [CrossRef]
- Saminathan, K.; Mulka, S.T.R.; Damodharan, S.; Maheswar, R.; Lorincz, J. An Artificial Neural Network Autoencoder for Insider Cyber Security Threat Detection. Future Internet 2023, 15, 373. [Google Scholar] [CrossRef]
- Kingma, D.P.; Welling, M. Auto-Encoding Variational Bayes. arXiv 2013, arXiv:1312.6114. Available online: https://arxiv.org/abs/1312.6114 (accessed on 17 December 2025).
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Polosukhin, I. Attention Is All You Need. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS 2017), Long Beach, CA, USA, 4–9 December 2017; Available online: https://arxiv.org/abs/1706.03762 (accessed on 17 December 2025).
- Xu, J.; Wang, X.; Yang, Q.; Pei, J.; Li, P.; Yu, Z. Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks. In Proceedings of the 41st International ACM SIGIR Conference, Ann Arbor, MI, USA, 8–12 July 2018; pp. 95–104. [Google Scholar] [CrossRef]
- Kwon, D.; Kim, H.; Kim, J.; Suh, S.C.; Kim, I.; Kim, K.J. A Survey of Deep Learning-Based Network Anomaly Detection. Cluster Computing 2019, 22, 949–961. [Google Scholar] [CrossRef]
- Yuan, S.; Wu, X. Deep Learning for Insider Threat Detection: Review, Challenges and Opportunities. Comput. Secur. 2021, 104, 102221. [Google Scholar] [CrossRef]
- Padariya, D.; Zhou, Y.; Hellmann, M.; Rueda, A.; Michalska, S.; Tiffin, N.; Rojas, G. Privacy-Preserving Generative Models: A Comprehensive Survey. arXiv 2025, arXiv:2502.03668. Available online: https://arxiv.org/abs/2502.03668 (accessed on 17 December 2025).
- El Mestari, S.Z.; Lenzini, G.; Demirci, H. Preserving Data Privacy in Machine Learning Systems. Comput. Secur. 2024, 137, 103605. [Google Scholar] [CrossRef]
- Liu, Y.; Huang, J.; Li, Y.; Wang, D.; Xiao, B. Generative AI Model Privacy: A Survey. Artif. Intell. Rev. 2025, 58, 33. [Google Scholar] [CrossRef]
- Lindauer, B. Insider Threat Test Dataset (CERT v5.2). Software Engineering Institute, Carnegie Mellon University: Pittsburgh, PA, USA, 2020. [Google Scholar] [CrossRef]
- Zhou, B.; Liu, S.; Hooi, B.; Cheng, X.; Ye, J. BeatGAN: Anomalous Rhythm Detection Using Adversarially Generated Time Series. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019), Macao, China, 10–16 August 2019; pp. 4433–4439. [Google Scholar] [CrossRef]
- Schlegl, T.; Seeböck, P.; Waldstein, S.M.; Schmidt-Erfurth, U.; Langs, G. AnoGAN: Unsupervised Anomaly Detection with Generative Adversarial Networks to Guide Marker Discovery. In Proceedings of the Information Processing in Medical Imaging (IPMI 2017), Boone, NC, USA, 25–30 June 2017; pp. 146–157. [Google Scholar] [CrossRef]