Adaptive Intervention Architecture for Psychological Manipulation Detection: A Culture-Specific Approach for Adolescent Digital Communications

Yoon, Sungwook; Kim, Byungmun

doi:10.3390/info16050379


by Sungwook Yoon * and Byungmun Kim †

Gyeongbuk Development Institute, 201, Docheong-daero, Homyeong-eup, Yecheon-gun 36849, Gyeongsangbuk-do, Republic of Korea

* Author to whom correspondence should be addressed.

† GKNU, 1375, Gyeongdong-ro, Andong-si 36729, Gyeongsangbuk-do, Republic of Korea.

Information 2025, 16(5), 379; https://doi.org/10.3390/info16050379

Submission received: 24 March 2025 / Revised: 11 April 2025 / Accepted: 28 April 2025 / Published: 2 May 2025

(This article belongs to the Section Information and Communications Technology)


Abstract

This study introduces a novel artificial intelligence system for detecting and addressing psychological manipulation in digital communications, with a focus on adolescents. The system integrates a hybrid neural network model with emotion analysis capabilities specifically designed for Korean language contexts. Our approach combines text analysis with emotion recognition to enhance detection accuracy while implementing a tiered intervention strategy based on risk levels. The system demonstrated significant improvements over baseline models in detecting various forms of psychological manipulation, particularly in identifying subtle patterns. Our expert evaluation suggests the system’s potential effectiveness in protecting adolescent mental health in digital environments. While primarily focused on adolescents, the findings indicate broader applicability across age groups. This research contributes to the field by offering a culturally adapted framework for psychological manipulation detection, a multimodal analytical approach, and an ethically designed intervention system.

Keywords: gaslighting detection; adolescent mental health; natural language processing; real-time intervention; BERT-LSTM; Korean text analysis


1. Introduction

The rapid proliferation of digital communication platforms has fundamentally transformed how adolescents interact socially. According to recent statistics, over 95% of teenagers own smartphones and spend an average of 6–7 h per day engaged in online communication (Ministry of Science and ICT, 2023) [1]. This shift from face-to-face to digital interaction has created new vulnerabilities, particularly regarding psychological manipulation tactics such as gaslighting.

Gaslighting—a form of psychological manipulation where victims are made to question their perceptions, memories, and sanity—poses a significant threat to adolescent mental health. The anonymous and non-face-to-face nature of online communication facilitates manipulative behaviors, as perpetrators can more easily employ tactics of denial, contradiction, and emotional manipulation without immediate accountability. Research indicates that 34% of adolescents report experiencing online gaslighting (Statistics Korea, 2023) [2], contributing to serious psychological consequences, including diminished self-esteem, anxiety, depression, and potentially long-term trauma.

Adolescents are particularly vulnerable to gaslighting during their developmental stage due to three key factors: their critical period of identity formation makes them susceptible to manipulation that causes self-doubt, their heightened sensitivity to peer acceptance increases vulnerability to peer-based psychological manipulation, and their still-developing critical reasoning abilities limit their capacity to identify and resist sophisticated manipulation tactics. While current approaches to addressing gaslighting mainly rely on post-hoc analysis and counseling, these methods fail to provide immediate protection during ongoing harmful interactions, making the development of real-time detection and intervention systems a crucial advancement in protecting adolescent mental health in digital environments [3].

As shown in Figure 1, gaslighting experiences are widely distributed among young adults, with the highest frequency in the VGQ score range of 40–50. This suggests that vulnerability to psychological manipulation formed during adolescence may continue and intensify into young adulthood. Adolescents may have limited ability to clearly recognize or report gaslighting experiences, potentially resulting in underestimation of actual occurrence rates. Considering that digital communication begins at increasingly younger ages in the current environment, establishing active gaslighting detection and response systems from adolescence is an essential step in protecting the mental health of future young adults.

While adolescents demonstrate unique vulnerabilities to gaslighting due to developmental factors, it is important to recognize that gaslighting affects individuals across the lifespan. Recent studies indicate that approximately 45.6% of young adults report experiencing gaslighting [4], and manipulation tactics occur in various relationships, including romantic partnerships, friendships, and family dynamics, regardless of age [5]. This research primarily focuses on adolescents while acknowledging the potential for broader application across age groups.

The Korean language presents unique challenges for gaslighting detection due to its agglutinative features (morpheme combinations), complex honorific system, and context-dependent expressions with frequent subject omission. These linguistic characteristics create opportunities for psychological manipulation and directly influence our model design, leading us to include morpheme-based tokenization and contextual restoration algorithms. For example, the hierarchical honorific system can be exploited to establish power dynamics, while subject omission can be used to avoid responsibility in manipulative contexts.

Current gaslighting detection systems face significant limitations when applied to Korean language contexts. These systems, primarily developed for English language data, often rely on simple keyword-based detection or basic text classification methods. Our research overcomes these limitations by presenting an innovative approach that combines (1) a Korean-specific BERT-LSTM hybrid model [6], (2) a multimodal integration of text and emotional data [7], and (3) a risk-stratified intervention system.

This research developed specialized techniques to effectively process Korean linguistic features, focusing on addressing unique challenges in gaslighting detection. To handle the frequent omission of subjects or objects in Korean sentences that could be exploited to evade responsibility, we developed a restore ellipsis function that analyzes the context of previous conversations. This advanced natural language processing system combines a Korean-optimized BERT-LSTM neural network architecture with emotional context analysis, incorporating specialized pattern recognition algorithms for analyzing intentional changes in honorific levels that may establish power dynamics [8]. The system features a tiered intervention approach with ethical considerations and appropriate risk assessment, designed primarily for adolescent populations while maintaining the potential for broader application across various developmental stages.

2. Related Work

2.1. Gaslighting: Definition and Characteristics

Gaslighting, originating from the 1944 film “Gaslight”, is a form of psychological manipulation where perpetrators deliberately plant seeds of doubt in their targets’ minds, causing them to question their own reality. This manipulation operates through three primary mechanisms: reality distortion through continuous denial of experiences, emotional manipulation by invalidating responses, and establishment of power dynamics through psychological dominance. These tactics create a complex web of control that gradually erodes the victim’s sense of self and autonomy (Stark, 2019) [9].

Adolescents are particularly susceptible to gaslighting due to their developmental stage and the unique challenges of digital environments. Their ongoing identity formation and developing cognitive abilities make them vulnerable to manipulation, while digital platforms amplify these risks through limited non-verbal cues, permanent records that can be manipulated, and continuous access across multiple channels. The combination of developmental vulnerability and digital environmental factors creates a perfect storm that makes adolescents especially susceptible to the harmful effects of gaslighting (Sweet, 2019) [10]. Korean adolescents face culturally specific manifestations of gaslighting that leverage social hierarchies and collectivist values. For instance, appeals to age-based authority (“As your senior, I’m telling you…”) or group consensus (“Everyone else understands this…”) represent culturally embedded gaslighting tactics [11].

2.2. NLP Approaches to Harmful Dialogue Detection

Natural language processing approaches to detecting harmful dialogue have evolved substantially over the past decade. Initial methods relied primarily on rule-based systems and machine-learning classifiers applied to manually engineered features. Philipo, A.; Sarwatt et al. (2024) developed a cyberbullying detection system for social media texts that combined emotional feature extraction with linguistic pattern recognition [12], achieving 87.3% accuracy. However, this approach required extensive domain knowledge for feature engineering and struggled with the contextual nature of harmful language. With the advancement in deep learning, more sophisticated models emerged. Abdali, S. (2022) proposed a multimodal approach that integrated text, image, and metadata for harmful content detection, improving performance by 9.5 percentage points over text-only methods [13]. Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have shown promise in capturing both local features and sequential patterns in harmful dialogue detection [14].

The advent of transformer-based models, particularly BERT (Bidirectional Encoder Representations from Transformers), has revolutionized natural language processing. BERT’s pre-training on massive text corpora enables it to capture rich contextual representations [15], making it particularly valuable for nuanced tasks like gaslighting detection. For Korean language processing, several specialized BERT variants have been developed [16].

KoBERT, developed by SKT, is pre-trained on a large Korean corpus and incorporates a morpheme-based tokenizer to address Korean’s agglutinative nature. This model has demonstrated superior performance in various Korean NLP tasks, including sentiment analysis and text classification (Song, J.; Kim, B. et al., 2023) [17]. Similarly, KR-BERT utilizes sub-word tokenization optimized for Korean, showing improved performance in handling Korean-specific linguistic phenomena [18].

Long Short-Term Memory (LSTM) networks, a specialized form of recurrent neural networks, excel at capturing sequential dependencies in text. Their ability to model long-range dependencies through specialized gating mechanisms makes them particularly suitable for dialogue analysis, where context across multiple turns is crucial for accurate interpretation. In Korean NLP applications, LSTMs have shown strong performance in tasks requiring sequence understanding, such as dialogue state tracking and conversational analysis (Park, S. 2020) [19]. While both BERT and LSTM models offer distinct advantages, they also have complementary strengths. BERT excels at capturing rich contextual representations but may struggle with very long sequences due to its fixed input length. Conversely, LSTMs are specifically designed to model sequential data but may not capture the same depth of contextual information as transformer-based models. A hybrid approach combining these architectures leverages BERT’s powerful contextual embeddings with LSTM’s sequence modeling capabilities.

2.3. Emotion Analysis and Multimodal Approaches

Emotion analysis in text has evolved from simple lexicon-based approaches to sophisticated deep-learning models. While early methods relied on emotion dictionaries and basic polarity analysis, recent approaches employ neural networks and attention mechanisms that better capture contextual emotional information [20]. Korean emotion analysis presents unique challenges due to linguistic and cultural factors: the complex honorific system creates nuanced emotional patterns, cultural preferences for indirect expression produce subtle emotional cues, and culturally specific expressions reflect Korean social values. These factors necessitate specialized approaches beyond English-centric methods when analyzing manipulation tactics in Korean adolescent communications.

Multimodal approaches that integrate text analysis with other data sources have shown promising results. Zadeh et al. (2018) [21] developed a multimodal deep-learning model that integrated text, audio, and video data for emotion recognition, achieving a 12.6% improvement over unimodal approaches [22]. In the Korean context, Jung, M et al. (2022) proposed a culturally adapted multimodal emotion analysis system that accounts for Korean-specific emotional expression patterns [23].

For gaslighting detection, multimodal approaches offer particular advantages. Since gaslighting often involves discrepancies between stated content and emotional undertones, systems that can analyze both textual content and emotional signals may achieve superior detection performance. Furthermore, the integration of contextual information about speaker relationships and conversation history can help identify manipulative patterns that might not be apparent from isolated utterances.

3. Methodology

3.1. Research Design Overview

This research adopts a systematic approach to developing and evaluating an NLP-based gaslighting detection and intervention system for Korean adolescent communications.

Our methodology comprises three phases. The first, data collection and preparation, involves acquiring and preprocessing the AI Hub datasets, including gaslighting annotation and data cleaning. The second, model development and implementation, covers the design of the BERT-LSTM hybrid architecture and the integration of emotion tag data with a tiered intervention system. The third, evaluation and validation, combines quantitative performance metrics with qualitative expert assessment to evaluate the system’s technical performance and practical utility through detection accuracy testing, baseline model comparisons, expert validation, and limited user testing.

3.2. Gaslighting Pattern Detection Techniques

Our research implemented specialized techniques for detecting gaslighting patterns in Korean adolescent digital communications. The system identifies five major categories of gaslighting behavior through a combination of linguistic analysis, emotional pattern recognition, and cultural context processing.

3.2.1. Classification of Major Gaslighting Types

The model was trained to detect five primary gaslighting types identified through expert analysis (Table 1):

Reality distortion: patterns that deny or distort the target’s experiences or memories, characterized by contradictions and false assertions.
Emotional manipulation: patterns that invalidate or manipulate the target’s emotional responses, including emotional invalidation and strategic emotional triggering.
Blame shifting: patterns that redirect responsibility from the manipulator to the target, often featuring accusatory language and responsibility inversion.
Isolation: patterns that socially isolate the target, detected through references to social exclusion and undermining of outside relationships.
Gradual intensity: patterns that incrementally increase manipulation intensity over time, measured through longitudinal conversation analysis.

3.2.2. Linguistic Feature Analysis

For detecting Korean-specific gaslighting patterns, our linguistic analysis focuses on two key features that are particularly relevant in Korean communication dynamics. First, we examine honorific-informal speech transition patterns, where sudden shifts between speech forms can indicate power relationship manipulation, with our system actively monitoring honorific markers and their unexpected changes. Second, we analyze subject omission patterns, a common characteristic in the Korean language that can be exploited for responsibility avoidance in manipulative contexts, utilizing a specialized context restoration algorithm to identify these instances.
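The honorific-transition monitoring described above can be illustrated with a deliberately naive sketch. It assumes that a small, hypothetical list of polite sentence endings (`POLITE_ENDINGS`) is enough to separate polite from informal speech, which is a strong simplification; the function names are illustrative and are not taken from the system itself.

```python
import re

# Hypothetical, deliberately small set of polite sentence endings.
# A real system would rely on morphological analysis instead.
POLITE_ENDINGS = ("요", "니다", "니까")


def is_polite(utterance: str) -> bool:
    """Return True if the utterance ends with one of the polite endings."""
    # Drop trailing whitespace and punctuation before checking the ending.
    stripped = re.sub(r"[\s.!?~…]+$", "", utterance)
    return stripped.endswith(POLITE_ENDINGS)


def honorific_shifts(utterances):
    """Return indices where the speech level differs from the previous turn."""
    levels = [is_polite(u) for u in utterances]
    return [i for i in range(1, len(levels)) if levels[i] != levels[i - 1]]


# Three turns by the same speaker: polite, polite, then informal.
turns = ["안녕하세요", "어제 일은 죄송합니다", "네가 잘못 기억하는 거야"]
print(honorific_shifts(turns))
```

A shift on its own is not evidence of manipulation; in this sketch it only marks turns that may deserve closer analysis by the downstream model.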

The system employs two key detection methods for manipulative behavior: first, it analyzes discrepancies between textual content and emotion tags to identify subtle manipulation and potential invalidation patterns, and second, it uses regular expression patterns to identify phrases that deny or minimize the target’s experiences, such as “That never happened”, “You seem too sensitive”, or “It was just a joke”, which are common indicators of gaslighting behavior.
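As a minimal sketch of the regular-expression step, the snippet below matches the three English example phrases quoted above. The actual system operates on Korean text and its pattern list is not given here, so `DENIAL_PATTERNS` and `find_denial_phrases` are illustrative assumptions only.

```python
import re

# Illustrative patterns built from the example phrases in the text.
# The production pattern set (in Korean) is not specified in the paper.
DENIAL_PATTERNS = [
    re.compile(r"\bthat never happened\b", re.IGNORECASE),
    re.compile(r"\byou seem too sensitive\b", re.IGNORECASE),
    re.compile(r"\bit was just a joke\b", re.IGNORECASE),
]


def find_denial_phrases(text: str) -> list:
    """Return the lower-cased denial or minimization phrases found in text."""
    hits = []
    for pattern in DENIAL_PATTERNS:
        match = pattern.search(text)
        if match:
            hits.append(match.group(0).lower())
    return hits


print(find_denial_phrases("Relax, it was just a joke."))
```

Keyword matches of this kind are only one signal; as described above, they are combined with the discrepancy analysis between text and emotion tags.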

3.2.3. Emotional Transition Analysis

The system employs sophisticated emotional state tracking throughout conversations to detect gaslighting indicators by monitoring transitions from confidence or neutral states to confusion, anxiety, or doubt in the conversation partner, which serve as potential manipulation indicators. Additionally, the system performs a detailed analysis of emotional changes that occur after potential manipulation attempts through a specialized algorithm that evaluates these behavioral patterns. Figure 2 presents the overall framework of our research methodology.

3.2.4. Cultural Context Consideration

The system incorporates Korean-specific cultural context by analyzing three key patterns in adolescent communication: group conformity pressure, age-based hierarchy, and indirect expression patterns. The selection of these specific cultural factors was based on a comprehensive literature review of cross-cultural communication studies and a preliminary qualitative study we conducted with fifteen adolescent counselors and eight cultural psychologists specializing in East Asian communication patterns.

In addition to these primary cultural factors, we incorporated considerations of growth environment variables that influence communication patterns. These include (1) Urban versus rural upbringing contexts, which affect exposure to diverse communication styles; (2) Family communication dynamics, particularly the presence of traditional versus progressive parenting approaches; and (3) Educational environment factors, including the degree of emphasis on hierarchical relationships in school settings. These growth environment variables were integrated into our model as contextual modifiers that adjust the weight of primary cultural factors during analysis.

3.3. Dataset Construction and Preprocessing

This study utilizes two primary datasets from AI Hub, Korea’s national AI platform, in accordance with AI Hub’s terms of service for academic research (Article 7.2 of AI Hub Data Usage Agreement). All data usage complies with Korean Personal Information Protection Act (PIPA) regulations, with proper anonymization protocols maintained throughout the research process.

(1): Korean SNS Multi-Turn Conversation Dataset

This dataset contains 8742 conversation sessions, averaging 15.3 utterances per session. The conversations span daily life, school, hobbies, and interpersonal relationships, capturing typical adolescent interactions. Each utterance includes metadata on speaker information, timestamp, and speech intent. Teenage users (13–19 years) comprise 63% of utterances, with early 20s users making up the remainder.

(2): Emotion-Tagged Adolescent Free Conversation Dataset

This dataset includes 9477 conversation sessions (7828 indoor, 1649 outdoor). Each utterance is tagged with one of seven basic emotions (joy, sadness, anger, fear, surprise, disgust, neutral) and three intensity levels (weak, moderate, strong). These emotion tags provide essential context for analyzing manipulation patterns. The dataset predominantly features adolescents (65%), with additional participants from children (15%) and young adults/adults (20%). This age diversity enables us to analyze gaslighting patterns across developmental stages while maintaining our adolescent focus. The broader age range strengthens the model’s generalizability and facilitates comparative analysis of manipulation tactics. Since neither dataset included explicit gaslighting labels, we developed a systematic labeling strategy:

For dataset labeling, we implemented a comprehensive two-phase approach. In the initial expert labeling phase, a panel of seven domain specialists (two clinical psychologists, two counseling specialists, and three adolescent counselors) developed detailed annotation guidelines for the five major gaslighting types: reality distortion, emotional manipulation, blame-shifting, isolation, and gradual intensity. These experts independently annotated approximately 10% (1500 conversations) of the dataset, achieving substantial inter-rater reliability (Cohen’s Kappa = 0.78). To efficiently scale our labeled dataset while maintaining quality, we employed a rigorous semi-supervised learning methodology. Using the expert-annotated corpus as training data, we developed an initial classifier based on a simplified BERT architecture. This model then processed the remaining unlabeled conversations, with predictions stratified into three confidence tiers:

High confidence (>0.85): These predictions (5124 conversations, 58.6% of total) were automatically incorporated as new labels.
Medium confidence (0.7–0.85): These predictions (2118 conversations, 24.2% of total) underwent expert review before inclusion.
Low confidence (<0.7): These predictions were excluded from the training data.
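The three confidence tiers can be sketched as a simple routing function. The thresholds come from the list above; the function name, the tier labels, and the treatment of the exact boundary values (0.7 and 0.85 fall into the medium tier) are assumptions made for illustration.

```python
def route_prediction(confidence: float, high: float = 0.85, medium: float = 0.7) -> str:
    """Map a classifier confidence score to a labeling action."""
    if confidence > high:
        return "auto_label"      # high confidence: accept the predicted label
    if confidence >= medium:
        return "expert_review"   # medium confidence: send to an expert
    return "exclude"             # low confidence: keep out of the training data


def partition(predictions):
    """Group (conversation_id, confidence) pairs by labeling action."""
    tiers = {"auto_label": [], "expert_review": [], "exclude": []}
    for conversation_id, confidence in predictions:
        tiers[route_prediction(confidence)].append(conversation_id)
    return tiers


print(partition([("c1", 0.93), ("c2", 0.78), ("c3", 0.41)]))
```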

This iterative process was repeated three times, with each iteration refining the classifier on the expanded labeled dataset. To validate the quality of the semi-supervised labeling, we performed a final random sampling of 200 conversations for expert review, confirming 92% agreement with the algorithmic labels. The final labeled dataset distribution was as follows (Table 2):

As shown in Table 2, our dataset exhibits a natural class imbalance, with reality distortion and emotional manipulation accounting for 70.9% of instances, while isolation and gradual intensity represent only 7.5%. This distribution reflects the real-world prevalence of different gaslighting types in adolescent digital communications, as confirmed by our expert panel. To address potential model bias from this imbalance while maintaining ecological validity, we implemented a balanced approach: (1) using a weighted loss function during training to give more attention to minority classes and (2) applying stratified sampling in our cross-validation procedure. We deliberately avoided excessive synthetic data generation, as it might introduce artifacts that do not represent genuine manipulation patterns. Instead, we carefully tuned our model’s confidence thresholds for minority classes during the evaluation phase. This approach acknowledges the inherent limitations of working with naturally distributed phenomena while ensuring reasonable detection performance across all gaslighting types. The evaluation results in Section 5 are reported with these considerations in mind.
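One common way to build a weighted loss is inverse-frequency class weighting, sketched below. The paper does not state which weighting formula was used, and the class counts in the example are hypothetical placeholders rather than the values from Table 2.

```python
def inverse_frequency_weights(class_counts: dict) -> dict:
    """Compute w_c = N / (K * n_c), so that rarer classes get larger weights."""
    total = sum(class_counts.values())
    num_classes = len(class_counts)
    return {
        label: total / (num_classes * count)
        for label, count in class_counts.items()
    }


# Hypothetical counts for illustration only (not the Table 2 distribution).
counts = {
    "reality_distortion": 400,
    "emotional_manipulation": 300,
    "blame_shifting": 200,
    "isolation": 60,
    "gradual_intensity": 40,
}
weights = inverse_frequency_weights(counts)
print(weights)
```

In a PyTorch setup such weights could be passed, in class-index order, as the `weight` argument of `torch.nn.CrossEntropyLoss`; whether the authors did exactly this is not stated.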

Text preprocessing incorporates several specialized approaches for handling Korean language features. First, adolescent slang and abbreviations are normalized using a comprehensive dictionary. Second, morphological analysis is performed using the Okt processor (KoNLPy 0.5.2) for effective tokenization of Korean’s agglutinative structure. Third, a context-aware algorithm restores frequently omitted subjects and objects in sentences, which could be exploited for gaslighting manipulation.
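The slang normalization step can be sketched as a dictionary lookup over whitespace-separated tokens. The three entries in `SLANG_MAP` are common Korean chat abbreviations chosen for illustration; the actual dictionary used in the study is not reproduced here, and the subsequent Okt morphological analysis is omitted because it requires KoNLPy and a Java runtime.

```python
# Hypothetical mini-dictionary; the study's dictionary is far larger.
SLANG_MAP = {
    "ㅇㅇ": "응",    # "yes"
    "ㄱㅅ": "감사",  # "thanks"
    "ㅈㅅ": "죄송",  # "sorry"
}


def normalize_slang(text: str, slang_map: dict = SLANG_MAP) -> str:
    """Replace whole-token slang entries with their standard forms."""
    tokens = text.split()
    return " ".join(slang_map.get(token, token) for token in tokens)


print(normalize_slang("ㅇㅇ 알겠어"))
```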

In our emotion data preprocessing phase, we standardized emotion labels into seven fundamental categories (joy, sadness, anger, fear, surprise, disgust, neutral) and implemented a normalized intensity scale (0.33 for weak, 0.67 for moderate, 1.0 for strong). We also analyzed emotion transitions throughout conversations to detect patterns indicative of potential manipulation. To address the dataset imbalance issue, we employed specialized data augmentation techniques. These included a Modified Easy Data Augmentation (EDA) adapted for Korean language characteristics and a Gaslighting Pattern Variation approach that used templates to maintain gaslighting intent while diversifying expressions. This comprehensive preprocessing pipeline enabled us to build a robust dataset that effectively captures both the linguistic subtleties of Korean adolescent communication and various gaslighting patterns.
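The intensity normalization and transition analysis can be sketched as follows. The numeric scale is the one stated above; the choice of which emotions count as a suspicious starting state (`START`) and ending state (`END`) is an assumption made for this example, since the mapping from the seven tagged emotions to states such as confusion or doubt is not specified.

```python
# Normalized intensity scale as stated in the text.
INTENSITY = {"weak": 0.33, "moderate": 0.67, "strong": 1.0}

# Assumed groupings for illustration only.
START = {"joy", "neutral"}
END = {"fear", "sadness"}


def encode(tags):
    """Convert (emotion, level) pairs to (emotion, normalized intensity)."""
    return [(emotion, INTENSITY[level]) for emotion, level in tags]


def flag_transitions(encoded, start=START, end=END):
    """Return (index, previous emotion, emotion, intensity) for each flagged shift."""
    flagged = []
    for i in range(1, len(encoded)):
        previous_emotion, _ = encoded[i - 1]
        emotion, intensity = encoded[i]
        if previous_emotion in start and emotion in end:
            flagged.append((i, previous_emotion, emotion, intensity))
    return flagged


tags = [("neutral", "weak"), ("fear", "moderate"), ("fear", "strong"), ("joy", "weak")]
print(flag_transitions(encode(tags)))
```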

3.4. System Architecture

The gaslighting detection and intervention system follows a comprehensive agent-based architecture designed to monitor, analyze, and respond to digital communications in real time. Figure 3 illustrates the system architecture.

The system architecture consists of two primary processing stages: data collection/preprocessing and analysis/intervention. The Data Collection Module captures real-time communication data from various digital platforms, monitoring conversation flows and extracting metadata such as timestamps and user IDs. The Preprocessing Module handles text normalization, tokenization, and context integration, implementing specialized Korean-specific processing, including morphological analysis and slang normalization essential for capturing nuanced manipulation patterns in adolescent communications.

The Analysis and Intervention stage constitutes the system’s core functionality. The AI Analysis Module serves as the primary analytical engine, employing a BERT-LSTM hybrid model with attention mechanisms to detect gaslighting by analyzing linguistic patterns and emotional context. Following detection, the Intervention Module generates appropriate responses based on the analysis results, implementing a three-tiered intervention strategy ranging from subtle awareness prompts to direct intervention depending on the severity of detected manipulation.

The system functions as a sophisticated AI agent with four key capabilities: perception abilities for monitoring and processing communication streams; reasoning capabilities through the BERT-LSTM model for pattern detection; decision-making processes for intervention strategies; and action implementation through user interfaces. Several innovative technical components enhance system performance, including a real-time processing pipeline optimized for low-latency analysis, multimodal integration of text and emotional indicators, context-aware processing that maintains conversation history, and adaptive intervention with risk-stratified response strategies that incorporate user feedback.

The implementation utilizes a robust technical stack with Python 3.9 and PyTorch 1.9.0 for deep-learning components. Key technologies include KoBERT for contextual embeddings, Bidirectional LSTM with attention mechanism for sequence processing, KoNLPy 0.5.2 for Korean language processing, PostgreSQL 13.4 for data management, and FastAPI 0.78.0 (based on Flask 2.0.0) for system interface. This architecture ensures both modularity and extensibility, facilitating system updates and integration across various digital platforms while maintaining performance optimization for real-time detection requirements.

3.5. BERT-LSTM Hybrid Model

At the heart of our gaslighting detection system lies a sophisticated BERT-LSTM hybrid model specifically engineered for Korean language processing. This innovative architecture combines BERT’s advanced contextual embedding capabilities with LSTM’s proficiency in capturing sequential patterns in conversations, creating a powerful tool for analyzing complex linguistic structures.

The model architecture is composed of five essential layers: a BERT Encoding Layer utilizing KoBERT pre-trained on a Korean corpus (embedding dimension: 768), a Bidirectional LSTM Layer (hidden state size: 256) for processing sequences in both directions, a Hierarchical Attention Mechanism (8 attention heads) for identifying manipulative patterns, an Emotion Integration Layer (emotion embedding size: 64) for emotional context analysis, and a Classification Layer making the final detection decisions through a feed-forward neural network with dropout regularization (rate: 0.2). The complete model pipeline is summarized in Equation (1):

\hat{y} = \sigma\left(W \cdot \left[\sum_i \alpha_i h_i \oplus e\right] + b\right) \qquad (1)

where h_i = \mathrm{LSTM}(\mathrm{BERT}(t_i)) is the hidden state from the LSTM after processing the BERT embedding of token t_i, \alpha_i are attention weights calculated using a learnable query vector q, e is the emotion embedding, \oplus denotes vector concatenation, \sigma is the sigmoid activation function, and W and b are the learnable weight matrix and bias parameters.

The structure of this architecture is visually represented in Figure 4.
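Equation (1) can be traced numerically with a small pure-Python sketch. It assumes that the attention scores are dot products between the query vector q and each hidden state, normalized with a softmax, and that W reduces to a single weight row for a binary output; neither detail is specified in the text, and the toy vectors stand in for real LSTM outputs.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def predict(hidden_states, query, emotion, weights, bias):
    """Evaluate y_hat = sigmoid(W . [sum_i alpha_i h_i (+) e] + b)."""
    # Assumed scoring rule: alpha = softmax(q . h_i).
    alphas = softmax([dot(query, h) for h in hidden_states])
    dim = len(hidden_states[0])
    # Attention-weighted sum of the hidden states.
    context = [
        sum(alpha * h[d] for alpha, h in zip(alphas, hidden_states))
        for d in range(dim)
    ]
    # Concatenate the pooled context with the emotion embedding.
    features = context + list(emotion)
    logit = dot(weights, features) + bias
    return 1.0 / (1.0 + math.exp(-logit))


h = [[1.0, 0.0], [0.0, 1.0]]  # toy hidden states
print(predict(h, query=[0.5, -0.5], emotion=[1.0], weights=[0.2, -0.1, 0.3], bias=0.0))
```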

This unified equation represents our comprehensive model architecture, integrating BERT embeddings, LSTM processing, attention mechanisms, and emotion embedding for final classification. The implementation employs state-of-the-art techniques, including 512-token chunk processing with sliding windows, progressive BERT layer unfreezing, and optimization using AdamW with linear learning rate decay. Our model training used five-fold cross-validation with early stopping based on F1 scores, completing in about 3 h on a GPT API 4.0.
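Sliding-window chunking of a long token sequence can be sketched as below. The 512-token window is taken from the text, while the stride of 256 is an assumption, since the overlap between windows is not reported; a stride no larger than the window is needed so that no tokens are skipped.

```python
def sliding_windows(token_ids, window=512, stride=256):
    """Split token_ids into overlapping chunks of at most `window` tokens."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    tokens = list(token_ids)
    if len(tokens) <= window:
        return [tokens]
    chunks = []
    start = 0
    while True:
        chunks.append(tokens[start:start + window])
        # Stop once the current window reaches the end of the sequence.
        if start + window >= len(tokens):
            break
        start += stride
    return chunks


# Small demonstration with a window of 4 and a stride of 2.
print(sliding_windows(range(10), window=4, stride=2))
```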

The implementation uses efficient processing parameters with a 128-token maximum sequence length, analyzing conversations both holistically and turn-by-turn. We employed a fine-tuning strategy with progressive BERT layer unfreezing and carefully selected hyperparameters (batch size: 8, learning rate: 2 × 10⁻⁵, dropout: 0.2). The model training was conducted in a CPU environment, taking approximately 45 min, with the dataset split into training (80%), validation (10%), and test (10%) sets. Performance evaluation used comprehensive metrics, including accuracy, precision, recall, and F1-score, with detailed results in Section 5.
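The 80/10/10 split can be sketched with a seeded shuffle. The seed value and the function name are assumptions, and this plain random split does not reproduce the stratified sampling used in the cross-validation procedure.

```python
import random


def split_dataset(items, train=0.8, val=0.1, seed=42):
    """Shuffle items reproducibly and split into train/validation/test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    train_set = shuffled[:n_train]
    val_set = shuffled[n_train:n_train + n_val]
    test_set = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train_set, val_set, test_set


train_set, val_set, test_set = split_dataset(range(100))
print(len(train_set), len(val_set), len(test_set))
```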

3.6. Emotion Tag Integration

Our system’s key innovation lies in the integration of emotion tag data with textual analysis, creating a multimodal approach that significantly enhances gaslighting detection accuracy. This integration operates through three primary mechanisms: emotion sequence analysis for tracking suspicious emotional transitions (such as shifts from confidence to confusion), emotion-text alignment for identifying discrepancies between textual content and emotional expression, and emotional response modeling for analyzing emotional changes following potential manipulation attempts.

To implement this framework, we developed three distinct methods for emotion tag integration: Early Fusion, where emotion tags are embedded and concatenated with token embeddings before BERT processing to enable early learning of text-emotion interactions; Feature-Level Fusion, which incorporates emotion embeddings after BERT processing but before LSTM analysis to maintain linguistic information while adding emotional context; and Decision-Level Fusion, which trains separate text and emotion analysis models and combines their outputs through a confidence-based gating mechanism.

Comparative experiments indicated that feature-level fusion achieved the best performance, balancing the preservation of linguistic nuance with effective emotion integration. This approach was implemented as described in Equation (1), where emotion embeddings are concatenated with BERT-processed tokens before LSTM processing, allowing the model to leverage both linguistic and emotional information in its sequential analysis.
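The concatenation step in feature-level fusion can be illustrated with a small, framework-free sketch. It assumes that a single emotion embedding per utterance is appended to every BERT output vector before the sequence is passed to the LSTM; the function name `feature_level_fusion` and the toy dimensions are illustrative assumptions, and a real implementation would operate on tensors rather than Python lists.

```python
from typing import List, Sequence

Vector = List[float]


def feature_level_fusion(
    token_vectors: Sequence[Vector], emotion_vector: Vector
) -> List[Vector]:
    """Append the emotion embedding to each contextual token vector.

    token_vectors stands in for BERT outputs (one vector per token);
    the fused sequence is what an LSTM would consume next.
    """
    return [list(tok) + list(emotion_vector) for tok in token_vectors]


# Toy example: two tokens with 3-dimensional vectors, 2-dimensional emotion embedding.
bert_out = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
emotion = [1.0, 0.0]
fused = feature_level_fusion(bert_out, emotion)
```

Each fused vector has the token dimension plus the emotion dimension (here 3 + 2 = 5), so the LSTM input size must be enlarged accordingly.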

To evaluate the contribution of emotion tag integration, we conducted ablation studies comparing text-only, emotion-only, and integrated models. The results, detailed in Section 5.1, demonstrate substantial performance improvements from emotion integration, particularly for subtle gaslighting cases.

3.7. Intervention System Design

The intervention system employs a risk-stratified approach to address detected gaslighting in real-time, guided by three fundamental principles: proportionality, where interventions scale according to the detected risk level; autonomy, ensuring users maintain control over intervention actions; and educational value, providing context and learning opportunities. This comprehensive framework is structured into three distinct tiers that correspond to different risk levels, enabling targeted and appropriate responses to various gaslighting situations.

Algorithm 1 presents the pseudocode for the intervention system, covering conversation analysis and feedback generation:

Algorithm 1. Intervention System Logic

function DETERMINEINTERVENTION(gaslightingProbability, conversationContext, userProfile)
    if gaslightingProbability ≥ 0.9 then
        riskLevel ← HIGH
    else if gaslightingProbability ≥ 0.7 then
        riskLevel ← MEDIUM
    else if gaslightingProbability ≥ 0.5 then
        riskLevel ← LOW
    else
        return null // No intervention needed
    end if
    intervention ← SELECTINTERVENTIONSTRATEGY(riskLevel)
    personalizedIntervention ← PERSONALIZEINTERVENTION(intervention, userProfile)
    if ISRECENTLYSIMILARINTERVENTION(personalizedIntervention, userProfile) then
        personalizedIntervention ← ADJUSTFORREPETITION(personalizedIntervention)
    end if
    alertMessage ← GENERATEALERTMESSAGE(personalizedIntervention, conversationContext)
    actionOptions ← GENERATEACTIONOPTIONS(riskLevel)
    return {
        "alertMessage": alertMessage,
        "actionOptions": actionOptions,
        "riskLevel": riskLevel,
        "deliveryMethod": DETERMINEDELIVERYMETHOD(riskLevel)
    }
end function

The intervention system employs a three-tiered approach based on the detected gaslighting probability p, with tier boundaries matching Algorithm 1: Tier 1 (Low Risk, 0.5 ≤ p < 0.7) provides informational notifications with options to view explanations or monitor further; Tier 2 (Medium Risk, 0.7 ≤ p < 0.9) issues warning notifications offering detailed explanations, coping strategies, and trusted contact connections; and Tier 3 (High Risk, p ≥ 0.9) implements urgent interventions with immediate conversation pause suggestions, guided response templates, and direct connections to support resources. Probabilities below 0.5 trigger no intervention.
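The threshold logic of Algorithm 1 and the tier descriptions above can be sketched in runnable form. This is a simplified illustration: the personalization and repetition-handling steps are omitted, and the option labels in `ACTION_OPTIONS` are hypothetical names paraphrased from the tier descriptions, not strings used by the actual system.

```python
from typing import Dict, List, Optional

# Hypothetical option labels paraphrased from the three tier descriptions.
ACTION_OPTIONS: Dict[str, List[str]] = {
    "LOW": ["view_explanation", "keep_monitoring"],
    "MEDIUM": ["detailed_explanation", "coping_strategies", "contact_trusted_person"],
    "HIGH": ["suggest_pause", "guided_response_templates", "connect_support_resources"],
}


def risk_level(probability: float) -> Optional[str]:
    """Map a gaslighting probability to a risk tier using the Algorithm 1 thresholds."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability >= 0.9:
        return "HIGH"
    if probability >= 0.7:
        return "MEDIUM"
    if probability >= 0.5:
        return "LOW"
    return None  # below 0.5: no intervention


def determine_intervention(probability: float) -> Optional[dict]:
    """Return the tier and its action options, or None when no intervention is needed."""
    level = risk_level(probability)
    if level is None:
        return None
    return {"riskLevel": level, "actionOptions": ACTION_OPTIONS[level]}
```

Because the comparisons use ≥, a probability of exactly 0.7 or 0.9 falls into the higher of the two adjacent tiers.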

The intervention delivery is designed to be non-intrusive while remaining accessible, appearing as a floating notification that does not disrupt ongoing conversation. The system is theoretically designed to adapt based on user responses to interventions, with architecture capable of automatically adjusting alert frequency and sensitivity based on feedback. Intervention content was developed in collaboration with adolescent psychology experts and incorporates age-appropriate language and concepts. For ethical reasons, all interventions maintain user agency—the system suggests rather than enforces actions, with users retaining full control over their communications.

User feedback is collected following interventions to evaluate effectiveness and refine the system. Intervention strategies are regularly reviewed and updated based on accumulated feedback and emerging research on effective gaslighting countermeasures.

4. Experimental Setup

4.1. Evaluation Metrics

To comprehensively assess the performance of our gaslighting detection system, we employed a range of evaluation metrics addressing both technical performance and practical utility.

In our technical evaluation of detection accuracy, we used five standard classification metrics: Accuracy measures the overall correct classification rate for both positive and negative cases; Precision evaluates how many of our identified gaslighting instances were actually gaslighting; Recall determines how well we detect all actual gaslighting occurrences; F1 Score provides a balanced measure between precision and recall; and ROC-AUC assesses the model’s discrimination ability across various threshold settings.
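For reference, the threshold-based metrics above follow the standard definitions, which can be computed from confusion-matrix counts as in the sketch below. The function names are illustrative, and ROC-AUC is omitted because it requires ranked scores rather than counts.

```python
from typing import Dict


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_from_precision_recall(precision, recall),
    }
```

As a consistency check, the precision (86.2%) and recall (83.7%) reported for the full model in Table 3 give `f1_from_precision_recall(0.862, 0.837)` ≈ 0.849, matching the reported F1 score of 84.9%.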

We developed specialized metrics beyond standard ones to evaluate gaslighting detection in conversational contexts. These include Turn-Level Detection Accuracy for identifying specific gaslighting instances, Context-Aware Precision for evaluating detection accuracy with conversation history, and Manipulation Type Accuracy for classifying different forms of gaslighting behavior. For emotion tag integration evaluation, we implemented two key metrics: the Emotion-Text Alignment Score, which measures how well emotional signals correlate with textual manipulation patterns, and the Cross-Modal Contribution Index, which quantifies the relative impact of emotional versus textual features in detection decisions.

Our intervention system assessment utilized both quantitative and qualitative measures. We tracked the Intervention Acceptance Rate to measure user engagement, collected User Perceived Helpfulness ratings on a five-point Likert scale, and incorporated Expert Assessment through professional evaluation of intervention appropriateness and effectiveness.

4.2. Baseline Models

To provide a comprehensive evaluation framework for our BERT-LSTM hybrid model with emotion tag integration, we implemented multiple baseline models representing different approaches to text classification. These ranged from simple rule-based methods to advanced deep-learning architectures, allowing us to assess the relative advantages of our proposed system. Our baseline implementations included a keyword-based detection system using gaslighting-related lexicons, traditional machine-learning approaches (SVM and Random Forest with TF-IDF features), various deep-learning models (KoBERT, Bidirectional LSTM, and CNN-based classifiers), and an emotion-only model focused solely on emotional pattern recognition. All models underwent identical training and evaluation procedures, including five-fold cross-validation, to ensure fair performance comparison and statistical validity.
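To make the simplest baseline concrete, a keyword-based detector can be sketched as a phrase lookup. The lexicon below is only an illustration built from the English renderings of the linguistic markers in Table 1; the actual Korean lexicon used for the baseline is not given in the text, and the names `LEXICON` and `keyword_flags` are hypothetical.

```python
from typing import Dict, List

# Illustrative lexicon taken from the English marker examples in Table 1.
LEXICON: Dict[str, List[str]] = {
    "reality_distortion": ["that never happened", "your memory is wrong", "i never said that"],
    "emotional_manipulation": ["don't be so sensitive", "there's no reason to be angry", "calm down"],
    "blame_shifting": ["it's your fault", "i had no choice"],
    "isolation": ["your friends don't understand you", "let's keep this between us", "don't tell others"],
}


def keyword_flags(utterance: str) -> List[str]:
    """Return the gaslighting types whose marker phrases occur in the utterance."""
    text = utterance.lower().replace("\u2019", "'")  # normalize curly apostrophes
    return [
        label
        for label, phrases in LEXICON.items()
        if any(phrase in text for phrase in phrases)
    ]
```

A detector of this kind ignores context entirely, which is consistent with the keyword-based baseline having the lowest recall in Table 3.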

4.3. Ablation Studies

To comprehensively evaluate our system’s performance, we conducted extensive ablation studies examining four key components: architecture components, emotion integration approaches, contextual features, and data augmentation techniques. Each component was systematically removed or modified to understand its contribution to the overall system performance. The architecture components study focused on the impact of BERT, LSTM, and attention mechanisms, while emotion integration examined various fusion approaches from early to decision-level integration. We also investigated the role of contextual features by varying conversation history length and explored different data augmentation techniques, including EDA and pattern-based methods. These systematic evaluations provided crucial insights that guided the optimization of our final system architecture.

4.4. Expert Evaluation Protocol

To ensure a comprehensive evaluation of the system’s psychological validity and practical effectiveness, we assembled a diverse panel of 12 expert professionals. The panel consisted of four clinical psychologists specializing in adolescent mental health, four school counselors experienced in digital communication issues, and four youth educators with expertise in online safety.

The evaluation protocol followed a structured three-phase approach. First, experts participated in a System Demonstration phase where they analyzed 20 pre-selected conversation examples. This was followed by an Independent Assessment phase where each expert evaluated detection accuracy, risk level assignments, intervention strategies, and psychological impact through 50 unseen conversation samples. Finally, in the Structured Feedback phase, experts provided quantitative ratings and qualitative feedback through semi-structured interviews. This protocol, approved by the institutional ethics committee, adhered to established guidelines for technology assessment in mental health applications.

5. Results

5.1. Detection Performance

The gaslighting detection system demonstrated strong performance across various metrics, significantly outperforming baseline approaches. Table 3 presents the comparative performance of our BERT-LSTM hybrid model with emotion tag integration against baseline models.

Statistical significance was determined using paired t-tests over five-fold cross-validation results. * p < 0.05, ** p < 0.01, *** p < 0.001.
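The paired t-test over cross-validation folds compares two models on the same folds by testing whether the mean per-fold score difference is zero. The sketch below computes only the test statistic; the function name `paired_t_statistic` is illustrative, and the per-fold scores themselves are not reported in the text.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def paired_t_statistic(scores_a: Sequence[float], scores_b: Sequence[float]) -> float:
    """t statistic for paired samples, e.g. per-fold F1 scores of two models."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long sequences with at least 2 entries")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(len(diffs)))
```

With five folds the statistic has n − 1 = 4 degrees of freedom, so a two-sided test at p < 0.05 requires |t| > 2.776.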

To evaluate the model’s performance across age groups, we analyzed the test data by age category. Table 4 shows the detection performance metrics for each group.

The results show that while the model achieves peak performance on adolescent conversations (our target group), it maintains robust performance across all age groups. This suggests that fundamental gaslighting patterns persist across age boundaries, even as their specific expressions may vary developmentally.

Our BERT-LSTM hybrid model with emotion tag integration achieved the highest performance across all metrics, with an overall accuracy of 89.4% and an F1 score of 84.9%. This represents a significant improvement over the best-performing baseline model (BERT-only), with a 4.8 percentage point increase in accuracy and a 3.4 percentage point increase in F1 score. The performance advantage was particularly pronounced for subtle gaslighting patterns, where the integration of emotional context provided crucial additional information. For instance, in cases of reality distortion without explicit contradictions, the emotion-integrated model achieved a 12.7 percentage point higher recall compared to the BERT-only model.
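The percentage-point gains quoted above can be reproduced directly from Table 3, as the following short check shows. The dictionary holds only values copied from Table 3; the helper name `gain` is illustrative.

```python
# Accuracy and F1 values (in %) copied from Table 3.
TABLE_3 = {
    "BERT-only": {"accuracy": 84.6, "f1": 81.5},
    "BERT-LSTM (no emotion)": {"accuracy": 86.5, "f1": 83.4},
    "BERT-LSTM (with emotion)": {"accuracy": 89.4, "f1": 84.9},
}


def gain(metric: str, model: str, baseline: str) -> float:
    """Difference in percentage points between two models for one metric."""
    return round(TABLE_3[model][metric] - TABLE_3[baseline][metric], 1)


acc_gain = gain("accuracy", "BERT-LSTM (with emotion)", "BERT-only")  # 4.8
f1_gain = gain("f1", "BERT-LSTM (with emotion)", "BERT-only")  # 3.4
emotion_gain = gain("f1", "BERT-LSTM (with emotion)", "BERT-LSTM (no emotion)")  # 1.5
```

These are absolute differences in percentage points; the corresponding relative F1 improvement over BERT-only is 3.4 / 81.5, or roughly 4.2%.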

Our comprehensive ablation studies provided valuable insights into the system’s architecture and performance. The BERT component proved to be particularly crucial, with its removal causing a significant 10.2 percentage point drop in the F1 score, while LSTM removal led to a smaller 5.3 percentage point decrease. The hierarchical attention mechanism also demonstrated substantial value, improving the F1 score by 3.7 percentage points, especially in detecting complex manipulation patterns across longer conversations.

Further analysis revealed the effectiveness of additional components in enhancing system performance. The integration of emotion tags contributed to a 1.5 percentage point improvement in overall F1 score, with feature-level fusion emerging as the most effective approach. Additionally, expanding the conversation context from single utterances to full conversation history yielded a notable 7.2 percentage point increase in F1 score, highlighting the crucial role of broader contextual understanding in accurate gaslighting detection.

As shown in Figure 5, emotion tag integration yielded the greatest improvements for emotional manipulation (7.3 percentage points) and isolation (6.8 percentage points) gaslighting types while providing more modest gains for reality distortion (2.1 percentage points). This pattern aligns with our understanding of these gaslighting types—emotional manipulation and isolation directly involve emotional tactics, making emotional context particularly valuable for detection.

Our analysis included three distinct types of psychological manipulation scenarios: reality distortion, emotional manipulation, and potential false positives. The system demonstrated high accuracy in detecting reality distortion patterns, with confidence scores reaching up to 0.92 in clear cases of fact manipulation. For subtle emotional manipulation scenarios, the integration of emotion tags significantly improved detection capabilities, increasing confidence scores from 0.61 to 0.76 on average.

The system also showed strong capabilities in distinguishing genuine gaslighting attempts from normal disagreements. In cases initially flagged as potential manipulation, contextual analysis reduced false positives by lowering confidence scores below the intervention thresholds when appropriate, thereby avoiding unwarranted interventions in legitimate disagreements.

5.2. Intervention System Evaluation

The intervention system was evaluated through both expert assessment and limited user testing. The expert evaluation involved 12 professionals rating various aspects of the intervention system on a five-point scale (1 = Poor, 5 = Excellent). The expert evaluation yielded predominantly positive results across multiple dimensions, with User Agency scoring the highest at 4.7/5, followed by Timeliness at 4.5/5 and Clinical Appropriateness at 4.3/5. While Clarity of Communication received a solid 4.1/5 with some suggestions for improvement, Potential Effectiveness scored 3.9/5, with experts emphasizing the need for longitudinal studies to fully validate the system’s real-world impact.

Expert interviews revealed several key findings: while the tiered approach received praise for its proportional risk response, experts emphasized the need for age-specific customization within adolescent groups, expressed concerns about users potentially becoming overly dependent on the system instead of developing their own assessment abilities, and suggested incorporating more psychoeducational materials into the platform.

6. Discussion and Conclusions

Our research reveals significant findings in detecting and preventing gaslighting in adolescent digital communications. The integration of emotion tag data with textual analysis substantially improved performance, particularly in identifying subtle forms of manipulation. This multimodal approach showed a 15% improvement in the F1 score for emotional manipulation detection, demonstrating the importance of analyzing both explicit linguistic patterns and implicit emotional dynamics. Conversation context was crucial for accurate detection. By analyzing complete conversation histories rather than isolated statements, our system achieved a 7.2 percentage point improvement in F1 score. This approach mirrors how humans naturally detect manipulation by recognizing patterns and inconsistencies across multiple interactions.

The study uncovered distinct Korean-specific gaslighting patterns, including the manipulation of honorific forms and strategic subject omission. Our Korean-optimized system significantly outperformed translated general models, highlighting the need for culturally specific NLP approaches that account for unique linguistic features and social dynamics. Our expert evaluation indicated higher appropriateness ratings for interventions in high-risk situations than in low-risk scenarios. This suggests that users may be more receptive to system support when facing severe manipulation attempts, which can inform future intervention design strategies.

Our analysis of detection performance across age groups revealed that while gaslighting’s core mechanisms remain consistent, there are notable age-specific manifestations. The model performed strongly across all age categories (F1 scores ranging from 82.7% to 85.7%), with the highest accuracy for adolescent communications (90.1%). These results suggest that gaslighting follows fundamental patterns that transcend age boundaries, though specific linguistic expressions, emotional manipulation strategies, and contextual elements vary by developmental stage and relationship dynamics.

Despite promising results, several key limitations warrant acknowledgment. First, our dataset may not fully represent the diverse range of adolescent communication patterns and gaslighting tactics despite its substantial size. Second, emotion tag reliability presents challenges due to the subjective nature of emotion expression and the potential oversimplification of complex emotional states through discrete categories. Third, real-world implementation differs markedly from our controlled research environment, raising concerns about privacy, performance with evolving language patterns, and technical integration. Finally, the study lacks comprehensive user testing and longitudinal effectiveness assessment.

From an ethical perspective, questions persist about the appropriate boundaries of automated psychological interventions—specifically, the balance between protection and surveillance, the impact of false positives on relationships, and the risk of creating technological dependencies.

Future research should expand beyond text and emotion analysis to include additional modalities such as voice patterns and facial expressions. Longitudinal studies are essential for understanding the system’s long-term impact and effectiveness. Cross-cultural adaptation represents another vital direction, potentially revealing which gaslighting patterns transcend cultural boundaries and which require specific cultural considerations. Personalization mechanisms that learn from individual user patterns could improve both detection accuracy and intervention effectiveness. Additionally, educational extensions could provide valuable prevention tools through gamified learning experiences.

This research presents a novel approach to detecting and addressing gaslighting through a natural language processing-based AI system. By integrating BERT-LSTM architecture with emotion analysis, we achieved 89.4% accuracy in identifying gaslighting patterns in Korean language contexts. The system’s tiered intervention framework provides proportional, user-centered support tailored to different risk levels. Our findings show that multimodal analysis significantly enhances detection performance, especially for subtle manipulation tactics, while underscoring the importance of cultural and linguistic specificity in addressing psychological manipulation.

In conclusion, this research demonstrates the effectiveness of NLP-based approaches for detecting and addressing gaslighting in digital communications, particularly benefiting adolescents while showing promise across age groups. Through continued refinement of these approaches—focusing on both universal manipulation tactics and age-specific expressions—we can develop more effective tools to protect vulnerable individuals from psychological manipulation in digital environments.

Author Contributions

Conceptualization, S.Y.; methodology, S.Y.; software, S.Y.; validation, S.Y.; formal analysis, S.Y. and B.K.; investigation, S.Y.; resources, S.Y.; data curation, S.Y.; writing—original draft preparation, S.Y.; writing—review and editing, S.Y. and B.K.; visualization, S.Y.; supervision, B.K.; project administration, S.Y.; funding acquisition, B.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable. This study utilized publicly available anonymized datasets from AI Hub (Korea’s national AI platform) with all personal identifiers removed. The research protocol was reviewed and exempted.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Ministry of Science and ICT; National Information Society Agency. Investigation on the Over-Dependence on Smartphones; NIA VIII-RSE-C-23065; National Information Society Agency: Daegu, Republic of Korea, 2023.
2. Jobst, N. Cyberbullying in South Korea—Statistics & Facts. Statista. Available online: https://www.statista.com/topics/10240/cyberbullying-in-south-korea/ (accessed on 15 March 2025).
3. Yoon, S. AI-Based Digital Therapeutics for Adolescent Mental Health Management and Disaster Response. Information 2024, 15, 620.
4. Shekhar, S.; Tripathi, K.M. Impact of Gaslighting on Mental Health among Young Adults. Int. J. Indian Psychol. 2024, 12, 3941–3950.
5. Abell, L.; Brewer, G.; Qualter, P.; Austin, E. Machiavellianism, Emotional Manipulation, and Friendship Functions in Women’s Friendships. Pers. Individ. Differ. 2016, 88, 108–113.
6. Kim, D.-H.; Son, W.-H.; Kwak, S.-S.; Yun, T.-H.; Park, J.-H.; Lee, J.-D. A Hybrid Deep Learning Emotion Classification System Using Multimodal Data. Sensors 2023, 23, 9333.
7. Farhadipour, A.; Ranjbar, H.; Chapariniya, M.; Vukovic, T.; Ebling, S.; Dellwo, V. Multimodal Emotion Recognition and Sentiment Analysis in Multi-Party Conversation Contexts. arXiv 2025, arXiv:2503.06805.
8. Yoo, S.; Lee, H.; Song, J.; Kim, J.; Lee, J.; Yoon, S. A Korean Emotion-Factor Dataset for Extracting Emotion and Factors in Korean Conversations. Sci. Rep. 2023, 13, 18547.
9. Stark, C.A. Gaslighting, Misogyny, and Psychological Oppression. Monist 2019, 102, 221–235.
10. Sweet, P.L. The Sociology of Gaslighting. Am. Sociol. Rev. 2019, 84, 851–875.
11. Kim, Y.; Kim, Y.-I.; Kim, K. Differences of Linguistic and Psychological Dimensions between Internet Malicious and Normal Comments. J. Korean Data Anal. Soc. 2013, 15, 3191–3201.
12. Philipo, A.; Sarwatt, D.; Ding, J.; Daneshmand, M.; Ning, H. Assessing Text Classification Methods for Cyberbullying Detection on Social Media Platforms. arXiv 2024, arXiv:2412.19928.
13. Abdali, S. Multi-modal Misinformation Detection: Approaches, Challenges and Opportunities. arXiv 2022, arXiv:2203.13883.
14. Sultan, D.; Mendes, M.; Kassenkhan, A.; Akylbekov, O. Hybrid CNN-LSTM Network for Cyberbullying Detection on Social Networks using Textual Contents. Int. J. Adv. Comput. Sci. Appl. 2023, 14.
15. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv 2018, arXiv:1810.04805.
16. Kim, Y.; Kim, J.H.; Lee, J.M.; Jang, M.J.; Yum, Y.J.; Kim, S.; Shin, U.; Kim, Y.M.; Joo, H.J.; Song, S. A Pre-trained BERT for Korean Medical Natural Language Processing. Sci. Rep. 2022, 12, 13847.
17. Song, J.; Kim, B.; Kim, M.; Iverson, P. The Korean Speech Recognition Sentences: A Large Corpus for Evaluating Semantic Context and Language Experience in Speech Perception. J. Speech Lang. Hear. Res. 2023, 66, 3399–3412.
18. Ravindran, V.; Shreejith, G.; Jetti, A.; Sivanaiah, R.; Deborah, A.; Thankanadar, M.; Milton, S. TECHSSN at SemEval-2024 Task 10: LSTM-based Approach for Emotion Detection in Multilingual Code-Mixed Conversations. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), Mexico City, Mexico, 9–14 June 2024; pp. 763–769.
19. Park, S. KR-BERT: A Small-Scale Korean-Specific Language Model. arXiv 2020, arXiv:2008.03979.
20. Lee, S.K.; Kim, S.-D. A Spoken Dialogue Analysis Platform for Effective Counselling. Teh. Vjesn. 2022, 29, 1592–1601.
21. Zadeh, A.; Liang, P.; Mazumder, N.; Poria, S.; Cambria, E.; Morency, L.-P. Memory Fusion Network for Multi-view Sequential Learning. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32.
22. Kim, T.-Y.; Yang, J.; Park, E. MSDLF-K: A Multimodal Feature Learning Approach for Sentiment Analysis in Korean Incorporating Text and Speech. IEEE Trans. Multimed. 2024, 27, 1266–1276.
23. Jung, M.; Lim, Y.; Kim, S.; Jang, J.Y.; Shin, S.; Lee, K.-H. An Emotion-based Korean Multimodal Empathetic Dialogue System. In Proceedings of the Second Workshop on When Creative AI Meets Conversational AI, Gyeongju, Republic of Korea, 12–17 October 2022; pp. 16–22.

Figure 1. Distribution of gaslighting victimization experiences among young adults [4].

Figure 2. Korean Cultural Context Analysis Framework for Gaslighting Detection.

Figure 3. System Architecture.

Figure 4. BERT-LSTM Hybrid Model Architecture.

Figure 5. Impact of emotion tag integration on F1 Score by gaslighting type.

Table 1. Gaslighting types, characteristics, and detection parameters.

Gaslighting Type	Characteristics	Linguistic Markers	Key Parameters
Reality Distortion	Denial or distortion of the target’s experiences or memories	“That never happened”, “Your memory is wrong”, “I never said that”	contradiction_threshold: 0.75; history_window_size: 10; min_contradiction_confidence: 0.65
Emotional Manipulation	Invalidation or manipulation of the target’s emotional responses	“Don’t be so sensitive”, “There’s no reason to be angry”, “Calm down”	emotion_invalidation_threshold: 0.7; emotion_discrepancy_weight: 0.65; confidence_to_confusion_weight: 0.8
Blame Shifting	Redirection of responsibility from manipulator to target	“I did that because you acted that way”, “It’s your fault”, “I had no choice”	responsibility_inversion_threshold: 0.68; blame_pattern_weight: 0.75; causal_language_markers: [“because”, “since”]
Isolation	Social isolation of the target	“Your friends don’t understand you”, “Let’s keep this between us”, “Don’t tell others”	relationship_undermining_threshold: 0.7; isolation_pattern_confidence: 0.72; secrecy_phrases: [“between us”, “don’t tell”]
Gradual Intensity	Incremental increase in manipulation intensity over time	Initial: subtle contradictions; Later: direct reality distortion; Advanced: complete invalidation of the target’s perceptions	baseline_window_size: 5; intensity_monitoring_period: 20; min_escalation_delta: 0.15
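As an illustration of how parameters like those in Table 1 might be used, the sketch below applies the three Gradual Intensity parameters to a sequence of per-turn manipulation scores. The table does not specify how the parameters are combined, so the rule shown here (comparing the mean of the earliest and latest windows within the monitoring period) is an assumed interpretation, and `is_escalating` is a hypothetical helper.

```python
from statistics import mean
from typing import Dict, Sequence

# Parameter values copied from the Gradual Intensity row of Table 1.
GRADUAL_INTENSITY: Dict[str, float] = {
    "baseline_window_size": 5,
    "intensity_monitoring_period": 20,
    "min_escalation_delta": 0.15,
}


def is_escalating(turn_scores: Sequence[float], cfg: Dict[str, float] = GRADUAL_INTENSITY) -> bool:
    """Assumed rule: flag escalation when the latest window's mean score exceeds
    the baseline window's mean by at least min_escalation_delta."""
    window = int(cfg["baseline_window_size"])
    period = int(cfg["intensity_monitoring_period"])
    recent = list(turn_scores[-period:])  # only the monitored turns
    if len(recent) < 2 * window:
        return False  # not enough turns for separate baseline and latest windows
    baseline = mean(recent[:window])
    latest = mean(recent[-window:])
    return latest - baseline >= cfg["min_escalation_delta"]
```

Under this reading, a conversation whose per-turn scores rise from about 0.2 to about 0.5 within the monitoring period would be flagged, while a flat sequence would not.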

Table 2. Gaslighting label distribution.

Category	Count	Percentage
Total conversations	8742	100%
Gaslighting-containing conversations	1283	14.7%
Gaslighting type distribution:
Reality distortion	498	38.8%
Emotional manipulation	412	32.1%
Blame shifting	276	21.5%
Isolation	67	5.2%
Gradual intensity	30	2.3%
Labeling method:
Direct expert labeling	1500	17.2%
High-confidence automatic labeling	5124	58.6%
Human-AI collaborative labeling	2118	24.2%

Table 3. Performance comparison of gaslighting detection models.

Model	Accuracy	Precision	Recall	F1 Score	ROC-AUC
Keyword-Based	68.5%	72.3%	54.1%	61.9%	0.671
SVM (TF-IDF)	76.2%	77.8%	68.5%	72.9%	0.762
Random Forest	74.7%	80.2%	61.3%	69.5%	0.748
BERT-only	84.6%	83.3%	79.8%	81.5%	0.881
LSTM-only	79.5%	78.1%	74.2%	76.1%	0.821
CNN-based	80.3%	79.4%	75.5%	77.4%	0.834
Emotion-Only	71.8%	70.4%	67.3%	68.8%	0.728
BERT-LSTM (no emotion)	86.5%	84.7%	82.1%	83.4%	0.902
BERT-LSTM (with emotion)	89.4%	86.2%	83.7%	84.9%	0.921

Table 4. Detection performance by age group.

Age Group	Accuracy	Precision	Recall	F1 Score
Adolescents (13–19)	90.1%	87.3%	84.2%	85.7%
Children (8–12)	86.5%	84.0%	81.9%	82.9%
Young Adults (20–25)	88.7%	85.4%	82.8%	84.1%
Adults (26+)	87.2%	83.9%	81.5%	82.7%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

