Article

FusionBullyNet: A Robust English—Arabic Cyberbullying Detection Framework Using Heterogeneous Data and Dual-Encoder Transformer Architecture with Attention Fusion

by
Mohammed A. Mahdi
1,
Muhammad Asad Arshed
2,* and
Shahzad Mumtaz
3
1
Information and Computer Science Department, College of Computer Science and Engineering, University of Ha’il, Ha’il 55476, Saudi Arabia
2
School of Systems and Technology, University of Management and Technology, Lahore 54770, Pakistan
3
School of Natural and Computing Sciences, University of Aberdeen, Aberdeen AB24 3UE, UK
*
Author to whom correspondence should be addressed.
Mathematics 2026, 14(1), 170; https://doi.org/10.3390/math14010170
Submission received: 15 December 2025 / Revised: 29 December 2025 / Accepted: 30 December 2025 / Published: 1 January 2026
(This article belongs to the Special Issue Computational Intelligence in Addressing Data Heterogeneity)

Abstract

Cyberbullying has become a pervasive threat on social media, impacting the safety and wellbeing of users worldwide. Most existing studies focus on monolingual content, limiting their applicability to multilingual online environments. This study aims to develop an approach that accurately detects abusive content in bilingual settings. Given the large volume of online content in English and Arabic, we propose a bilingual cyberbullying detection approach designed to deliver efficient, scalable, and robust performance. Several datasets were combined, processed, and augmented before the cyberbullying identification approach was developed. The proposed model (FusionBullyNet) fine-tunes two transformer models (RoBERTa-base and bert-base-arabertv02-twitter) and combines attention-based fusion, gradual layer unfreezing, and label smoothing to enhance generalization. The proposed approach achieved a test accuracy of 0.86, F1-scores of 0.83 for bullying and 0.88 for non-bullying, and an overall ROC-AUC of 0.929. To assess the robustness of the proposed model, several multilingual models, such as XLM-RoBERTa-Base, Microsoft/mdeberta-v3-base, and google-bert/bert-base-multilingual-cased, were also trained in this study, and all achieved a test accuracy of 0.84. Furthermore, several machine learning models were trained, of which Logistic Regression, XGBoost, and LightGBM achieved the highest accuracy of 0.82. These results demonstrate that the proposed approach provides a reliable, high-performance solution for cyberbullying detection, contributing to safer online communication environments.

1. Introduction

Individual interaction has undergone a revolution due to the widespread adoption of social media. However, this revolution has also given rise to harmful online behaviors, especially cyberbullying, which is now recognized as one of the most pervasive threats to online safety [1]. Cyberbullying is defined as the intimidation, humiliation, or harassment of individuals, causing anxiety, depression, and, in severe cases, suicidal tendencies [2]. Given the sheer volume of online content, it is almost impossible for humans to check and verify everything. A recent survey shows that approximately 15% of school-aged children report having been cyberbullied in the past [3]. In Sri Lanka, a self-reported survey conducted in 2020 among students aged 14–17 revealed alarming levels of cyberbullying. The study found that 81% of respondents had experienced cyberbullying, while 76.2% reported exposure to both verbal and online harassment. Among victims, the most common form of abuse (71.4%) involved receiving humiliating images, profanity, or other media via mobile phones. Facebook [4] and Instagram [5] were cited as the primary platforms for cyberbullying by 14 respondents, while 12 reported being bullied via SMS and 10 via phone calls [6].
Recent studies have identified several risk factors associated with cyberbullying victimization among children. These include low-income family relationships, female gender, urban residence, underlying mental health conditions, and extended screen time [6]. The consequences of cyberbullying are severe, including sleep disturbance, declining academic performance, anxiety, behavioural issues, and substance use [7]. Furthermore, cyberbullying victimization has been strongly linked to suicidal ideation, highlighting its potentially life-threatening consequences [8]. Victims of cyberbullying often remain silent or confide in friends rather than reporting incidents to adults. A key reason for this silence is fear of losing access to electronic media, since parents may impose restrictions on device use as an immediate measure to reduce the risk of further victimization [9].
Compared to victims of traditional, face-to-face bullying [10], cyberbullying victims experience higher rates of depressive disorders and suicidality [11]. Patchin and Hinduja [12] also found that cyberbullying victims feel anxious or fearful about attending school, with significant negative impacts on their academic performance. Their study also found that 70% of youths who reported being cyberbullied acknowledged a decline in confidence, and one-third reported a negative impact on their friendships. Despite these serious concerns, limited research attention [13] has been devoted to developing automated systems for its detection. There is therefore a pressing need for an automated system that can quickly identify harmful content [14].
Early computational approaches for detecting cyberbullying in text were based on hand-crafted features, such as n-grams [15], TF-IDF [16], and sentiment lexicons [17]. Traditional machine learning (ML) models used for classification include Support Vector Machines (SVM) and Naïve Bayes [18]. Although these models achieved some success, they suffered from limited contextual understanding and poor adaptability across diverse online platforms. To overcome these limitations, deep learning (DL) methods such as Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks were introduced; these automatically learn semantic representations of text and significantly improved detection on English-language datasets [19].
The introduction of transformer architectures [20] marked a paradigm shift in natural language processing (NLP). Unlike earlier recurrent neural networks (RNNs) or CNNs, transformers are based on a self-attention mechanism, which enables the model to capture relationships between words regardless of their distance within the sequence. The original transformer model utilized multi-head self-attention with 16 attention heads, each operating on its own weight matrices, enabling the architecture to attend to information from different representation subspaces simultaneously. This design eliminates the sequential processing bottleneck of RNNs, enabling highly parallelizable training and superior performance on sequence-to-sequence tasks such as machine translation.
The BERT (Bidirectional Encoder Representations from Transformers) model [21] introduced a bidirectional pre-training framework to learn deep contextual embeddings. BERT can capture richer semantic relationships than unidirectional models because it jointly conditions on both left and right context. BERT and its variants have become the state of the art for various NLP tasks, including sentiment analysis, question answering, and toxic language detection. Moreover, in the context of cyberbullying detection, CyberBERT [22], which utilizes the BERT architecture, demonstrated effective performance.
Furthermore, extending these multilingual capabilities, XLM-R (Cross-Lingual Language Model–RoBERTa) [23] was trained on large corpora covering over 100 languages. XLM-R has demonstrated strong cross-lingual transfer capabilities through large-scale self-attention, making it a promising approach for tasks such as cyberbullying detection, particularly in multilingual settings. A recent study [24] that evaluated different BERT and XLM-R models for social media cyberbullying detection found that BERT models often offered balanced performance, while XLM-R variants achieved higher robustness across languages.
While ML, DL, and transformer-based models have been successfully applied to cyberbullying detection, most existing work is restricted to English datasets, with only a few studies addressing Arabic [25,26,27]. To the best of our knowledge, no single model has yet been proposed that detects cyberbullying in both English and Arabic. This gap is critical because these two languages represent different linguistic families and are widely used on social media.

2. Literature Review

The exponential growth of online social sites has increased opportunities for interaction, but has also raised concerns about harmful behaviors such as cyberbullying. Cyberbullying is repeated, intentional aggression carried out through digital platforms such as Facebook and Instagram, and it can have severe consequences for victims, including physical and emotional harm. Therefore, early detection of such bullying is necessary for effective communication and interaction on social sites.
Fang et al. [28] proposed a deep learning-based model, Bi-GRU with self-attention, for detecting text-based cyberbullying. They combined the bidirectional gated recurrent units with a multi-head self-attention mechanism to capture contextual dependencies in both directions and highlight salient tokens. In their study, for experiments, they considered three datasets: two based on tweets [29,30] and one based on Wikipedia [31]. They achieved weighted average precisions of 0.849, 0.961, and 0.943 across all three datasets, respectively.
Bharti et al. [32] proposed a BiLSTM and GloVe embedding-based classifier to detect cyberbullying in tweets. To prepare a final dataset of 35,787 labelled tweets, they combined two sources [29,30]. Of these tweets, 65% were labelled as cyberbullying and 35% as non-cyberbullying. Eighty percent of the dataset was used for training and the remaining 20% for testing, achieving an accuracy of 92.60%.
Dewani et al. [33] proposed a hybrid deep neural network-based model for Roman Urdu cyberbullying detection, featuring advanced preprocessing tailored to slang, contractions, and Roman Urdu morphology. They achieved the highest validation accuracies of 85.5% and 85% with RNN-LSTM and RNN-BiLSTM, respectively.
Gada et al. [34] proposed an LSTM-CNN hybrid model for detecting English cyberbullying. Their model first uses an LSTM to learn sequential features, and then a CNN to learn local features. With this model, they achieved an accuracy of 95.2%.
Ahmed et al. [35] proposed a hybrid deep neural classifier for multi-class cyberbullying, with a special focus on detecting Bangla (Bengali) language cyberbullying. Their dataset consists of five classes (non-bully, sexual, threat, troll, religious) and a total size of 44,001 Facebook comments. They achieved an accuracy of 87.19% for a binary classifier and 85% for a multiclass classifier.
Yi et al. [36] focused on cross-platform cyberbullying detection using the XP-CB framework, combining adversarial learning with transformers (BERT, RoBERTa). In their study, they considered unlabeled data from both the source and target platforms to reduce bias and improve overall generalization. They achieved an average macro-F1 score of 0.693 in cross-platform settings with their proposed framework.
Joshi et al. [37] performed a comparison between a simple transformer and a hybrid Res-CNN-BiLSTM for cyberbullying detection. In their study, they found that the lightweight transformer (≈0.65 million parameters) outperformed the complex hybrid Res-CNN-BiLSTM (≈48.8 million parameters) in accuracy and generalization, while also being significantly faster to train.
Tashtoush et al. [38] proposed a deep learning-based framework for Arabic cyberbullying detection. They explored various deep learning models, including Bi-LSTM, CNN, LSTM, and a hybrid CNN-LSTM, to categorize Arabic comments for both binary and multiclass cases. Their dataset consists of 20,000 Arabic YouTube comments. In their experiments, CNN and CNN-LSTM performed well, reaching an accuracy of about 91.9%. Meanwhile, multiclass LSTM and BiLSTM performed well with an accuracy of 89.5%.
Fati et al. [39] proposed a deep learning-based model that combines attention and continuous bag-of-words feature extraction methods for Twitter cyberbullying detection. They fine-tuned attention layers to capture inner-tweet context and semantic cues via embeddings. They achieved an accuracy of 94.49% with the Conv1DLSTM model in their study.
Akter et al. [40] proposed an LSTM-Autoencoder model using synthetic data augmentation to address dataset insufficiency, especially in Hindi, Bangla, and English. Their proposed model outperformed baseline models, achieving 95% accuracy across datasets.
Wahid et al. [41] proposed a transformer architecture enriched with lexical, social-profile, contextual-embedding, and semantic-similarity features. Their dataset is based on Bangla text, and they achieved an accuracy of 98% for threats and 90% for sarcastic comments.
Sihab-Us-Sakib et al. [42] developed a Cyberbullying Bengali Dataset (CBD) with 2751 manually labeled Bengali social media comments across five classes. They fine-tuned the transformer-based model for cyberbullying classification, achieving an accuracy of 82.61%.
Kumar et al. [43] explored the detection of bias and cyberbullying in their study using large transformer models, such as DeBERTa, and generated synthetic data. In their study, they found that a combination of real and synthetic data helps the transformer models to detect bias and cyberbullying in tweets.
Kaddoura et al. [44] performed a comparison of BERT vs. LLM models such as Mistral 7B and Llama3. They concluded that BERT outperformed these larger LLMs for multiclass cyberbullying detection, achieving an accuracy of 83.67%.
Gutiérrez-Batista et al. [45] introduced a high-performing method for cyberbullying detection by fine-tuning a sentence transformer to better capture semantic distinctions between bullying and non-bullying text. They considered three datasets (BullyingV3.0 [46], MySpace [47], Hate-Speech [29]) to evaluate their model and achieved accuracies between ~83% and 96% across the datasets.
Nath et al. [48] built a two-layer Bidirectional LSTM model for Bangla cyberbullying and compared the optimizers. They achieved the best accuracy of 95.08% using the Adam optimizer [49] in their theoretical study.
Kumar [50] proposed a hybrid framework that combines a modified DeBERTa with a Gated Broad Learning System (GBLS) for English cyberbullying detection. The approach incorporated Squeeze-and-Excitation blocks and sentiment features to support contextual learning. They achieved accuracies of 79.3%, 95.41%, 91.37%, and 94.67% on four datasets: HateXplain [51], SOSNet [52], and Mendeley (Mendeley-I and Mendeley-II) [53].
In another work, Kumar [54] focused on detecting cyberbullying in code-mixed Hindi-English (Hinglish) text using Multilingual Representations for Indian Languages (MURIL). They considered six different datasets, namely Bohra [55], BullyExplain [56], BullySentemo [57], HASOC-2021 [58], Mendeley [59] and Kumar dataset [60], and achieved accuracies ranging from 83.92% to 94.63%.
Prama et al. [61] improved the cyberbullying detection by introducing a user-specific severity classification framework. Their model integrated user text inputs with user features, including demographic, psychological, and behavioural data, and ultimately used an LSTM architecture to classify posts into one of three levels: not bullying, mild bullying, and severe bullying. They achieved an accuracy of 98% with a trained model of 146 features.
Purkayastha et al. [62] considered various machine learning and deep learning models for detecting English text cyberbullying on Twitter. They trained various ML, DL, and transformer models, including Random Forest, Extra Trees, AdaBoost, XGBoost, Bi-LSTM, BiGRU, and BERT. They achieved an effective accuracy of 92% with the BERT model. The literature summary table for text-based cyberbullying detection is available in Table 1.
The existing literature highlights that most of the cyberbullying detection models are developed using monolingual datasets, with a few works addressing bilingual or multilingual scenarios. In this study, bilingual cyberbullying detection is formulated as a binary text classification task, where social media content written in either English or Arabic is mapped to cyberbullying or non-cyberbullying labels. Despite recent progress in multilingual transformer models, bilingual cyberbullying detection remains a challenging task. Languages such as English and Arabic differ significantly in morphology, syntax, tokenization, and cultural expression of abusive language. Shared multilingual vocabularies often lead to vocabulary fragmentation and reduced sensitivity to language-specific abusive patterns. Moreover, cultural and contextual variations in cyberbullying expressions further complicate cross-lingual generalization. These challenges highlight the need for task-specific bilingual architectures that preserve language-dependent features while enabling effective cross-lingual interaction. To bridge this gap, we proposed an approach to detect cyberbullying in bilingual settings, specifically for English and Arabic content. The major contributions of this work are as follows:
  • Integration of heterogeneous bilingual datasets: Multiple heterogeneous English and Arabic cyberbullying datasets were integrated and standardized into a binary classification framework, enabling a robust bilingual analysis while preserving linguistic diversity.
  • Development of a unified preprocessing pipeline: A consistent, language-aware preprocessing framework was designed and applied across all datasets to ensure uniformity and improve model reliability.
  • Controlled LLM-based Data Augmentation: Large Language Model (LLM)–driven augmentation was employed to improve data diversity and address class imbalance, with filtering strategies applied to preserve semantic consistency and label reliability.
  • Bilingual Fusion Architecture (FusionBullyNet): We propose a language-specialized dual encoder transformer architecture that combines independently fine-tuned English and Arabic encoders using an attention-based fusion mechanism, allowing adaptive weighting of bilingual representation. Unlike multilingual transformers that rely on a shared embedding space, the proposed model explicitly preserves language-specific abusive patterns that are critical for cyberbullying detection.
  • Extensive Evaluation against Multilingual and Classical Baselines: Comprehensive benchmarking against state-of-the-art multilingual transformer models and machine learning models demonstrated that the proposed bilingual fusion approach consistently outperforms them.

3. Research Methodology

In this section, a detailed description of the proposed methodology is presented, including the dataset selection, preprocessing, and modeling used for cyberbullying detection across Arabic and English.

3.1. Dataset Description

In this study, we selected several publicly available datasets of English- and Arabic-language cyberbullying. The English cyberbullying datasets, “Cyberbullying Classification” [64] and “Cyberbullying Dataset” [65], were obtained from Kaggle, while the Arabic dataset is a combination of several datasets: “ARBCyD: Arabic Cyberbullying Dataset” [66], “ArCyC: A Fully Annotated Arabic Cyberbullying Corpus” [67], and “Arabic-Abusive-Datasets” [68]. While these datasets offer multiple fine-grained categories, such as bullying related to religion, ethnicity, and sports, we reformulated the task into a binary classification problem to ensure cross-lingual consistency. Specifically, each dataset was recategorized into two general classes: cyberbullying (Yes) and non-cyberbullying (No). This binary reformulation prioritizes cross-lingual consistency at the cost of fine-grained category distinction. To maintain computational feasibility, we initially considered 15,000 samples per language-label subset for all classes, except the Arabic-cyberbullying subset, which consists of 8375 samples, resulting in a dataset of 53,375 samples before pre-processing and augmentation. Some samples are shown in Table 2. As the dataset originates from different sources, merging them may introduce label noise and dataset bias, particularly due to cultural and linguistic differences in the expression of cyberbullying between English and Arabic. To mitigate this issue, all datasets were standardized using consistent binary labelling rules to avoid conflicting labels.

3.2. Dataset Pre-Processing

To prepare the dataset for modeling, a unified preprocessing pipeline was applied across the two languages (English and Arabic). First, noise elements, such as URLs, user mentions, hashtags, email addresses, digits, emoticons, and unnecessary symbols, as well as NaN values, were removed. Unicode normalization was performed to ensure consistency in mixed entries. For Arabic text, diacritics were stripped, elongation characters were removed, and common orthographic variants were standardized [69]. English text was lowercased, contractions were standardized, and non-alphabetic symbols were discarded. Finally, white-space normalization was applied to all samples, resulting in clean, consistent text representations that preserve semantic meaning across both languages. If any sample was empty after preprocessing, it was removed. Figure 1 and Figure 2 represent class distributions and average words per sample (class) before and after preprocessing, respectively. Before preprocessing, the number of samples was 31,588; after preprocessing, it decreased to 31,312.
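The cleaning steps described above can be sketched as a single function. This is a minimal illustration assuming the rules stated in the text (the paper does not publish its pipeline); the function name, regex patterns, and the exact diacritic ranges are our assumptions:

```python
import re

# Combining marks commonly treated as Arabic diacritics (an assumption;
# the paper only says "diacritics were stripped").
ARABIC_DIACRITICS = re.compile(r"[\u0610-\u061A\u064B-\u0652\u0670]")

def clean_text(text: str) -> str:
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # URLs
    text = re.sub(r"[@#]\w+", " ", text)                 # mentions and hashtags
    text = re.sub(r"\S+@\S+", " ", text)                 # e-mail addresses
    text = re.sub(r"\d+", " ", text)                     # digits
    text = ARABIC_DIACRITICS.sub("", text)               # Arabic diacritics
    text = text.replace("\u0640", "")                    # elongation (tatweel)
    text = re.sub(r"[إأآ]", "ا", text)                   # common alef variants
    text = text.lower()                                  # English lowercasing
    text = re.sub(r"[^\w\s\u0600-\u06FF]", " ", text)    # stray symbols
    return re.sub(r"\s+", " ", text).strip()             # whitespace normalization
```

The same function serves both languages: the Arabic-specific substitutions are no-ops on English text, and lowercasing is a no-op on Arabic.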
Transformer models, such as multilingual BERT, employ subword tokenization and contextual embeddings, which inherently capture word morphology and syntactic relationships. Acs et al. [70] demonstrate that mBERT learns morphological features directly from subword units. Alkaoud & Syed [71] conclude that subword splitting preserves semantic richness and handles rich morphology and out-of-vocabulary words as well. Furthermore, Miyajiwala et al. [72] demonstrate that removing many tokens, including stop words, can degrade the performance of BERT models. Therefore, avoiding stopwords and lemmatization helps maintain alignment with the pretraining corpus and preserve critical semantic cues.

3.3. Exploratory Data Visualization

To better illustrate the effect of preprocessing, word cloud-based visualizations are considered before and after applying the cleaning pipeline. A word cloud represents the most frequent terms in a corpus [73], where the size of each word corresponds to its relative frequency. This provides an intuitive overview of the main words and recurring patterns in the text. In the raw dataset, the word cloud is often dominated by noise elements, such as numbers and hashtags, obscuring meaningful terms. After preprocessing, the cleaned word cloud reveals linguistically significant words in both English and Arabic, enabling a more interpretable and semantically relevant visualization of the data. Figure 3 and Figure 4 show the word clouds of the dataset before and after preprocessing.

3.4. Data Augmentation Using Multilingual Large Language Model

To address class imbalance and improve the overall generalization of the proposed model, we employed data augmentation using a multilingual language model, ‘bigscience/mt0-base’ [74,75]. This approach enables the generation of high-quality, semantically consistent paraphrases of the original training texts in both languages, English and Arabic.
For each training sample, the model was prompted to produce a natural paraphrase while preserving the original meaning, taking the language of the original input text into account. Batch-wise generation and controlled decoding parameters (beam search, top-p sampling, temperature) were used to maintain efficiency and consistency, allowing multiple texts to be paraphrased simultaneously rather than sequentially. All generated texts were then post-processed to reduce noise, for example by converting sentences to lowercase and removing punctuation to minimize tokenization inconsistency. Augmentation was performed on the minority classes, upsampling them to match the size of the majority class; see Table 3 for details.
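A minimal sketch of this augmentation step is shown below. The prompt wording, decoding parameters, and function names (`build_prompt`, `postprocess`) are our assumptions, not the authors' exact configuration; the model call itself is left as a comment since it requires downloading ‘bigscience/mt0-base’:

```python
import string

def build_prompt(text: str, lang: str) -> str:
    # mT0 is instruction-tuned, so a natural-language instruction is used.
    language = "Arabic" if lang == "ar" else "English"
    return f"Paraphrase the following {language} sentence, keeping its meaning: {text}"

def postprocess(generated: str) -> str:
    # Lowercase and strip punctuation to minimize tokenization inconsistency.
    cleaned = generated.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(cleaned.split())

# With transformers installed, batch generation would look roughly like:
#   tok = AutoTokenizer.from_pretrained("bigscience/mt0-base")
#   model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/mt0-base")
#   batch = tok([build_prompt(t, "en") for t in texts],
#               return_tensors="pt", padding=True)
#   ids = model.generate(**batch, num_beams=4, do_sample=True,
#                        top_p=0.95, temperature=0.9)
#   augmented = [postprocess(s) for s in
#                tok.batch_decode(ids, skip_special_tokens=True)]
```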

3.5. Dataset Language-Based Split Ratio Before and After Augmentation

The dataset was divided by language and cyberbullying label to ensure balanced representation of Arabic and English texts. The dataset is split into 80% for training, 10% for validation, and 10% for testing. Before augmentation, using stratification, the Arabic training subset consisted of 11,993 non-cyberbullying and 6700 cyberbullying samples, and the English training subset of 11,749 non-cyberbullying and 11,905 cyberbullying samples. To ensure balanced representation and model generalization, augmentation was applied only to the training data and was based on the majority class size: minority subsets were augmented to match the majority, so that each training subset had 11,993 samples. The validation and test subsets remain unchanged, with the Arabic subsets containing 1499 non-cyberbullying samples each and 838/837 cyberbullying samples, and the English subsets comprising 1468/1469 non-cyberbullying and 1488/1489 cyberbullying samples (validation/test, respectively; see Table 4).
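The stratified 80/10/10 split can be sketched in plain Python; this is an illustrative implementation assuming stratification on the (language, label) pair, with the function name and seed chosen by us:

```python
import random
from collections import defaultdict

def stratified_split(samples, key, ratios=(0.8, 0.1, 0.1), seed=42):
    """Split samples 80/10/10, stratifying on key(sample)."""
    groups = defaultdict(list)
    for s in samples:
        groups[key(s)].append(s)          # e.g. key returns (language, label)
    rng = random.Random(seed)
    train, val, test = [], [], []
    for group in groups.values():
        rng.shuffle(group)
        n = len(group)
        a = int(n * ratios[0])
        b = int(n * (ratios[0] + ratios[1]))
        train += group[:a]                # 80% of every stratum
        val += group[a:b]                 # next 10%
        test += group[b:]                 # remaining 10%
    return train, val, test
```

Because each (language, label) stratum is split independently, the class balance reported in Table 4 is preserved in every subset.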

3.6. Proposed FusionBullyNet (Bilingual Dual-Encoder Model with Attention-Based Fusion)

The proposed FusionBullyNet introduces an effective dual-encoder bilingual cyberbullying identification framework that leverages language-specific contextual representations and an attention-based fusion to enhance cross-lingual understanding between English and Arabic texts. This model is designed to capture both shared semantic information and language-specific nuances, achieving a balanced and adaptable representation for bilingual tasks.
Overall, the system integrates two pre-trained encoders: RoBERTa-base [76,77] for English and aubmindlab/bert-base-arabertv02-twitter [78] for Arabic. Each encoder independently processes input in its respective language and produces high-dimensional contextual embeddings. These embeddings are then projected into a shared latent space via a linear transformation layer to enable subsequent cross-language interaction. The projected embeddings are combined using an attention-based bilingual fusion layer, as shown in Equation (1).
z_fusion = Attention(W_en · h_en, W_ar · h_ar)    (1)
The fused bilingual representation is passed through a classification head comprising dropout (p = 0.4), a ReLU-activated dense layer, and a final layer to generate class probabilities. See Table 5 for an architecture summary.
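The projection, fusion, and classification head can be sketched as a PyTorch module. This is an assumption-laden illustration: the paper specifies only Equation (1), dropout p = 0.4, and a ReLU dense layer, so the projection dimension, head count, and the choice of `nn.MultiheadAttention` are ours:

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Sketch of FusionBullyNet's fusion head (dimensions are assumptions)."""
    def __init__(self, en_dim=768, ar_dim=768, proj_dim=512, n_classes=2):
        super().__init__()
        self.proj_en = nn.Linear(en_dim, proj_dim)   # W_en: into shared space
        self.proj_ar = nn.Linear(ar_dim, proj_dim)   # W_ar: into shared space
        self.attn = nn.MultiheadAttention(proj_dim, num_heads=8, batch_first=True)
        self.head = nn.Sequential(
            nn.Dropout(0.4),
            nn.Linear(proj_dim, proj_dim),
            nn.ReLU(),
            nn.Linear(proj_dim, n_classes),
        )

    def forward(self, h_en, h_ar):
        # h_en, h_ar: pooled encoder outputs, shape (batch, hidden_dim)
        tokens = torch.stack([self.proj_en(h_en), self.proj_ar(h_ar)], dim=1)
        fused, _ = self.attn(tokens, tokens, tokens)  # adaptive bilingual weighting
        return self.head(fused.mean(dim=1))           # class logits

logits = AttentionFusion()(torch.randn(4, 768), torch.randn(4, 768))
```

Treating the two projected embeddings as a two-token sequence lets the attention layer learn, per sample, how much weight to place on the English versus the Arabic representation.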
Furthermore, to achieve robust generalization, several fine-tuning strategies were employed, including early stopping, gradient clipping, and label smoothing, see Table 6 for fine-tuning and optimization strategies.
In this study, to improve training stability and ensure controlled adaptation of transformer layers, a gradual unfreezing strategy was adopted during fine-tuning. Instead of allowing all layers of the pretrained encoders to update simultaneously, which often leads to catastrophic forgetting of language-general knowledge, the model progressively releases trainable layers. At the beginning of fine-tuning, the initial layers of both encoders were frozen to preserve the universal linguistic and syntactic representations learned during large-scale pretraining. Only the top layers, i.e., the top two layers of both base models and the classification components, remained trainable, allowing for adaptation of high-level semantic features. As training progressed, groups of frozen layers were unfrozen with patience if no improvement was observed. This approach helps strike a balance between stability and plasticity, allowing the model to retain its core linguistic understanding while acquiring domain- and task-specific nuances. See Table 7 for the unfreezing strategy.
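Gradual unfreezing reduces to toggling `requires_grad` on layer groups. The sketch below assumes the encoder exposes an ordered list of blocks; the grouping and the trigger for unfreezing (called with a larger count when validation stalls) follow the description above, but the helper itself is our illustration:

```python
import torch.nn as nn

def set_trainable(encoder_layers, n_top_unfrozen):
    """Freeze all but the top n_top_unfrozen encoder layers."""
    cutoff = len(encoder_layers) - n_top_unfrozen
    for i, layer in enumerate(encoder_layers):
        for p in layer.parameters():
            p.requires_grad = i >= cutoff   # only the top layers train

# Start with only the top two blocks trainable, as in the paper's setup;
# the nn.Linear blocks here merely stand in for 12 transformer blocks.
layers = [nn.Linear(8, 8) for _ in range(12)]
set_trainable(layers, 2)
```

During training, one would call `set_trainable(layers, k)` with a progressively larger `k` whenever the validation metric fails to improve for the configured patience.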

4. Results and Discussion

In this section, the empirical results of the proposed study are presented. We begin by outlining the evaluation metrics employed, followed by a detailed description of the experimental setup. Next, we present the comprehensive performance of the proposed model and compare it with other models.

4.1. Performance Evaluation Metrics

The performance of the proposed and comparative models is evaluated on the test set using widely adopted evaluation metrics, including accuracy, precision, recall, F1, and ROC-AUC. These metrics are formally defined as:
Accuracy: It quantifies the proportion of correctly classified instances relative to the total number of instances, as shown in Equation (2). Although accuracy is a key factor in determining the model’s overall performance, it is less informative, especially when the dataset is imbalanced.
Acc = (TP + TN) / (TP + TN + FP + FN)    (2)
Precision: Precision measures the fraction of correctly predicted instances of a given class among all instances predicted to belong to that class, as shown in Equation (3). High precision is critical for cyberbullying detection to avoid misclassifying benign content as cyberbullying.
P = TP / (TP + FP)    (3)
Recall: Recall evaluates the model’s ability to correctly retrieve all actual instances of a particular bullying class; see Equation (4). High recall ensures that harmful content, regardless of language, is effectively captured and not overlooked.
R = TP / (TP + FN)    (4)
F1: The F1-score is the harmonic mean of precision and recall, offering a balanced measure of performance across the majority and minority classes; see Equation (5). It is particularly informative in multilingual settings, as it accounts for imbalances across languages.
F1 = (2 × P × R) / (P + R)    (5)
AUC-ROC: The AUC-ROC (Area Under the Curve of the Receiver Operating Characteristic) measures the model’s discriminative ability across classes. For the multiclass case, we computed a one-vs-rest ROC curve for each class and aggregated the AUC values using both macro- and micro-averaging. A higher AUC-ROC score indicates superior performance in distinguishing cyberbullying from non-cyberbullying across all languages.
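The four count-based metrics follow directly from their definitions; as a quick reference, they can be written in plain Python over the confusion-matrix counts TP, TN, FP, and FN (function names are ours):

```python
# Equations (2)-(5), computed from confusion-matrix counts.
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f1_score(tp, fp, fn):
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)   # harmonic mean of precision and recall
```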

4.2. Experimental Setup

The experiments in this study were conducted on the Kaggle [79] platform using its free computational environment. The setup utilized two NVIDIA T4 GPUs (each with 16 GB of VRAM) operating in parallel under a dual-GPU configuration. Kaggle provides 30 GB of RAM and approximately 57.6 GB of disk space in its free environment, which was sufficient for training and evaluating the proposed model, ensuring reproducibility within publicly accessible computational resources. All experiments were executed in Python (v3.10) [80] with essential deep learning libraries, including PyTorch (v2.6.0+cu124) [81] and Transformers [82]; a summary of the experimental environment is available in Table 8.

4.3. Model Training Configurations and Results

The proposed framework was fine-tuned using carefully selected hyperparameters to ensure stable optimization and efficient convergence. The AdamW optimizer was chosen for its ability to handle sparse gradients while maintaining decoupled weight regularization. A layer-wise learning rate decay strategy was adopted, assigning smaller learning rates to lower layers and higher learning rates to upper layers, thereby balancing adaptability with the retention of pretrained linguistic features.
The encoder was trained with a learning rate of 7 × 10−6, while the classifier head used a slightly higher learning rate of 1 × 10−5. Gradient accumulation was used to simulate a larger effective batch size of 128 while maintaining GPU memory efficiency. To prevent overfitting, label smoothing was incorporated during training, and early stopping was also applied. Overall, the model was trained for up to 30 epochs (see Table 9).
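The two ingredients that are easiest to misimplement, layer-wise learning-rate decay and label smoothing, can be expressed compactly. This is a minimal sketch: the decay factor of 0.9 is an assumed illustrative value (the exact factor is not reported in the text), and the smoothing formula is the standard uniform-smoothing variant:

```python
import numpy as np

def layerwise_lrs(num_layers, top_lr=7e-6, decay=0.9):
    # Layer-wise LR decay: the top encoder layer trains at top_lr, and each
    # layer below it is scaled by `decay`, so pretrained lower layers move less.
    return [top_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]

def label_smoothed_ce(logits, target, eps=0.1):
    # Cross-entropy against a smoothed target distribution:
    # (1 - eps) on the true class, eps spread uniformly over all K classes.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    smooth = np.full(len(logits), eps / len(logits))
    smooth[target] += 1.0 - eps
    return float(-(smooth * np.log(probs)).sum())
```

In PyTorch, the per-layer rates map directly onto AdamW parameter groups (one `{'params': ..., 'lr': lr}` entry per layer), and `nn.CrossEntropyLoss(label_smoothing=0.1)` implements the same smoothing.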
The proposed model was trained for 30 epochs with a patience of 5 (based on validation accuracy). During training, the proposed model showed consistent and stable improvement across epochs under the gradual unfreezing strategy. Training accuracy improved steadily from 0.51 in the first epoch to approximately 0.89 in later epochs, while validation accuracy increased from 0.58 to above 0.86, indicating strong generalization capability (see Figure 5). Training and validation losses also decreased significantly during the early epochs, with more gradual refinement in the later stages as deeper layers were progressively unfrozen. Overall, the results confirm that the proposed training pipeline is stable, efficient, and effective at improving cross-lingual performance.
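The gradual unfreezing schedule can be captured in a single helper. The milestones below (warm-up epochs and layers released per stage) are illustrative assumptions, as the paper does not publish the exact schedule:

```python
def unfrozen_top_layers(epoch, num_layers=12, start_epoch=2, per_stage=2):
    # Gradual unfreezing: only the fusion/classifier head trains at first;
    # encoder layers are then released top-down, `per_stage` layers per epoch.
    # start_epoch and per_stage are assumed illustrative values.
    if epoch < start_epoch:
        return 0
    return min(num_layers, (epoch - start_epoch + 1) * per_stage)
```

Applying the schedule in PyTorch amounts to setting `requires_grad = True` on the parameters of the top `unfrozen_top_layers(epoch)` encoder layers at the start of each epoch, leaving the rest frozen.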
The final evaluation on the test set demonstrates strong, balanced performance across both classes (see Table 10). The model achieved an overall accuracy of ~0.86, confirming its ability to generalize beyond the training and validation distributions. Class-wise analysis shows that the model identifies Bullying content with a precision of 0.88 and an F1-score of 0.83, while achieving a precision of 0.85 and an F1-score of 0.88 for the Not_Bullying class. These results confirm that the proposed approach provides reliable performance for bilingual bullying detection and is suitable for deployment in real-world moderation scenarios.
To further analyze the model’s predictive ability, a confusion matrix is shown in Figure 6. The matrix provides a detailed breakdown of true and false predictions for both classes. For the Bullying class, the model correctly identified 1870 instances, while 474 samples were misclassified as Not_Bullying. For the Not_Bullying class, the model achieved 2687 correct predictions and 258 misclassifications. Overall, the confusion matrix confirms the model’s strong ability to distinguish between harmful and non-harmful text.
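The headline metrics can be re-derived directly from these confusion-matrix counts, treating Bullying as the positive class:

```python
# Counts read off the reported confusion matrix (Bullying = positive class).
tp, fn = 1870, 474   # Bullying correctly identified / missed
tn, fp = 2687, 258   # Not_Bullying correctly identified / flagged as Bullying

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"acc={accuracy:.2f} P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
# acc=0.86 P=0.88 R=0.80 F1=0.84
```

The derived values agree with the reported accuracy and class-wise precision, and the Bullying F1 of ~0.836 matches the reported 0.83 up to rounding.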
To further support the model’s generalization and discrimination capability, ROC curves were generated for both classes (Figure 7). The ROC curves illustrate how well the model separates bullying and non-bullying content across different classification thresholds. Both classes achieved an AUC of 0.929, indicating excellent predictive performance and a strong ability to distinguish between the two classes.

4.4. Comparison with Multilingual Models

To assess the robustness of the proposed approach, several well-known multilingual transformer models, including XLM-RoBERTa-Base [23,83], mdeberta-v3-base [84,85], and google-bert/bert-base-multilingual-cased [21,86], were fine-tuned in this study. These models are pre-trained on large-scale multilingual corpora, typically covering more than 100 languages. The same fine-tuning strategy used in the proposed approach was adopted for a fair comparison (see Table 11).
Although FusionBullyNet’s improvement over the multilingual baselines is modest in absolute terms, it is consistent across multiple evaluation metrics, including precision, recall, and F1-score. In sensitive applications such as cyberbullying detection, even small improvements are meaningful, as they reduce false negatives and contribute to safer online environments. Moreover, FusionBullyNet preserves language-specific features through attention-based fusion, which is particularly valuable in bilingual or cross-cultural settings where multilingual models may miss subtle abusive language.

4.5. Comparison with Machine Learning Models

To assess the robustness of the proposed approach, several well-known machine learning models, including Logistic Regression, Linear SVC, Multinomial Naive Bayes, Random Forest, Extra Trees, XGBoost, LightGBM, K-Nearest Neighbors, and SGD classifiers, were trained on TF-IDF features in this study. Logistic Regression outperformed the other machine learning models (see Table 12).
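A minimal sketch of this classical baseline pipeline is shown below; the toy bilingual examples and the uni-/bigram setting are illustrative stand-ins, not the study's actual corpus or configuration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy examples for illustration only (the study trains on its combined corpus).
texts = [
    "you are awesome",
    "shut up you idiot",
    "what a lovely day",
    "nobody likes you loser",
]
labels = [0, 1, 0, 1]  # 1 = bullying, 0 = not bullying

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),  # word uni- and bigram TF-IDF features
    LogisticRegression(max_iter=1000),
)
clf.fit(texts, labels)
pred = clf.predict(["you are a loser"])
```

The same pipeline shape applies to the other classifiers by swapping the final estimator (e.g., `LGBMClassifier` or `XGBClassifier`).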
Table 11 and Table 12 demonstrate that the proposed model outperforms multilingual transformers and classical machine learning models.

4.6. Comparison with State-of-the-Art Studies

A direct comparison with existing state-of-the-art studies is not fully feasible because most benchmark studies focus on monolingual datasets, either in English or in Arabic, whereas the current study addresses a bilingual cyberbullying detection task that handles both languages simultaneously. Consequently, differences in data composition and preprocessing pipelines limit the ability to perform a one-to-one comparison.
Table 13 summarizes representative models from the literature that employ transformer-based architecture for monolingual tasks. The reported metrics from existing studies serve as a baseline, highlighting that while language-specific models achieve strong performance within their native language, their generalization remains limited across languages. In contrast, the proposed framework maintains robust performance across both English and Arabic text.

4.7. Theoretical and Practical Implications

The proposed model enables more accurate detection of cyberbullying in English and Arabic contexts, addressing the key limitation of monolingual models. Its design can be easily extended to support other languages, enabling real-world applications in social media and digital safety systems. From a theoretical point of view, the proposed study demonstrates that bilingual fusion-focused integration of two linguistic encoders can outperform multilingual architectures. It highlights how cross-lingual attention and gradual unfreezing improve contextual adaptation and prevent knowledge degradation. This study can contribute to the advancement of bilingual NLP research in low-resource and sociolinguistic domains.

5. Conclusions

This study addresses the growing need for effective cyberbullying detection in multilingual online environments, particularly for English and Arabic content, which together represent a major portion of global social media communication. Existing approaches predominantly rely on monolingual datasets, limiting their applicability to real-world contexts where users frequently switch between languages. To overcome this limitation, we developed a bilingual cyberbullying detection framework that integrates multiple datasets, extensive preprocessing, and data augmentation with a multilingual text-generation LLM to build a diverse and balanced training corpus.
The proposed FusionBullyNet, combining RoBERTa-base and bert-base-arabertv02-twitter, demonstrated strong performance through attention-based fusion, gradual layer unfreezing, and label smoothing. These enhancements contributed to improved generalization and robustness, achieving a test accuracy of 0.86, F1-scores of 0.83 for bullying and 0.88 for not bullying, and ROC-AUC of 0.929. Comparative experiments with multilingual transformer models (XLM-RoBERTa-base, mdeberta-v3-base, and BERT-multilingual-cased), each achieving a test accuracy of 0.84, further validate the robustness of our proposed approach. Traditional machine learning models with TF-IDF were also trained, but their performance remains lower than that of the proposed approach.
Overall, the results demonstrate that the proposed model provides a scalable, high-accuracy solution for cyberbullying detection in bilingual contexts. The proposed study has two main limitations. First, the dual-encoder architecture introduces additional computational cost compared to single multilingual models. Second, treating cyberbullying detection as a binary task sacrifices fine-grained category distinctions. Future work may extend the proposed approach to additional languages, develop a lightweight model, incorporate category distinctions and conversational context, or explore multimodal signals, such as images and metadata, to further enhance cyberbullying detection across diverse online platforms.

Author Contributions

Conceptualization, M.A.M., M.A.A. and S.M.; Methodology, M.A.M., M.A.A. and S.M.; Validation, M.A.M., M.A.A. and S.M.; Formal analysis, M.A.M., M.A.A. and S.M.; Investigation, M.A.M., M.A.A. and S.M.; Resources, M.A.M., M.A.A. and S.M.; Data curation, M.A.M., M.A.A. and S.M.; Writing—original draft, M.A.M., M.A.A. and S.M.; Writing—review and editing, M.A.M., M.A.A. and S.M.; Visualization, M.A.M., M.A.A. and S.M. All authors contributed equally. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The original data presented in the study are openly available in Kaggle, Mendeley Data, and GitHub, at References [64,65,66,67,68].

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Mahdi, M.A.; Fati, S.M.; Ragab, M.G.; Hazber, M.A.G.; Ahamad, S.; Saad, S.A.; Al-Shalabi, M. A Novel Hybrid Attention-Based RoBERTa-BiLSTM Model for Cyberbullying Detection. Math. Comput. Appl. 2025, 30, 91. [Google Scholar] [CrossRef]
  2. Li, C.; Wang, P.; Martin-Moratinos, M.; Bella-Fernández, M.; Blasco-Fontecilla, H. Traditional bullying and cyberbullying in the digital age and its associated mental health problems in children and adolescents: A meta-analysis. Eur. Child Adolesc. Psychiatry 2024, 33, 2895–2909. [Google Scholar] [CrossRef]
  3. One in Six School-Aged Children Experiences Cyberbullying, Finds New WHO/Europe Study. Available online: https://www.who.int/europe/news/item/27-03-2024-one-in-six-school-aged-children-experiences-cyberbullying--finds-new-who-europe-study?utm_source=chatgpt.com (accessed on 12 September 2025).
  4. Kokkinos, C.M.; Baltzidis, E.; Xynogala, D. Prevalence and personality correlates of Facebook bullying among university undergraduates. Comput. Hum. Behav. 2016, 55, 840–850. [Google Scholar] [CrossRef]
  5. Ng, J.C.K.; Lin, E.S.S.; Lee, V.K.Y. Does Instagram make you speak ill of others or improve yourself? A daily diary study on the moderating role of malicious and benign envy. Comput. Hum. Behav. 2023, 148, 107873. [Google Scholar] [CrossRef]
  6. Vadysinghe, A.N.; Perera, I.; Wickramasinghe, C.; Darshika, S.; Ekanayake, K.B.; Thilakarathne, I.; Jayasooriya, D.; Wijesiriwardena, Y. Cyber-bullying among Sri Lankan children: Socio-demographic profile, psychosocial behavior pattern and impact of COVID-19 pandemic. In Review, preprint 2022. [Google Scholar] [CrossRef]
  7. Kowalski, R.M.; Limber, S.P. Psychological, Physical, and Academic Correlates of Cyberbullying and Traditional Bullying. J. Adolesc. Health 2013, 53, S13–S20. [Google Scholar] [CrossRef] [PubMed]
  8. Hinduja, S.; Patchin, J.W. Bullying, Cyberbullying, and Suicide. Arch. Suicide Res. 2010, 14, 206–221. [Google Scholar] [CrossRef]
  9. Mishna, F.; Saini, M.; Solomon, S. Ongoing and online: Children and youth’s perceptions of cyber bullying. Child. Youth Serv. Rev. 2009, 31, 1222–1228. [Google Scholar] [CrossRef]
  10. Forssell, R. Exploring cyberbullying and face-to-face bullying in working life–Prevalence, targets and expressions. Comput. Hum. Behav. 2016, 58, 454–460. [Google Scholar] [CrossRef]
  11. Messias, E.; Kindrick, K.; Castro, J. School bullying, cyberbullying, or both: Correlates of teen suicidality in the 2011 CDC youth risk behavior survey. Compr. Psychiatry 2014, 55, 1063–1068. [Google Scholar] [CrossRef]
  12. Patchin, J.W.; Hinduja, S. TWEEN CYBERBULLYING IN 2020; Cyberbullying Research Center: Jupiter, FL, USA, 2020; Available online: https://www.developmentaid.org/api/frontend/cms/file/2022/03/CN_Stop_Bullying_Cyber_Bullying_Report_9.30.20.pdf (accessed on 13 September 2025).
  13. Cuzzocrea, A.; Akter, M.S.; Shahriar, H.; Bringas, P.G. Cyberbullying Detection, Prevention, and Analysis on Social Media via Trustable LSTM-Autoencoder Networks over Synthetic Data: The TLA-NET Approach. Future Internet 2025, 17, 84. [Google Scholar] [CrossRef]
  14. Muneer, A.; Fati, S.M. A Comparative Analysis of Machine Learning Techniques for Cyberbullying Detection on Twitter. Future Internet 2020, 12, 187. [Google Scholar] [CrossRef]
  15. Setiawan, Y.; Maulidevi, N.U.; Surendro, K. The Optimization of n-Gram Feature Extraction Based on Term Occurrence for Cyberbullying Classification. Data Sci. J. 2024, 23, 31. [Google Scholar] [CrossRef]
  16. Setiawan, Y.; Gunawan, D.; Efendi, R. Feature Extraction TF-IDF to Perform Cyberbullying Text Classification: A Literature Review and Future Research Direction. In Proceedings of the 2022 International Conference on Information Technology Systems and Innovation, ICITSI 2022, Bandung, Indonesia, 8–9 November 2022; pp. 283–288. [Google Scholar] [CrossRef]
  17. Atoum, J.O. Cyberbullying Detection Neural Networks using Sentiment Analysis. In Proceedings of the 2021 International Conference on Computational Science and Computational Intelligence, CSCI, Las Vegas, NV, USA, 15–17 December 2021; pp. 158–164. [Google Scholar] [CrossRef]
  18. Dinakar, N.; Reichart, R.; Lieberman, H. Modeling the Detection of Textual Cyberbullying. Proc. Int. AAAI Conf. Web Soc. Media 2011, 5, 11–17. Available online: https://ojs.aaai.org/index.php/icwsm/article/view/14209 (accessed on 12 September 2025). [CrossRef]
  19. Salawu, S.; He, Y.; Lumsden, J. Approaches to Automated Detection of Cyberbullying: A Survey. IEEE Trans. Affect. Comput. 2020, 11, 3–24. [Google Scholar] [CrossRef]
  20. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. 2017, p. 1. Available online: https://arxiv.org/pdf/1706.03762 (accessed on 13 September 2025).
  21. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186. [Google Scholar] [CrossRef]
  22. Paul, S.; Saha, S. CyberBERT: BERT for cyberbullying identification. Multimed. Syst. 2020, 28, 1897–1904. [Google Scholar] [CrossRef]
  23. Conneau, A.; Khandelwal, K.; Goyal, N.; Chaudhary, V.; Wenzek, G.; Guzmán, F.; Grave, E.; Ott, M.; Zettlemoyer, L.; Stoyanov, V. Unsupervised Cross-lingual Representation Learning at Scale. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 8440–8451. [Google Scholar] [CrossRef]
  24. Philipo, A.G.; Sarwatt, D.S.; Ding, J.; Daneshmand, M.; Ning, H. Assessing Text Classification Methods for Cyberbullying Detection on Social Media Platform. IEEE Trans. Inf. Forensics Secur. 2024, 20, 7602–7616. [Google Scholar] [CrossRef]
  25. Azzeh, M.; Alhijawi, B.; Tabbaza, A.; Alabboshi, O.; Hamdan, N.; Jaser, D. Arabic cyberbullying detection system using convolutional neural network and multi-head attention. Int. J. Speech Technol. 2024, 27, 521–537. [Google Scholar] [CrossRef]
  26. Mahdi, M.A.; Fati, S.M.; Hazber, M.A.G.; Ahamad, S.; Saad, S.A. Enhancing Arabic Cyberbullying Detection with End-to-End Transformer Model. Comput. Model. Eng. Sci. 2024, 141, 1651–1671. [Google Scholar] [CrossRef]
  27. Aljalaoud, H.; Dashtipour, K.; Al-Dubai, A.Y. Arabic Cyberbullying Detection: A Comprehensive Review of Datasets and Methodologies. IEEE Access 2025, 13, 69021–69038. [Google Scholar] [CrossRef]
  28. Fang, Y.; Yang, S.; Zhao, B.; Huang, C. Cyberbullying Detection in Social Networks Using Bi-GRU with Self-Attention Mechanism. Information 2021, 12, 171. [Google Scholar] [CrossRef]
  29. Waseem, Z.; Hovy, D. Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter. In Proceedings of the NAACL-HLT 2016, San Diego, CA, USA, 12–17 June 2016; pp. 88–93. Available online: https://aclanthology.org/N16-2013.pdf (accessed on 9 September 2025).
  30. Davidson, T.; Warmsley, D.; Macy, M.; Weber, I. Automated Hate Speech Detection and the Problem of Offensive Language. Proc. Int. AAAI Conf. Web Soc. Media 2017, 11, 512–551. Available online: https://ojs.aaai.org/index.php/ICWSM/article/view/14955 (accessed on 9 September 2025). [CrossRef]
  31. Wulczyn, E.; Thain, N.; Dixon, L. Ex machina: Personal attacks seen at scale. In Proceedings of the 26th International Conference on World Wide Web, Perth, Australia, 3–7 April 2017; pp. 1391–1399. [Google Scholar] [CrossRef]
  32. Bharti, S.; Yadav, A.K.; Kumar, M.; Yadav, D. Cyberbullying detection from tweets using deep learning. Kybernetes 2022, 51, 2695–2711. [Google Scholar] [CrossRef]
  33. Dewani, A.; Memon, M.A.; Bhatti, S. Cyberbullying detection: Advanced preprocessing techniques & deep learning architecture for Roman Urdu data. J. Big Data 2021, 8, 160. [Google Scholar] [CrossRef] [PubMed]
  34. Gada, M.; Damania, K.; Sankhe, S. Cyberbullying Detection using LSTM-CNN architecture and its applications. In Proceedings of the 2021 International Conference on Computer Communication and Informatics, ICCCI, Coimbatore, India, 27–29 January 2021. [Google Scholar] [CrossRef]
  35. Ahmed, M.F.; Mahmud, Z.; Biash, Z.T.; Ryen, A.A.N.; Hossain, A.; Ashraf, F.B. Cyberbullying Detection Using Deep Neural Network from Social Media Comments in Bangla Language. 2021. Available online: https://arxiv.org/pdf/2106.04506 (accessed on 11 September 2025).
  36. Yi, P.; Zubiaga, A. Cyberbullying detection across social media platforms via platform-aware adversarial encoding. Proc. Int. AAAI Conf. Web Soc. Media 2022, 16, 1430–1434. [Google Scholar] [CrossRef]
  37. Joshi, R.; Gupta, A. Performance Comparison of Simple Transformer and Res-CNN-BiLSTM for Cyberbullying Classification. 2022. Available online: https://arxiv.org/pdf/2206.02206 (accessed on 11 September 2025).
  38. Tashtoush, Y.; Banysalim, A.; Maabreh, M.; Al-Eidi, S.; Karajeh, O.; Zahariev, P. A Deep Learning Framework for Arabic Cyberbullying Detection in Social Networks. Comput. Mater. Contin. 2025, 83, 3113–3134. [Google Scholar] [CrossRef]
  39. Fati, S.M.; Muneer, A.; Alwadain, A.; Balogun, A.O. Cyberbullying Detection on Twitter Using Deep Learning-Based Attention Mechanisms and Continuous Bag of Words Feature Extraction. Mathematics 2023, 11, 3567. [Google Scholar] [CrossRef]
  40. Akter, M.S.; Shahriar, H.; Cuzzocrea, A. A Trustable LSTM-Autoencoder Network for Cyberbullying Detection on Social Media Using Synthetic Data. 2023. Available online: https://arxiv.org/pdf/2308.09722 (accessed on 11 September 2025).
  41. Wahid, Z.; Al Imran, A. Multi-feature Transformer for Multiclass Cyberbullying Detection in Bangla. IFIP Adv. Inf. Commun. Technol. 2023, 675, 439–451. [Google Scholar] [CrossRef]
  42. Sihab-Us-Sakib, S.; Rahman, M.R.; Forhad, M.S.A.; Aziz, M.A. Cyberbullying detection of resource constrained language from social media using transformer-based approach. Nat. Lang. Process. J. 2024, 9, 100104. [Google Scholar] [CrossRef]
  43. Kumar, Y.; Huang, K.; Perez, A.; Yang, G.; Li, J.J.; Morreale, P.; Kruger, D.; Jiang, R. Bias and Cyberbullying Detection and Data Generation Using Transformer Artificial Intelligence Models and Top Large Language Models. Electronics 2024, 13, 3431. [Google Scholar] [CrossRef]
  44. Kaddoura, S.; Nassar, R. Language Model-Based Approach for Multiclass Cyberbullying Detection. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Singapore, 2025; Volume 15437, pp. 78–89. [Google Scholar] [CrossRef]
  45. Gutiérrez-Batista, K.; Gómez-Sánchez, J.; Fernandez-Basso, C. Improving automatic cyberbullying detection in social network environments by fine-tuning a pre-trained sentence transformer language model. Soc. Netw. Anal. Min. 2024, 14, 136. [Google Scholar] [CrossRef]
  46. Sui, J. Understanding and Fighting Bullying with Machine Learning-UWDC-UW-Madison Libraries. Available online: https://search.library.wisc.edu/digital/ARXXPUFZWBRX4R9C (accessed on 11 September 2025).
  47. Bayzick, J.; Kontostathis, A.; Edwards, L. Detecting the presence of cyberbullying using computer software. In Proceedings of the 3rd International Web Science Conference WebSci’11, Koblenz, Germany, 15–17 June 2011; pp. 93–96. [Google Scholar]
  48. Nath, S.S.; Karim, R.; Miraz, M.H. Deep Learning Based Cyberbullying Detection in Bangla Language. Ann. Emerg. Technol. Comput. 2024, 8, 50–65. [Google Scholar] [CrossRef]
  49. Kingma, D.P.; Ba, J.L. Adam: A Method for Stochastic Optimization. arXiv 2015, arXiv:1412.6980. Available online: https://arxiv.org/abs/1412.6980 (accessed on 1 December 2025).
  50. Kumar, D. A Hybrid DeBERTa and Gated Broad Learning System for Cyberbullying Detection in English Text. 2025. Available online: https://arxiv.org/pdf/2506.16052 (accessed on 11 September 2025).
  51. Mathew, B.; Saha, P.; Yimam, S.M.; Biemann, C.; Goyal, P.; Mukherjee, A. HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection. Proc. AAAI Conf. Artif. Intell. 2021, 35, 14867–14875. [Google Scholar] [CrossRef]
  52. Wang, J.; Fu, K.; Lu, C.T. SOSNet: A Graph Convolutional Network Approach to Fine-Grained Cyberbullying Detection. In Proceedings of the 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA, 10–13 December 2020; pp. 1699–1708. [Google Scholar] [CrossRef]
  53. Ejaz, N.; Choudhury, S.; Razi, F. A Comprehensive Dataset for Automated Cyberbullying Detection. Mendeley Data 2024, 2. [Google Scholar] [CrossRef]
  54. Kumar, D. Cyberbullying Detection in Hinglish Text Using MURIL and Explainable AI. 2025. Available online: https://arxiv.org/pdf/2506.16066 (accessed on 11 September 2025).
  55. Bohra, A.; Vijay, D.; Singh, V.; Akhtar, S.S.; Shrivastava, M. A Dataset of Hindi-English Code-Mixed Social Media Text for Hate Speech Detection. 2018, pp. 36–41. Available online: https://github.com/deepanshu1995/HateSpeech-Hindi-English-Code-Mixed-Social-Media-Text (accessed on 11 September 2025).
  56. Maity, K.; Jain, R.; Jha, P.; Saha, S. Explainable Cyberbullying Detection in Hinglish: A Generative Approach. IEEE Trans. Comput. Soc. Syst. 2024, 11, 3338–3347. [Google Scholar] [CrossRef]
  57. Maity, K.; Saha, S.; Bhattacharyya, P. Emoji, Sentiment and Emotion Aided Cyberbullying Detection in Hinglish. IEEE Trans. Comput. Soc. Syst. 2023, 10, 2411–2420. [Google Scholar] [CrossRef]
  58. Mandl, T.; Modha, S.; Shahi, G.K.; Madhu, H.; Satapara, S.; Majumder, P.; Schäfer, J.; Ranasinghe, T.; Zampieri, M.; Nandini, D.; et al. Overview of the HASOC Subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages. 2021. Available online: https://arxiv.org/pdf/2112.09301 (accessed on 11 September 2025).
  59. Kaware, P. Indo-HateSpeech. Mendeley Data 2024, 1. [Google Scholar] [CrossRef]
  60. Ojha, A.K. Benchmarking Aggression Identification in Social Media. 2018. Available online: https://research.universityofgalway.ie/en/publications/benchmarking-aggression-identification-in-social-media (accessed on 11 September 2025).
  61. Prama, T.T.; Amrin, J.F.; Anwar, M.M.; Sarker, I.H. AI Enabled User-Specific Cyberbullying Severity Detection with Explainability. 2025. Available online: https://arxiv.org/pdf/2503.10650 (accessed on 12 September 2025).
  62. Purkayastha, B.S.; Rahman, M.M.; Talukdar, M.T.I.; Shahpasand, M. Advancing Cyberbullying Detection: A Hybrid Machine Learning and Deep Learning Framework for Social Media Analysis. In Proceedings of the 27th International Conference on Enterprise Information Systems, Porto, Portugal, 4–6 April 2025; Volume 2, pp. 348–355. [Google Scholar] [CrossRef]
  63. Eissa, A.M.; Guirguis, S.K.; Madbouly, M.M. An optimized Arabic cyberbullying detection approach based on genetic algorithms. Sci. Rep. 2025, 15, 38479. [Google Scholar] [CrossRef] [PubMed]
  64. Cyberbullying Classification. Available online: https://www.kaggle.com/datasets/andrewmvd/cyberbullying-classification (accessed on 16 September 2025).
  65. Cyberbullying Dataset. Available online: https://www.kaggle.com/datasets/saurabhshahane/cyberbullying-dataset (accessed on 22 October 2025).
  66. ArbCyD: Arabic Cyberbullying Dataset. Available online: https://www.kaggle.com/datasets/monarasheedalroqi/arbcyd-arabic-cyberbullying-dataset (accessed on 16 September 2025).
  67. Shannag, F. ArCyC: A Fully Annotated Arabic Cyberbullying Corpus. Mendeley Data 2023, 1. [Google Scholar] [CrossRef]
  68. GitHub-Omammar167/Arabic-Abusive-Datasets: Available Arabic abusive and cyber bullying Datasets. Available online: https://github.com/omammar167/Arabic-Abusive-Datasets (accessed on 22 October 2025).
  69. Hegazi, M.O.; Al-Dossari, Y.; Al-Yahy, A.; Al-Sumari, A.; Hilal, A. Preprocessing Arabic text on social media. Heliyon 2021, 7, e06191. [Google Scholar] [CrossRef]
  70. Acs, J.; Hamerlik, E.; Schwartz, R.; Smith, N.A.; Kornai, A. Morphosyntactic Probing of Multilingual BERT Models. Nat. Lang. Eng. 2023, 30, 753–792. [Google Scholar] [CrossRef]
  71. Alkaoud, M.; Syed, M. On the Importance of Tokenization in Arabic Embedding Models. 2020, pp. 119–129. Available online: https://github.com/attardi/wikiextractor (accessed on 17 September 2025).
  72. Miyajiwala, A.; Ladkat, A.; Jagadale, S.; Joshi, R. On Sensitivity of Deep Learning Based Text Classification Algorithms to Practical Input Perturbations. Available online: https://arxiv.org/abs/2201.00318 (accessed on 1 December 2025).
  73. Turki, T.; Roy, S.S. Novel Hate Speech Detection Using Word Cloud Visualization and Ensemble Learning Coupled with Count Vectorizer. Appl. Sci. 2022, 12, 6611. [Google Scholar] [CrossRef]
  74. Muennighoff, N.; Wang, T.; Sutawika, L.; Roberts, A.; Biderman, S.; Le Scao, T.; Bari, M.S.; Shen, S.; Yong, Z.X.; Schoelkopf, H.; et al. Crosslingual Generalization through Multitask Finetuning. Proc. Annu. Meet. Assoc. Comput. Linguist. 2023, 1, 15991–16111. [Google Scholar] [CrossRef]
  75. Bigscience/mt0-Base Hugging Face. Available online: https://huggingface.co/bigscience/mt0-base (accessed on 14 October 2025).
  76. Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A Robustly Optimized BERT Pretraining Approach. 2019. Available online: https://arxiv.org/pdf/1907.11692 (accessed on 14 December 2025).
  77. FacebookAI/Roberta-Base Hugging Face. Available online: https://huggingface.co/FacebookAI/roberta-base (accessed on 14 December 2025).
  78. Antoun, W.; Baly, F.; Hajj, H. AraBERT: Transformer-Based Model for Arabic Language Understanding. 2020. Available online: https://arxiv.org/pdf/2003.00104 (accessed on 14 December 2025).
  79. Kaggle: Your Home for Data Science. Available online: https://www.kaggle.com/ (accessed on 6 October 2025).
  80. Welcome to Python.org. Available online: https://www.python.org/ (accessed on 6 October 2025).
  81. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. Adv. Neural Inf. Process Syst. 2019, 32. Available online: https://arxiv.org/pdf/1912.01703 (accessed on 6 October 2025).
  82. Wolf, T.; Debut, L.; Sanh, V.; Chaumond, J.; Delangue, C.; Moi, A.; Cistac, P.; Rault, T.; Louf, R.; Funtowicz, M.; et al. HuggingFace’s Transformers: State-of-the-Art Natural Language Processing. 2019. Available online: https://arxiv.org/pdf/1910.03771 (accessed on 6 October 2025).
  83. FacebookAI/Xlm-Roberta-Base Hugging Face. Available online: https://huggingface.co/FacebookAI/xlm-roberta-base (accessed on 14 December 2025).
  84. He, P.; Gao, J.; Chen, W. DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing. Available online: https://arxiv.org/pdf/2111.09543 (accessed on 14 December 2025).
  85. Microsoft/mDeberta-v3-Base Hugging Face. Available online: https://huggingface.co/microsoft/mdeberta-v3-base (accessed on 14 December 2025).
  86. Google-Bert/Bert-Base-Multilingual-Cased Hugging Face. Available online: https://huggingface.co/google-bert/bert-base-multilingual-cased (accessed on 14 December 2025).
Figure 1. Distribution of Arabic and English text samples across cyberbullying and non-cyberbullying classes before preprocessing.
Figure 2. Distribution of Arabic and English text samples across cyberbullying and non-cyberbullying classes after preprocessing.
Figure 3. Word clouds of the English and Arabic datasets before preprocessing.
Figure 4. Word clouds of the English and Arabic datasets after preprocessing.
Figure 5. Proposed Model Learning Graphs.
Figure 6. FusionBullyNet Model Confusion Matrix.
Figure 7. Proposed FusionBullyNet Model ROC Curves.
Table 1. Text Cyberbullying Detection Literature Review Summary.
Study | Language/Dataset | Architecture | Results
Fang et al. (2021) [28] | English (social media posts) | Bi-GRU with self-attention | Weighted average precisions of 0.849, 0.961, and 0.943 across all three datasets ([29,30,31])
Bharti et al. (2021) [32] | English (tweets) | GloVe embeddings + Bi-LSTM | Accuracy 92.60%
Dewani et al. (2021) [33] | Roman Urdu (custom dataset) | Hybrid RNN (LSTM/Bi-LSTM) + CNN with preprocessing | Accuracy of 85.5%
Gada et al. (2021) [34] | English | LSTM + CNN hybrid | Accuracy of 95.2%
Ahmed et al. (2021) [35] | Bengali | Deep neural + ensemble for multiclass | Accuracy of 85% for multiclass
Yi et al. (2022) [36] | English (3 platforms) | Adversarial encoding + Transformers | Average macro F1 0.693 (69.30%)
Joshi et al. (2022) [37] | English | Simple Transformer vs. hybrid Res-CNN-BiLSTM | Training and validation accuracy are much better in the simple transformer case
Tashtoush et al. (2022) [38] | Arabic YouTube comments | CNN, LSTM, Bi-LSTM, CNN-LSTM hybrid | Binary classification: CNN & CNN-LSTM ≈ 91.9% accuracy; multiclass: LSTM & Bi-LSTM ≈ 89.5% accuracy
Fati et al. (2023) [39] | English (Twitter) | Attention + continuous bag of words | Accuracy 94.49%
Akter et al. (2023) [40] | English, Bangla, Hindi | LSTM-Autoencoder + synthetic data | ≈95% across datasets
Wahid et al. (2023) [41] | Bangla | Transformer + lexical + contextual + semantic features | 98% for threats and 90% for sarcastic
Sihab-Us-Sakib et al. (2024) [42] | Bengali | Fine-tuned Transformer | Accuracy 82.61%
Kumar et al. (2024) [43] | English (Twitter + synthetic data) | Multiple Transformers | Effective detection of bias and cyberbullying
Kaddoura et al. (2024) [44] | English | BERT vs. LLM | Accuracy 83.67%
Gutiérrez-Batista et al. (2024) [45] | English (BullyingV3.0 [46], MySpace [47], Hate-Speech [29]) | Pre-trained sentence Transformer, fine-tuned | ~83% and ~97% across datasets
Nath et al. (2024) [48] | Bengali | Two-layer Bi-LSTM | 95.08% accuracy
Kumar (2025) [50] | English (HateXplain [51], SOSNet [52], Mendeley (Mendeley-I and Mendeley-II) [53]) | Hybrid DeBERTa + GBLS | 79.3–95.41% accuracy
Kumar (2025) [54] | Hinglish (Bohra [55], BullyExplain [56], BullySentemo [57], HASOC-2021 [58], Mendeley [59], and Kumar dataset [60]) | MURIL Transformer | 83.92–94.63% accuracy
Prama et al. (2025) [61] | English | LSTM + user-specific data | 98% accuracy
Purkayastha et al. (2025) [62] | Twitter dataset (binary) | ML (RF, XGB) vs. DL (Bi-LSTM, BiGRU, BERT) | 92% accuracy
Eisa et al. (2025) [63] | Arabic | SVM + GA-FS (Genetic Algorithm-based feature selection) | Accuracy after GA-FS: 71.76%
Table 2. Cyberbullying dataset samples and associated class before preprocessing.

| Text | Language | Cyberbullying |
|---|---|---|
| "مابظن في أحلى وأعذب من صوتك" | Arabic | No |
| "رضا عبعال من ازبل الشخصيات اللي ممكن تسمعها او تضيع وقتك علشان تسمعه حتي لو حيضحكك شخصيه بنت ١٠٠تين كلب واللي بيطلعوه ع الشاشات اوسخ منه الواحد قرف من الاشكال المعفنه دي اللي الكوره عملت منهم بني ادمين ودوقتهم الفلوس وهما في الاصل كانوا حثاله متلقحه في كوم زباله" | Arabic | Yes |
| "Just discovered what happens when jeans are too big while wearing over a swim. Whelp. Time to make a belt." | English | No |
| "Only person that's still being a dumb dickwad is D'Undre. But fuck him. Big lip, buttfucking nigger." | English | Yes |
Table 3. Parameters and settings used for data augmentation using a language model.

| Parameter | Description/Value |
|---|---|
| Language Model | bigscience/mt0-base |
| Augmentation Type | Multilingual paraphrasing |
| Input Languages | English, Arabic |
| Generation Mode | Batch-wise (batch size 32) |
| Maximum Input Length | 128 tokens |
| Maximum Output Length | 128 tokens |
| Decoding Strategy | Beam search (num_beams = 4), top-p sampling (top_p = 0.95), temperature = 0.9 |
| Impact | Increased training samples and balanced class distribution across all languages |
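The augmentation settings above can be sketched with the Hugging Face transformers API. The model name and decoding values come from Table 3; the "Paraphrase: " prompt template and the helper names are illustrative assumptions, since the paper does not publish its augmentation script.

```python
# Sketch of the Table 3 paraphrase augmentation. The transformers import is
# kept inside the helper so the decoding settings can be inspected without
# downloading the model.

def generation_kwargs():
    """Decoding settings from Table 3."""
    return dict(
        max_length=128,    # maximum output length (tokens)
        num_beams=4,       # beam search
        do_sample=True,    # enables top-p sampling
        top_p=0.95,
        temperature=0.9,
    )

def paraphrase_batch(texts, model_name="bigscience/mt0-base"):
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    inputs = tokenizer(
        ["Paraphrase: " + t for t in texts],  # English and Arabic inputs alike
        return_tensors="pt", padding=True,
        truncation=True, max_length=128,      # maximum input length (tokens)
    )
    outputs = model.generate(**inputs, **generation_kwargs())
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

In the paper, minority-class samples are paraphrased batch-wise (batch size 32) until the classes are balanced per language, which yields the post-augmentation counts in Table 4.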
Table 4. Dataset Distribution by Language and Cyberbullying Label Before and After Augmentation.

| Split | Language | Non-Cyberbullying | Cyberbullying | Total Samples |
|---|---|---|---|---|
| Train (Before) | Arabic | 11,993 | 6,700 | 18,693 |
| Train (Before) | English | 11,749 | 11,905 | 23,654 |
| Train (After) | Arabic | 11,993 | 11,993 | 23,986 |
| Train (After) | English | 11,993 | 11,993 | 23,986 |
| Validation | Arabic | 1,499 | 838 | 2,337 |
| Validation | English | 1,468 | 1,488 | 2,956 |
| Test | Arabic | 1,499 | 837 | 2,336 |
| Test | English | 1,469 | 1,489 | 2,958 |
Table 5. Model Architecture Summary.

| Component | Description |
|---|---|
| Encoders | RoBERTa-base (12 layers) / aubmindlab/bert-base-arabertv02-twitter (12 layers) |
| Projection | Linear transformation to shared latent dimension (1024) |
| Fusion | Multi-head attention (4 heads) |
| Classifier | Dropout (0.4), dense layer (ReLU) |
| Parallelization | Dual-GPU setup using the Accelerate framework |
| Precision | Automatic Mixed Precision (AMP) |
| Loss Function | Weighted cross-entropy + label smoothing (0.1) |
| Optimizer | AdamW with weight decay of 0.01 |
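The architecture summarized above can be sketched as a small PyTorch module. The two pretrained encoders are abstracted to their pooled 768-dimensional outputs (the hidden size of both base models); the latent dimension, head count, and dropout follow Table 5, while the exact fusion wiring (query/key/value arrangement and pooling) and the `FusionSketch` name are assumptions for illustration.

```python
import torch
import torch.nn as nn

class FusionSketch(nn.Module):
    """Minimal sketch of the dual-encoder fusion head in Table 5, taking
    pooled encoder outputs rather than running the encoders themselves."""
    def __init__(self, enc_dim=768, latent=1024, heads=4, n_classes=2):
        super().__init__()
        # Project each encoder output to the shared latent dimension (1024)
        self.proj_en = nn.Linear(enc_dim, latent)
        self.proj_ar = nn.Linear(enc_dim, latent)
        # Multi-head attention fusion (4 heads)
        self.fusion = nn.MultiheadAttention(latent, num_heads=heads,
                                            batch_first=True)
        # Classifier: dropout (0.4) + dense layer with ReLU
        self.classifier = nn.Sequential(
            nn.Dropout(0.4), nn.Linear(latent, latent), nn.ReLU(),
            nn.Linear(latent, n_classes))

    def forward(self, en_vec, ar_vec):
        # Stack the two projected views as a length-2 "sequence" and let
        # attention fuse them, then mean-pool before classification.
        seq = torch.stack([self.proj_en(en_vec), self.proj_ar(ar_vec)], dim=1)
        fused, _ = self.fusion(seq, seq, seq)
        return self.classifier(fused.mean(dim=1))

logits = FusionSketch()(torch.randn(2, 768), torch.randn(2, 768))
```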
Table 6. Fine-Tuning and Optimization Techniques.

| Technique | Purpose | Configuration |
|---|---|---|
| Layer-wise LR Decay (LLRD) | Preserves low-level linguistic features while adapting deeper layers | Decay = 0.9 |
| Label Smoothing | Improves model calibration and mitigates overfitting | ε = 0.1 |
| Gradient Accumulation | Simulates a larger effective batch size with limited GPU memory | 2 steps → effective batch 128 |
| Automatic Mixed Precision (AMP) | Reduces memory use and increases throughput | Enabled (autocast, GradScaler) |
| Early Stopping | Prevents over-training when validation loss plateaus | Patience = 5 |
| Gradient Clipping | Stabilizes training by preventing exploding gradients | Max norm = 0.8 |
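Two of these settings translate directly into code. The sketch below builds layer-wise LR decay (factor 0.9, Table 6) as AdamW parameter groups, and a class-weighted cross-entropy loss with label smoothing ε = 0.1 (Tables 5 and 6); the stand-in linear layers and the class weights [1.2, 0.8] are illustrative assumptions.

```python
import torch
import torch.nn as nn

def llrd_param_groups(encoder_layers, base_lr=7e-6, decay=0.9):
    """Layer-wise LR decay: the deepest layer keeps base_lr and each
    earlier layer is scaled down by a further factor of `decay`."""
    n = len(encoder_layers)
    return [
        {"params": layer.parameters(),
         "lr": base_lr * (decay ** (n - 1 - i))}  # layer 0 decays the most
        for i, layer in enumerate(encoder_layers)
    ]

# Stand-in for a 12-layer transformer encoder
layers = nn.ModuleList([nn.Linear(8, 8) for _ in range(12)])
opt = torch.optim.AdamW(llrd_param_groups(layers), weight_decay=0.01)

# Weighted cross-entropy with label smoothing eps = 0.1
loss_fn = nn.CrossEntropyLoss(weight=torch.tensor([1.2, 0.8]),
                              label_smoothing=0.1)
loss = loss_fn(torch.randn(4, 2), torch.tensor([0, 1, 1, 0]))
```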
Table 7. Gradually Unfreezing Configuration.

| Encoder | Total Layers | Initially Frozen | Initially Trainable | Layers Unfrozen per Step | Patience |
|---|---|---|---|---|---|
| English (RoBERTa-base) | 12 | 10 | 2 | +2 | 2 |
| Arabic (aubmindlab/bert-base-arabertv02-twitter) | 12 | 10 | 2 | +2 | 2 |
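The schedule in Table 7 can be sketched as follows: start with only the top 2 of 12 layers trainable, then unfreeze 2 more each time validation performance stalls for 2 evaluations. The stand-in layers and helper names are illustrative; only the counts come from the table.

```python
import torch.nn as nn

def set_trainable(layers, n_trainable):
    """Freeze all but the top n_trainable layers."""
    for i, layer in enumerate(layers):
        requires = i >= len(layers) - n_trainable
        for p in layer.parameters():
            p.requires_grad = requires

def unfreeze_step(layers, current, step=2):
    """Called when patience (2 evals without improvement) is exhausted."""
    current = min(current + step, len(layers))
    set_trainable(layers, current)
    return current

# Stand-in for a 12-layer encoder
layers = nn.ModuleList([nn.Linear(4, 4) for _ in range(12)])
set_trainable(layers, 2)               # initially: 10 frozen, 2 trainable
trainable = unfreeze_step(layers, 2)   # after one step: top 4 layers train
```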
Table 8. Hardware and resource specifications during model training on Kaggle.

| Component | Specification |
|---|---|
| GPU | 2 × NVIDIA T4 |
| GPU Memory (1st GPU) | 15 GB |
| GPU Memory (2nd GPU) | 15 GB |
| System RAM | 30 GB |
| Disk Storage | 57.6 GB |
| Operating Platform | Kaggle free tier |
| Programming Language | Python (v3.10) |
Table 9. Model Training Hyperparameters.

| Training Hyperparameter | Value/Setting |
|---|---|
| Optimizer | AdamW |
| Encoder Learning Rate | 7 × 10⁻⁶ |
| Classifier Learning Rate | 1 × 10⁻⁵ |
| Batch Size | 64 (effective 128 with accumulation) |
| Scheduler | Linear decay with 15% warmup |
| Loss Function | Cross-entropy with label smoothing (ε = 0.1) |
| Epochs | 30 |
| Early Stopping | Patience = 5 |
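The learning-rate schedule in Table 9 (linear decay with 15% warmup) can be written with a plain `LambdaLR` rather than a library helper; the total step count here is illustrative, and only the warmup fraction and encoder learning rate follow the table.

```python
import torch

def warmup_linear(total_steps, warmup_frac=0.15):
    """LR multiplier: ramps 0 -> 1 over the warmup steps, then decays
    linearly back to 0 over the remaining steps."""
    warmup = int(total_steps * warmup_frac)
    def factor(step):
        if step < warmup:
            return step / max(1, warmup)
        return max(0.0, (total_steps - step) / max(1, total_steps - warmup))
    return factor

param = torch.nn.Parameter(torch.zeros(1))
opt = torch.optim.AdamW([param], lr=7e-6, weight_decay=0.01)  # encoder LR
sched = torch.optim.lr_scheduler.LambdaLR(opt, warmup_linear(total_steps=1000))
```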
Table 10. Class-wise Performance of the Proposed FusionBullyNet.

| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Bullying | 0.88 | 0.80 | 0.83 | 2326 |
| Not_Bullying | 0.85 | 0.91 | 0.88 | 2968 |
| Macro Avg | 0.86 | 0.85 | 0.86 | 5294 |
| Weighted Avg | 0.86 | 0.86 | 0.86 | 5294 |

Overall Accuracy = 0.86.
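As a quick consistency check, the macro and weighted averages in Table 10 follow directly from the class-wise F1 scores and supports:

```python
# Recomputing Table 10's averages from the class-wise scores.
f1 = {"Bullying": 0.83, "Not_Bullying": 0.88}
support = {"Bullying": 2326, "Not_Bullying": 2968}

macro_f1 = sum(f1.values()) / len(f1)          # 0.855, reported as 0.86
weighted_f1 = (sum(f1[c] * support[c] for c in f1)
               / sum(support.values()))        # ~0.858, reported as 0.86
```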
Table 11. Comparison with Identical Multilingual Models.

| Model | Accuracy | Weighted Precision | Weighted Recall | Weighted F1 |
|---|---|---|---|---|
| XLM-RoBERTa-base | 0.84 | 0.84 | 0.84 | 0.84 |
| mdeberta-v3-base | 0.84 | 0.85 | 0.84 | 0.84 |
| google-bert/bert-base-multilingual-cased | 0.84 | 0.84 | 0.84 | 0.84 |
| Proposed FusionBullyNet | 0.86 | 0.86 | 0.86 | 0.86 |
Table 12. Comparison with Machine Learning Models using TF-IDF.

| Model | Accuracy | Weighted Precision | Weighted Recall | Weighted F1 |
|---|---|---|---|---|
| Logistic Regression | 0.82 | 0.82 | 0.82 | 0.81 |
| Linear SVC | 0.81 | 0.81 | 0.81 | 0.81 |
| Multinomial NB | 0.76 | 0.76 | 0.76 | 0.76 |
| Random Forest Classifier | 0.81 | 0.82 | 0.81 | 0.81 |
| Extra Trees Classifier | 0.81 | 0.81 | 0.81 | 0.81 |
| XGBoost Classifier | 0.82 | 0.83 | 0.82 | 0.81 |
| LightGBM Classifier | 0.82 | 0.83 | 0.82 | 0.81 |
| KNeighbors Classifier | 0.59 | 0.70 | 0.59 | 0.47 |
| SGD Classifier | 0.81 | 0.82 | 0.81 | 0.81 |
| Proposed FusionBullyNet | 0.86 | 0.86 | 0.86 | 0.86 |
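A minimal scikit-learn sketch of the TF-IDF baselines above, shown with Logistic Regression; the paper does not specify its vectorizer settings, so defaults are used, and the toy data is purely illustrative.

```python
# TF-IDF + Logistic Regression baseline sketch (cf. Table 12).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy data: label 1 = bullying, 0 = not bullying
texts = ["you are great", "you are an idiot",
         "nice photo", "nobody likes you"]
labels = [0, 1, 0, 1]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(texts, labels)
preds = clf.predict(texts)
```

Swapping `LogisticRegression` for `LinearSVC`, `MultinomialNB`, or a tree ensemble reproduces the rest of the baseline family in Table 12.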
Table 13. Comparison with state-of-the-art monolingual studies.

| Study | Language | Model | Result |
|---|---|---|---|
| Bharti et al. (2021) [32] | English | GloVe embeddings + Bi-LSTM | Accuracy 92.60% |
| Kaddoura et al. (2024) [44] | English | BERT vs. LLM | Accuracy 83.67% |
| Purkayastha et al. (2025) [62] | English | ML (RF, XGB) vs. DL (Bi-LSTM, BiGRU, BERT) | Accuracy 92% |
| Ahmed et al. (2021) [35] | Bengali | Deep neural network + ensemble for multiclass | Accuracy 85% |
| Sihab-Us-Sakib et al. (2024) [42] | Bengali | Fine-tuned Transformer | Accuracy 82.61% |
| Tashtoush et al. (2022) [38] | Arabic | CNN, LSTM, Bi-LSTM, CNN-LSTM hybrid | Binary: CNN and CNN-LSTM ≈ 91.9% accuracy; multiclass: LSTM and Bi-LSTM ≈ 89.5% accuracy |
| Eisa et al. (2025) [63] | Arabic | SVM with Genetic Algorithm-based feature selection (GA-FS) | Accuracy after GA-FS: 71.76% |
| Proposed FusionBullyNet | English + Arabic | Transformer models + fusion | Accuracy 86% |

Share and Cite

MDPI and ACS Style

Mahdi, M.A.; Arshed, M.A.; Mumtaz, S. FusionBullyNet: A Robust English—Arabic Cyberbullying Detection Framework Using Heterogeneous Data and Dual-Encoder Transformer Architecture with Attention Fusion. Mathematics 2026, 14, 170. https://doi.org/10.3390/math14010170
