Article

Adversarial Robustness in Cognitive Systems: A Trustworthiness Assessment Perspective for 6G Networks

by Ilias Alexandropoulos, Harilaos Koumaras *, Vasiliki Rentoula, Gerasimos Papanikolaou-Ntais, Spyridon Georgoulas and George Makropoulos
National Centre of Scientific Research “DEMOKRITOS”, Institute of Informatics and Telecommunications, 15341 Athens, Greece
* Author to whom correspondence should be addressed.
Electronics 2025, 14(11), 2285; https://doi.org/10.3390/electronics14112285
Submission received: 28 April 2025 / Revised: 25 May 2025 / Accepted: 3 June 2025 / Published: 4 June 2025
(This article belongs to the Special Issue Recent Advances and Challenges in IoT, Cloud and Edge Coexistence)

Abstract
As B5G systems evolve toward 6G, their coordination increasingly relies on AI-driven automation and orchestration actions, a process characterized as cognition. Through this cognitive process, a 6G system acts as an intent-handling entity that comprehends sophisticated intent semantics from users/tenants, calculates the ideal goal state for each intent, and organizes the adaptation actions needed to transition the system into that state. However, the use of cognitive AI models to coordinate a 6G system creates new risks, as a new attack surface emerges in which the whole 6G system operation may be maliciously affected by adversarial attacks embedded in the user intents. Focusing on this challenge, this paper realizes a prototype cognitive coordinator for 6G trustworthiness provision and investigates its adversarial robustness for different BERT-based quantification models used to realize the 6G cognitive system.

1. Introduction

Nowadays, AI can be employed in almost every aspect, segment, and domain of a mobile network, enabling automated network operation and user service support [1]. The 6G network architecture will provide fully end-to-end Machine Learning (ML) and model accessibility, enabling autonomic networking through AI/ML capabilities. Different AI/ML algorithms (e.g., supervised, unsupervised, federated, or reinforcement learning) contribute differently to improving the coordination of resource and service orchestration in 6G systems. Thus, different AI/ML techniques help a 6G system realize end-to-end orchestration by bringing together the enabling technology domains that deliver the expected 6G KPIs [2].
Among the various candidate AI/ML techniques [3], the cognitive approach is regarded as the most dominant, since it is capable of managing and synchronizing both the control and the data planes. Cognition is the mental action or process of acquiring information and understanding through experience and the senses. An autonomous system is a technical application of cognition, since it is designed to perform the operational tasks of understanding by experiencing and sensing. Thus, a 6G cognitive system becomes an intent-handling coordinating function that comprehends sophisticated and abstract intent semantics, calculates the ideal system goal state, and organizes activities to transition the system into this trustworthy state.
Ensuring the robustness of cognitive coordination systems in 6G networks is paramount to the overall reliability, security, and operational continuity of next-generation communication infrastructures. In this context, “robustness” is defined as the intrinsic ability of the cognitive system to preserve its intended functionality under adversarial perturbations, environmental uncertainties, and dynamic operational stresses. Mathematically, robustness can be characterized by the system’s resistance to adversarial attacks, its capacity to generalize effectively to out-of-distribution inputs, and its resilience in the presence of model degradation or external disturbances. These criteria are operationalized through metrics such as score deviation, regression error, and perturbation coherence, which collectively quantify the stability and accuracy of a system when exposed to adversarial scenarios. This rigorous definition aligns with established frameworks in robust machine learning and secure federated learning, which emphasize not only error minimization under nominal conditions but also the maintenance of performance under worst-case adversarial perturbations.
In addition to traditional notions of adversarial robustness, the 6G cognitive coordinator must also demonstrate adaptive resilience to support dynamic trust-based orchestration across both the control and data planes. As these systems increasingly rely on AI, they become susceptible to adversarial black-box attacks. Such attacks can undermine the integrity and reliability of the network, leading to potential vulnerabilities that can be exploited by malicious entities. Hence, it is crucial to design cognitive models that can withstand these adversarial threats, maintaining the smooth and safe coordination of 6G networks. Due to this and in line with other similar works, the term cognitive coordinator is used in the rest of the paper to name the cognitive systems with coordinating responsibilities in 6G. The envisioned cognitive coordinator leverages AI-driven decision mechanisms—exemplified by models such as BERT—to interpret and manage complex user intents by deploying auxiliary network functions (NFs), enforcing policy rules, and selecting appropriate quality index parameters. In practical terms, this means that the cognitive system must resist adversarial attacks and continually adapt to evolving network conditions while ensuring that safety, reliability, and service quality are not compromised. The quantitative assessment of these characteristics through rigorous experiments and trial implementations is intended to guide future enhancements in secure and trustworthy AI coordination within 6G networks.
Although adversarial attacks in NLP have been extensively studied, their potential implications within 6G cognitive systems—especially those responsible for trust-based orchestration—remain largely unexplored. This paper uniquely assesses the robustness of BERT-based cognitive coordinators in handling adversarial user-intents, addressing a critical gap in the deployment of secure AI coordination frameworks for 6G. A variety of adversarial black-box attacks are defined and executed in this study without direct access to the model weights. By employing diverse perturbation strategies, the robustness of the cognitive coordinator is rigorously tested. The evaluation metrics include score deviation, regression error, and perturbation coherence, with a particular focus on identifying class-specific vulnerabilities. This comprehensive assessment aims to provide insights into the resilience of the cognitive coordination in 6G systems.
As a representative case, the paradigm of trustworthiness provision is examined in this paper, as a popular task that a cognitive coordinator will be asked to manage in a 6G system. In this case, the user-intents are classified into five classes (trustworthiness dimensions): Security, Safety, Privacy, Resilience, and Reliability, based on which actions both at the control and data planes will be mandated.
The rest of this paper is organized as follows: Section 2 provides some definitions of cognitive coordinators in 6G networks and a background of adversarial robustness in different domains. Then, Section 3 and Section 4 are devoted to analyzing the different cognitive coordinator models, adversarial attacks, and the evaluation process. Finally, the main conclusions are drawn in Section 5.

2. Materials and Methods

2.1. Related Work in the Cognitive Coordinator Paradigm for Trustworthiness Provision

The mental action or process of learning information and understanding through experience and the senses is characterized as cognition. An autonomous system, such as the Cognitive Coordination component for 6G trustworthiness, is a technical application of cognition since it is designed to perform the operational tasks of understanding through experience and perception.
In recent research initiatives for user-centric 6G networks [4], the Cognitive Coordinator serves as the central mechanism for ensuring a specific property at the user or the system domain, such as user trust and system trustworthiness (Figure 1). This paradigm facilitates a dynamic interaction loop where trust semantics initiated by users as intents are processed to ensure robust network operations aligned with user expectations and system integrity.
The starting point of the process is the user, whose trust requirements are pivotal for deciding the system’s response. The user communicates their trust expectations via an AI chatbot [5], which classifies them into the five trust dimensions: Security, Safety, Privacy, Resilience, and Reliability. This mapped output is defined as the trust semantics.
Once the trust semantics are defined, they are passed to the Cognitive Coordinator, a central AI-assisted component that includes a BERT-based regression model. This model plays a critical role in translating the trust semantics into a desired Level of Trustworthiness (LoTw) and then into the system’s transition actions related to the five trustworthiness dimensions.
The cognitive coordinator does more than calculate scores; it dynamically decides on the best action to align the network’s behavior with the calculated trustworthiness level. This might involve adjusting network configurations, enhancing security protocols, or reallocating resources. The decision process takes into account the potential impacts of each action, ensuring that decisions deliver trust without compromising the efficiency or performance of the network.
Referred to as the Cognitive Coordinator model, this regression-based system ensures scalability and adaptability, delivering a robust mechanism for preliminary trustworthiness estimation. In this paper, we evaluate the robustness of the Cognitive Coordinator model—the core computational element of the SAFE-6G framework—to validate its effectiveness and reliability across varying operational conditions.
In the light of the rapidly evolving landscape of trustworthy AI [6,7], our work makes several novel contributions:
  • Cognitive Coordinator Framework: We propose a BERT-based multi-head regression model that quantifies trustworthiness across five critical dimensions—Security, Safety, Privacy, Resilience, and Reliability—and leverages these predictions to autonomously drive network configuration decisions in 6G environments.
  • Adversarial Robustness Evaluation: We introduce a comprehensive adversarial robustness testing framework employing three distinct black-box attack strategies—TextFooler, BERT-Attack, and Probability Weighted Word Saliency (PWWS)—to systematically assess the resilience of transformer-based cognitive coordinators. This evaluation extends traditional adversarial analyses, typically applied in image or NLP domains, to the context of network orchestration.
  • Comparative Transformer Analysis: By experimentally comparing multiple transformer architectures (BERT-uncased, RoBERTa-Base, ALBERT-Base, and ELECTRA-Base) under adversarial perturbations, our study provides new insights into the interplay between architectural characteristics and adversarial vulnerability, thereby informing the design of more robust and trustworthy AI systems for 6G networks.
  • Bridging Trustworthiness Domains: Our approach both aligns with and extends the current understanding of trustworthy AI [6,7] by embedding trust quantification and adversarial defense mechanisms within the operational context of 6G network management.
In addition to the textual adversarial challenges explored in this work, recent investigations have underscored that adversarial vulnerabilities extend to network and graph-based systems, which are especially critical for 6G applications. For example, research on adversarial attacks in graph neural networks [8,9] demonstrates that even minor perturbations in network topology or node features can significantly impair model performance. Furthermore, studies focusing on adversarial threats in network-assisted IoT systems [10] have illustrated that attacks on network-level data can disrupt key operations such as spectrum sensing and resource allocation. These findings underscore the importance of adopting a holistic security approach that encompasses both text-based and graph-based adversarial threats. While the current study focuses on the text domain intrinsic to user-intent inputs in cognitive coordinators, our proposed security framework and evaluation metrics are designed with an awareness of the broader adversarial landscape in 6G, thus paving the way for future integration of network and graph adversarial defenses.
These contributions collectively advance the state-of-the-art in integrating cognitive coordination with secure network orchestration, offering promising avenues for future research and implementation in real-world 6G scenarios.

2.2. Related Work in Adversarial Robustness

Neural networks, despite their impressive performance across various domains, such as image and speech recognition, exhibit vulnerabilities that challenge their robustness and reliability. One notable vulnerability is their susceptibility to adversarial attacks. These attacks exploit the inherent discontinuities in neural networks, allowing small, imperceptible perturbations in input data to cause significant changes in output predictions. The phenomenon of adversarial attacks first started in the domain of image processing [11], where it is demonstrated that by introducing imperceptible perturbations to images, neural networks could be easily misled into making mistakes in predictions with high confidence. This insight initiated extensive research into the robustness of neural networks and the methodologies for crafting adversarial examples. The visual aspect of these attacks was especially notable, as the altered images looked identical to the originals to human viewers.
Building on the foundational work in image-based attacks [12,13], researchers extended their exploration to audio processing. Audio adversarial attacks presented unique challenges due to the temporal and frequency-domain characteristics of audio signals. By introducing subtle perturbations to waveforms or spectral features, such as Mel-Frequency Cepstral Coefficients (MFCCs), adversarial examples could effectively fool speaker recognition systems or speech-to-text models [14]. These attacks demonstrated that vulnerabilities in neural networks were not limited to static inputs like images but also extended to dynamic, time-dependent data.
More recently, adversarial research has shifted focus to the text domain [15], which forms the core of this paper. Text-based adversarial attacks are particularly challenging due to the discrete and structured nature of language. Techniques such as synonym replacement, paraphrasing, and insertion of contextually plausible errors exploit the sensitivity of natural language processing models to slight variations in input, often leading to significant misclassifications [16].
Adversarial attacks have expanded beyond particular data fields to encompass communication networks, including 5G systems and upcoming network technologies. As machine learning becomes integral to critical operations in these networks, such as spectrum sensing, network slicing, and resource allocation [17], adversarial machine learning introduces new vulnerabilities. For instance, adversaries can exploit machine learning models used for spectrum sharing in 5G to disrupt communications or mislead the system into allocating resources inefficiently [18]. These attacks take advantage of the inherent openness of wireless environments, where adversaries can observe and manipulate both data and control signals, creating a novel attack surface.
This evolving landscape of adversarial threats underscores the importance of developing robust defense mechanisms across all domains, from images and audio to text and communication networks. This paper builds on this foundation by investigating adversarial vulnerabilities in text-based applications within the communication networks domain, such as the cognitive coordinator that receives text-based user-intent input and proposes strategies to mitigate such attacks.

3. Methodology of Assessing the Robustness of Cognitive Coordinator

In this section, we describe the methodology employed to develop and evaluate the robustness of the Cognitive Coordinator model and the adversarial attack strategies applied. The core of the system is a BERT-based five-head regression model designed to independently predict scores related to trustworthiness for five trust-related classes: Reliability, Privacy, Security, Resilience, and Safety.

3.1. Dataset Creation Principles

The dataset [19] was created with the input of five domain experts, each specializing in one of the trust-related classes: Reliability, Privacy, Security, Resilience, and Safety. It consists of annotated phrases and sentences for each class with corresponding trustworthiness scores, which a user could ask for.
Data augmentation techniques were applied to enhance the dataset’s diversity and robustness. These augmentations aim to simulate real-world variations and improve the model’s generalization capabilities:
  • Synonym Replacement: A subset of words in the text was replaced with synonyms derived from WordNet [20] to preserve the original context while creating variability.
  • Normalization: Scores were normalized to a range of 0 to 1 for consistency and to facilitate regression-based learning.
Each sample in the dataset was tokenized using the BERT tokenizer, with text sequences padded or truncated to a maximum length of 128 tokens. The data was then split into training (70%), validation (15%), and test (15%) sets to ensure balanced evaluation.
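The normalization and splitting steps can be sketched as follows; the row format, the score values, and the fixed seed are illustrative assumptions, and the actual tokenization step would use a Hugging Face BERT tokenizer as described above.

```python
import random

def normalize(scores):
    """Min-max normalize raw expert scores to the [0, 1] range."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def split_dataset(samples, train=0.70, val=0.15, seed=42):
    """Shuffle and split samples into 70/15/15 train/validation/test partitions."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

# Illustrative rows: (user-intent text, normalized trustworthiness score).
data = [(f"intent {i}", s) for i, s in enumerate(normalize(list(range(100))))]
train_set, val_set, test_set = split_dataset(data)
```

A fixed seed keeps the three partitions reproducible across training runs.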

3.2. Model Architecture of Cognitive Coordinator

The architecture of our Cognitive Coordinator model builds upon the widely used BERT transformer framework. It uses a shared BERT encoder alongside five distinct regression heads, each designed to predict scores for specific trust-related dimensions. The key components of the architecture are as follows:
  • BERT Encoder: A pretrained BERT model serves as the shared feature extractor. It processes input text into a pooled output vector, the size of which corresponds to the hidden dimension of the BERT model.
  • Independent Regression Heads: These are fully connected layers, each tailored for a specific trust-related dimension: Reliability, Privacy, Security, Resilience, and Safety. This modular approach allows each head to specialize in its respective task.
  • Forward Pass: The BERT encoder generates a pooled representation for every input, which is then passed to the appropriate regression head. This head computes the trustworthiness score for the corresponding dimension.
This architecture strikes a balance between shared learning across dimensions and specialized prediction, promoting both efficiency and accuracy.
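The shared-encoder, multi-head layout described above can be sketched in PyTorch as follows; the five heads follow the paper’s description, while the `nn.Identity` encoder stand-in, the `hidden_size` value, and the sigmoid output squashing (matching the normalized [0, 1] score range) are illustrative assumptions rather than the authors’ exact implementation.

```python
import torch
import torch.nn as nn

DIMENSIONS = ["Reliability", "Privacy", "Security", "Resilience", "Safety"]

class CognitiveCoordinator(nn.Module):
    """Shared encoder feeding five independent regression heads."""

    def __init__(self, encoder, hidden_size=768):
        super().__init__()
        # Shared feature extractor (the pretrained BERT pooled output in the
        # paper; any module returning a (batch, hidden_size) tensor works here).
        self.encoder = encoder
        # One independent linear regression head per trustworthiness dimension.
        self.heads = nn.ModuleDict({d: nn.Linear(hidden_size, 1) for d in DIMENSIONS})

    def forward(self, **inputs):
        pooled = self.encoder(**inputs)  # (batch, hidden_size)
        # Sigmoid keeps each predicted score in the normalized [0, 1] range.
        return {d: torch.sigmoid(h(pooled)).squeeze(-1)
                for d, h in self.heads.items()}

# Illustrative usage with an identity stand-in for the BERT encoder:
model = CognitiveCoordinator(nn.Identity(), hidden_size=8)
scores = model(input=torch.zeros(2, 8))
```

Keeping the heads in an `nn.ModuleDict` lets each dimension specialize while all heads share the encoder’s gradients during fine-tuning.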

3.3. Model Training

The model was fine-tuned using a systematic grid search to identify optimal hyperparameters, including the learning rate, batch size, number of epochs, and weight decay. Additionally, the grid search evaluated multiple transformer-based pre-trained models to benchmark their performance in trustworthiness quantification. The models considered in the search included:
  • BERT-Base-Uncased: A widely adopted general-purpose model with 12 layers and 110 M parameters.
  • RoBERTa-Base: A robustly optimized variant of BERT, trained on a larger dataset with enhanced pre-training techniques.
  • ALBERT-Base (v2): A lightweight version of BERT that reduces model size via factorized embeddings and parameter sharing, while retaining high accuracy.
  • ELECTRA-Base: A model that trains with a discriminator-generator framework, offering faster pre-training and strong downstream performance.
These models were selected for their distinct approaches to the challenges of language modeling. Their selection ensures that different aspects of natural language processing (NLP), such as computational efficiency, model size, and training methodology, are adequately explored and addressed.
The final grid search hyperparameter values that yielded the best performance were:
  • Learning rate = 2 × 10⁻⁵
  • Batch size = 16
  • Epochs = 5
  • Weight decay = 0.01
The training utilized the AdamW optimizer paired with a linear learning rate scheduler. The mean squared error (MSE) was employed as the loss function, and early stopping based on validation loss was implemented to mitigate overfitting.
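The early-stopping rule on validation loss can be sketched as follows; the patience value and the loss trace are illustrative, not the values used in the experiments.

```python
class EarlyStopping:
    """Stop training when validation loss fails to improve for `patience` epochs."""

    def __init__(self, patience=2, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.stale_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience

stopper = EarlyStopping(patience=2)
losses = [0.40, 0.31, 0.30, 0.32, 0.33]  # illustrative validation loss per epoch
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

Here training halts at epoch 4, after the loss has failed to improve on its best value (0.30) for two consecutive epochs.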

3.4. Adversarial Attack Setup

To assess the robustness of the model, three popular adversarial black-box attacks were implemented. These attacks subtly perturb the input text while retaining its surface coherence. The impact of such perturbations on model outputs is demonstrated in Figure 2, which showcases how small changes result in significant deviations in trustworthiness scores. The attacks aimed to evaluate the model’s sensitivity to input variations and its ability to maintain accurate trust assessments.

3.4.1. The TextFooler Attack

The TextFooler attack [21] was employed as the primary adversarial strategy. This attack operates by identifying and perturbing the most salient words in the input, which have the highest impact on the model’s predictions. The attack framework includes the following steps:
  • Salient Word Identification: Identifies the most impactful words by masking and analyzing the drop in model confidence.
  • Synonym Replacement: Replaces identified words with synonyms using WordNet v3.1, ensuring grammatical consistency.
  • Semantic and Perturbation Constraints: Maintains semantic similarity and minimizes the proportion of altered words to preserve input coherence.
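The salient-word identification step can be sketched as follows; the keyword-counting `toy_score` function is a hypothetical stand-in for the cognitive coordinator’s trustworthiness prediction, used only to make the masking-and-ranking logic concrete.

```python
def word_saliency(tokens, score_fn, mask="[MASK]"):
    """Rank words by how much the predicted score drops when each is masked."""
    base = score_fn(tokens)
    drops = []
    for i, word in enumerate(tokens):
        masked = tokens[:i] + [mask] + tokens[i + 1:]
        drops.append((base - score_fn(masked), word, i))
    # Highest-impact words come first: these are the attack's first targets.
    return sorted(drops, reverse=True)

# Hypothetical stand-in scorer: fraction of security-related keywords.
KEYWORDS = {"encrypted", "secure"}
def toy_score(tokens):
    return sum(t in KEYWORDS for t in tokens) / len(tokens)

ranked = word_saliency("keep my traffic fully encrypted".split(), toy_score)
```

With this toy scorer, masking "encrypted" causes the largest score drop, so TextFooler would target that word first for synonym replacement.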

3.4.2. The BERT-Attack for Textual Entailment

To complement TextFooler, we employed the BERT-Attack for textual entailment (BAE) [22], which utilizes a masked language model to replace or insert contextually appropriate words. This attack uses BERT’s ability to predict masked tokens in the text, allowing for more sophisticated perturbations. The framework for BAE includes the following steps:
  • Salient Word Identification: Similar to TextFooler, the BAE identifies salient words based on their contribution to the model’s predictions. Words are prioritized based on the drop in the model’s confidence when masked.
  • Masked Word Replacement: BERT’s masked language model is used to predict and replace words contextually.
  • Semantic and Perturbation Constraints: Perturbations are constrained to maintain semantic coherence and minimize the number of modified tokens, ensuring that adversarial examples remain close to the original text in meaning.

3.4.3. The Probability Weighted Word Saliency Attack

To further evaluate the model’s robustness, we implemented the PWWS (Probability Weighted Word Saliency) attack [23], which employs a saliency-based strategy to target the most impactful words in the input. PWWS utilizes a probability-weighted approach to rank word importance, making it particularly effective for identifying and replacing critical words in a sentence. The framework for PWWS includes the following steps:
  • Saliency Score Computation: Assigns scores to words based on their impact on model predictions.
  • Synonym Replacement: Focuses on high-saliency words, replacing them with contextually appropriate alternatives.
  • Iterative Perturbation: The attack proceeds iteratively, replacing words until a significant deviation in the model’s prediction is achieved or all high-saliency words have been perturbed.
  • Semantic and Perturbation Constraints: Like TextFooler and BAE, PWWS enforces constraints to preserve the original text’s meaning and ensure coherence. The number of modifications is minimized to create adversarial examples that remain realistic and meaningful.
PWWS differs from TextFooler by using probability-weighted saliency metrics, providing a more granular ranking of word importance. This approach often leads to fewer and more targeted perturbations, making it an effective complement to the other attacks.
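The probability-weighted ranking that distinguishes PWWS can be sketched as follows; the saliency values and per-word prediction changes are illustrative inputs, and the softmax weighting follows the general PWWS formulation rather than the exact implementation used in the experiments.

```python
import math

def pwws_ranking(saliencies, best_prob_changes):
    """Order words for perturbation using PWWS-style weighted saliency.

    For word i the selection score is softmax(saliency)_i * dP_i, where
    dP_i is the largest prediction change any synonym of word i causes.
    """
    exps = [math.exp(s) for s in saliencies]
    total = sum(exps)
    softmax = [e / total for e in exps]
    combined = [p * d for p, d in zip(softmax, best_prob_changes)]
    # Words are perturbed in decreasing order of the combined score.
    return sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)

# Three words: moderate saliency but a very effective synonym wins out.
order = pwws_ranking([0.1, 0.9, 0.3], [0.05, 0.02, 0.40])
```

Note how the third word ranks first despite its middling saliency, because its best synonym swap moves the prediction the most; this weighting is what tends to make PWWS perturbations fewer and more targeted.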
In the next section, the three different attacks are compared, while metrics, such as score deviation, regression error, and perturbation coherence, are presented to quantify the impact of each adversarial attack.

4. Experimental Results of the Cognitive Coordinator Model Under Adversarial Perturbations

This section evaluates the robustness of the Cognitive Coordinator model under adversarial perturbations, examining the three attacking strategies that have been explained in the previous section: TextFooler, BAE, and PWWS. The analysis employs key metrics—Score Deviation, Mean Squared Error (MSE), Perturbation Coherence, and Success Rate—to comprehensively assess the model’s vulnerabilities and resilience against these attacks. These metrics demonstrate the model’s capacity to maintain reliable trustworthiness assessments under adversarial conditions.
More specifically, the following KPIs were used for quantifying the model’s robustness:
  • Score Deviation: Measures the average change in the predicted score between the original and adversarial examples, reflecting the model’s sensitivity to perturbations:
SD = |S_orig − S_adv|
  • MSE: Quantifies the overall error introduced by adversarial attacks.
MSE = (1/N) · Σ_{i=1..N} (S_orig,i − S_adv,i)²
  • Perturbation Coherence: Quantifies the semantic similarity between original and adversarial texts using cosine similarity between their sentence embeddings (generated via a pre-trained Sentence-BERT model). Scores range from 0 (dissimilar) to 1 (identical).
  • Success Rate: Represents the percentage of adversarial examples that caused a deviation larger than a predefined threshold τ. In our study, we used τ = 0.3:
SR = (# of samples where SD > τ) / (total samples) × 100%
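These metrics can be computed directly from paired original/adversarial predictions, as sketched below; the score values are illustrative, and the `coherence` helper assumes the two embedding vectors have already been produced (e.g., by a Sentence-BERT model).

```python
def score_deviation(orig, adv):
    """SD: absolute change in the predicted score for one sample."""
    return abs(orig - adv)

def mse(orig_scores, adv_scores):
    """MSE: mean squared error introduced by the adversarial attack."""
    pairs = list(zip(orig_scores, adv_scores))
    return sum((o - a) ** 2 for o, a in pairs) / len(pairs)

def success_rate(orig_scores, adv_scores, tau=0.3):
    """SR: percentage of samples whose score deviation exceeds tau."""
    hits = sum(score_deviation(o, a) > tau
               for o, a in zip(orig_scores, adv_scores))
    return 100.0 * hits / len(orig_scores)

def coherence(u, v):
    """Perturbation coherence: cosine similarity of two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = lambda x: sum(a * a for a in x) ** 0.5
    return dot / (norm(u) * norm(v))

# Illustrative original vs. adversarial predictions for four samples.
orig = [0.80, 0.60, 0.90, 0.50]
adv = [0.40, 0.55, 0.20, 0.45]
```

In this toy example two of the four samples deviate by more than τ = 0.3, giving a 50% attack success rate.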
The quantitative results for these metrics across all attacks are summarized in Table 1.
The results reveal several key insights into the performance of the Cognitive Coordinator model under adversarial attacks. Among the models evaluated, RoBERTa-Base exhibited the highest vulnerability, as evidenced by the high success rates across all attack strategies and by its higher MSE and Score Deviation values compared to the other models; this can be attributed to its training on larger and more diverse corpora, which potentially increases its sensitivity to minor perturbations. ALBERT’s architectural design, with parameter sharing and factorized embeddings, seems to contribute to its resilience against TextFooler, but its reduced model capacity might explain its lower robustness against BAE and PWWS. These findings suggest that architectural traits such as redundancy, pretraining corpus, and model sparsity play a pivotal role in adversarial resistance.
On the other hand, ALBERT-Base showed exceptional semantic resilience, achieving a Perturbation Coherence of 0.992 with TextFooler. Despite this strength, it exhibited reduced robustness against BAE and PWWS attacks, as seen in the higher Success Rates under these strategies. BERT-uncased emerged as the most balanced model, maintaining relatively low Score Deviations and MSE while preserving semantic coherence effectively, making it the most robust across different attack scenarios.
From an attack-specific perspective, TextFooler caused minimal semantic disruption while inducing significant score deviations, making it a suitable strategy for evaluating subtle vulnerabilities in the models. In contrast, BAE introduced more aggressive semantic changes, leading to a marked decrease in perturbation coherence. PWWS provided a balanced approach, combining moderate semantic preservation with effective adversarial impact, offering valuable insights into the practical resilience of the models.
Additionally, Table 2 provides qualitative insights by showcasing examples of original and adversarial texts.
Many adversarial inputs exploit lexical ambiguities or synonym replacements that change the perceived context. For instance, replacing ‘retention policy’ with ‘insurance policy’ triggers a semantic drift toward legal or privacy contexts. The model’s sensitivity to named-entity changes (e.g., ‘AI-driven’ to ‘Army Intelligence-driven’) also reveals over-reliance on key tokens. These shifts highlight a need for robust contextual grounding in model architecture.
The results underscore several critical observations:
  • Model Selection and Use Cases: The choice of the transformer model depends on the operational context. For environments requiring high semantic coherence, RoBERTa-Base and ELECTRA-Base are not preferable. In resource-constrained scenarios, ALBERT-Base offers an efficient alternative.
  • Attack-specific Performance: TextFooler exhibited minimal semantic disruption but caused notable score deviations, making it suitable for subtle adversarial testing. BAE, while more aggressive, significantly impacted coherence. PWWS balanced perturbation impact and coherence, offering practical insights into adversarial robustness.
In a real 6G context, a misclassified trust dimension due to adversarial input could result in incorrect system actions, such as enabling less secure network slices or failing to apply safety-critical configurations. These risks underline the importance of adversarial robustness in maintaining safe AI-driven orchestration within next-gen networks. In the next section, we discuss the implications of these results and outline potential paths for improving the robustness of a cognitive coordinator model in a 6G network.

5. Conclusions

In conclusion, this study evaluates various BERT-based cognitive coordinators for 6G networks, among which BERT-uncased emerges as the most balanced choice. It offers a robust defense against adversarial attacks while maintaining high levels of perturbation coherence, making it ideally suited for environments where maintaining semantic integrity is critical. On the other hand, RoBERTa-Base, despite its high sensitivity to adversarial manipulations, might be preferable in scenarios where higher vulnerability can be compensated by its superior performance in undisturbed conditions. Future research should focus on enhancing the robustness of cognitive coordinator models by integrating advanced adversarial defense techniques, such as adversarial training and input-level perturbation filtering. Additionally, incorporating lightweight NLP baselines (e.g., TF-IDF with logistic regression or FastText) under the same adversarial evaluation framework would offer deeper insights into whether these vulnerabilities are unique to transformer architectures. Finally, deploying and testing these models in real-world 6G scenarios will be essential to validate their effectiveness and adaptability in dynamic network environments.

Author Contributions

Conceptualization, I.A. and H.K.; methodology, I.A., H.K. and V.R.; validation, I.A., V.R. and G.P.-N.; formal analysis, I.A., H.K. and G.M.; investigation, I.A. and S.G.; resources, H.K. and V.R.; data curation, I.A. and V.R.; writing—original draft preparation, I.A. and H.K.; writing—review and editing, V.R., G.P.-N., S.G. and G.M.; visualization, S.G.; supervision, H.K. and G.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by (i) the SAFE-6G project that has received funding from the Smart Networks and Services Joint Undertaking (SNS JU) under the European Union’s Horizon Europe research and innovation program under Grant Agreement No. 101139031, (ii) the 6G-VERSUS project that has received funding from the SNS JU under the European Union’s Horizon Europe research and innovation program under Grant Agreement No. 101192633, (iii) the 6G-SANDBOX project that has received funding from the SNS JU under the European Union’s Horizon Europe research and innovation program under Grant Agreement No. 101096328, (iv) the 6G-UNITY project that has received funding from the SNS JU under the European Union’s Horizon Europe research and innovation program under Grant Agreement No. 101192650.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
6G      Sixth Generation
AI      Artificial Intelligence
BAE     BERT-based Adversarial Examples
BERT    Bidirectional Encoder Representations from Transformers
KPIs    Key Performance Indicators
LoTw    Level of Trustworthiness
ML      Machine Learning
MSE     Mean Squared Error
NLP     Natural Language Processing
PWWS    Probability Weighted Word Saliency

References

  1. Makropoulos, G.; Fragkos, D.; Koumaras, H.; Alonistioti, N.; Kaloxylos, A.; Setaki, F. Exploiting Core Openness as Native-AI Enabler for Optimised UAV Flight Path Selection. In IEEE Conference on Standards for Communications and Networking (CSCN); IEEE: Piscataway, NJ, USA, 2023. [Google Scholar]
  2. Tsolkas, D.; Koumaras, H.; Charismiadis, S.; Foteas, A. Artificial intelligence in 5G and beyond networks. In Applied Edge AI; Auerbach Publications: Boca Raton, FL, USA, 2022; pp. 73–103. [Google Scholar]
  3. Christopoulou, M.; Barmpounakis, S.; Koumaras, H.; Kaloxylos, A. Artificial Intelligence and Machine Learning as key enablers for V2X communications: A comprehensive survey. Veh. Commun. 2023, 39, 100569. [Google Scholar] [CrossRef]
  4. Gkatzios, N.; Koumaras, H.; Fragkos, D.; Koumaras, V. A Proof of Concept Implementation of an AI-assisted User-Centric 6G Network. In Proceedings of the Joint European Conference on Networks and Communications & 6G Summit (EUCNC), Antwerp, Belgium, 3–6 June 2024; pp. 907–912. [Google Scholar] [CrossRef]
  5. Gkatzios, N.; Vryonis, N.; Fragkos, C.; Sakkas, C.; Mavrikakis, V.; Koumaras, V.; Makropoulos, G.; Fragkos, D.; Koumaras, H. A chatbot assistant for optimizing the fault detection and diagnostics of industry 4.0 equipment in the 6g era. In IEEE Conference on Standards for Communications and Networking (CSCN); IEEE: Piscataway, NJ, USA, 2023. [Google Scholar]
  6. Zhang, H.; Wu, B.; Yuan, X.; Pan, S.; Tong, H.; Pei, J. Trustworthy Graph Neural Networks: Aspects, Methods and Trends. Proc. IEEE 2024, in press. [Google Scholar] [CrossRef]
  7. Liu, H.; Wang, Y.; Fan, W.; Liu, X.; Li, Y.; Jain, S.; Jain, A.K.; Tang, J. Trustworthy AI: A Computational Perspective. ACM Trans. Intell. Syst. Technol. 2022, 14, 1–59. [Google Scholar] [CrossRef]
  8. Sun, L.; Dou, Y.; Yang, C.; Zhang, K.; Wang, J.; Yu, P.S. Adversarial Attack and Defense on Graph Data: A Survey. IEEE Trans. Knowl. Data Eng. 2022, 35, 7693–7711. [Google Scholar] [CrossRef]
  9. Zhai, Z.; Li, P.; Feng, S. State of the Art on Adversarial Attacks and Defenses in Graphs. Neural Comput. Appl. 2023, 35, 18851–18872. [Google Scholar] [CrossRef]
  10. Son, B.D.; Hoa, N.T.; Chien, T.V.; Khalid, W.; Ferrag, M.A.; Choi, W.; Debbah, M. Adversarial Attacks and Defenses in 6G Network-Assisted IoT Systems. IEEE Internet Things J. 2024, 11, 19168–19187. [Google Scholar] [CrossRef]
  11. Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; Fergus, R. Intriguing Properties of Neural Networks. In Proceedings of the 2nd International Conference on Learning Representations (ICLR 2014), Banff, AB, Canada, 14–16 April 2014; Available online: https://arxiv.org/abs/1312.6199 (accessed on 25 May 2025).
  12. Khamaiseh, S.Y.; Bagagem, D.; Al-Alaj, A.; Mancino, M.; Alomari, H.W. Adversarial Deep Learning: A Survey on Adversarial Attacks and Defense Mechanisms on Image Classification. IEEE Access 2022, 10, 102266–102291. [Google Scholar] [CrossRef]
  13. Zeng, X.; Liu, C.; Wang, Y.-S.; Qiu, W.; Xie, L.; Tai, Y.-W.; Tang, C.-K.; Yuille, A.L. Adversarial Attacks Beyond the Image Space. In Proceedings of the Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 4297–4306. [Google Scholar] [CrossRef]
  14. Tan, H.; Wang, L.; Zhang, H.; Zhang, J.; Shafiq, M.; Gu, Z. Adversarial Attack and Defense Strategies of Speaker Recognition Systems: A Survey. Electronics 2022, 11, 2183. [Google Scholar] [CrossRef]
  15. Zhang, W.E.; Sheng, Q.Z.; Alhazmi, A.; Li, C. Adversarial Attacks on Deep-learning Models in Natural Language Processing: A Survey. ACM Trans. Intell. Syst. Technol. (TIST) 2020, 11, 3. [Google Scholar] [CrossRef]
  16. Han, X.; Zhang, Y.; Wang, W.; Wang, B. Text Adversarial Attacks and Defenses: Issues, Taxonomy, and Perspectives. Secur. Commun. Netw. 2022, 1, 2022. [Google Scholar] [CrossRef]
  17. Alexandropoulos, I.; Rentoula, V.; Fragkos, D.; Gkatzios, N.; Koumaras, H. An AI-assisted User-Intent 6G System for Dynamic Throughput Provision. In Proceedings of the IEEE International Workshop on Computer Aided Modeling and Design of Communication Links and Networks (CAMAD 2024), Athens, Greece, 21–23 October 2024. [Google Scholar]
  18. Sagduyu, Y.E.; Erpek, T.; Shi, Y. Adversarial Machine Learning for 5G Communications Security; John Wiley & Sons, Ltd.: Hoboken, NJ, USA, 2021. [Google Scholar]
  19. Alexandropoulos, I.; Koumaras, H. Cognitive Coordinator Dataset of User Intent for Trustworthiness on Five Principles (Reliability, Privacy, Security, Resilience, and Safety). Zenodo. Available online: https://zenodo.org/records/15512671 (accessed on 2 June 2025).
  20. Princeton University. WordNet. Available online: https://wordnet.princeton.edu/ (accessed on 2 June 2025).
  21. Jin, D.; Jin, Z.; Zhou, J.T.; Szolovits, P. Is BERT Really Robust? Natural Language Attack on Text Classification. arXiv 2019, arXiv:1907.11932. [Google Scholar]
  22. Garg, S.; Ramakrishnan, G. BAE: BERT-based Adversarial Examples for Text Classification. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP); Association for Computational Linguistics: Stroudsburg, PA, USA, 2020. [Google Scholar]
  23. Ren, S.; Deng, Y.; He, K.; Che, W. Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019. [Google Scholar]
Figure 1. AI-assisted cognitive coordination of 6G trustworthiness.
Figure 2. Example of a slight input change that completely alters the calculation result.
Table 1. Adversarial Attack Results.

Attack Type | Model        | Score Deviation | MSE   | Perturbation Coherence | Success Rate
TextFooler  | BERT-uncased | 0.102           | 0.019 | 0.896                  | 14.79%
TextFooler  | RoBERTa      | 0.208           | 0.079 | 0.985                  | 40.14%
TextFooler  | Albert       | 0.125           | 0.024 | 0.992                  | 16.19%
TextFooler  | Electra      | 0.165           | 0.038 | 0.982                  | 33.09%
BAE         | BERT-uncased | 0.159           | 0.044 | 0.617                  | 33.10%
BAE         | RoBERTa      | 0.208           | 0.081 | 0.947                  | 39.43%
BAE         | Albert       | 0.157           | 0.043 | 0.884                  | 24.64%
BAE         | Electra      | 0.171           | 0.041 | 0.958                  | 35.21%
PWWS        | BERT-uncased | 0.127           | 0.028 | 0.657                  | 22.34%
PWWS        | RoBERTa      | 0.228           | 0.097 | 0.802                  | 43.66%
PWWS        | Albert       | 0.186           | 0.063 | 0.735                  | 28.87%
PWWS        | Electra      | 0.192           | 0.051 | 0.812                  | 43.66%
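The aggregate metrics reported above can be computed from paired clean/adversarial trust scores along the following lines. This is a hedged sketch under assumptions: the success criterion (absolute score shift above 0.2) is an illustrative threshold, not necessarily the paper's exact definition, and perturbation coherence (a semantic-similarity measure between original and perturbed text) is omitted here.

```python
# Sketch of Table 1-style aggregate metrics from (original, adversarial)
# score pairs. The success threshold is an assumed value for illustration.

def attack_metrics(pairs, success_threshold=0.2):
    """Return mean absolute score deviation, MSE, and attack success rate."""
    deviations = [abs(orig - adv) for orig, adv in pairs]
    score_deviation = sum(deviations) / len(deviations)          # mean |delta|
    mse = sum((o - a) ** 2 for o, a in pairs) / len(pairs)       # mean squared error
    success_rate = sum(d > success_threshold for d in deviations) / len(deviations)
    return score_deviation, mse, success_rate

# Score pairs taken from the first three qualitative examples in Table 2.
pairs = [(0.205, 0.679), (0.705, 0.506), (0.855, 0.364)]
dev, mse, rate = attack_metrics(pairs)
```

Under this definition, two of the three example perturbations shift the score by more than 0.2 and would count as successful attacks.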
Table 2. Qualitative Insights of Adversarial Attacks.

Original Text | Adversarial Text | Original Score | Adversarial Score
Reliable application | Dependable practical application | 0.205 | 0.679
Data retention policy | Information on the insurance policy | 0.705 | 0.506
Full device trust validation | Mobile device trust validation | 0.855 | 0.364
AI-driven capacity planning | Army Intelligence—drive capacity planning | 0.84 | 0.512
Protect against unauthorized modifications to the network infrastructure | Protect against unauthorized modifications to the network framework | 0.82 | 0.54
Conduct safety reviews for digital transformation initiatives | Carry out safety reviews for digital transformation initiatives | 0.705 | 0.49
Provide secure and reliable infrastructure solutions | Provide protected and dependable framework solutions | 0.755 | 0.511
Implement safe network segmentation strategies | Apply safe system segmentation strategies | 0.745 | 0.444
Create an ontology of intent for the resilience function | Generate an ontology of intent for robustness operation | 0.717 | 0.408
Perform autonomous resource orchestration | Execute autonomous asset orchestration | 0.878 | 0.548