A GNN-Based Log Anomaly Detection Framework with Prompt Learning for Edge Computing

Hu, Xianlang; Feng, Guangsheng; Huang, Xinling; Kong, Xiangying; Lv, Hongwu

doi:10.3390/computers15050273

by Xianlang Hu ¹,², Guangsheng Feng ¹,*, Xinling Huang ¹, Xiangying Kong ² and Hongwu Lv ¹

¹ College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China

² Jiangsu Institute of Automation, Lianyungang 222006, China

* Author to whom correspondence should be addressed.

Computers 2026, 15(5), 273; https://doi.org/10.3390/computers15050273

Submission received: 20 March 2026 / Revised: 17 April 2026 / Accepted: 20 April 2026 / Published: 24 April 2026

(This article belongs to the Special Issue Mobile Fog and Edge Computing)

Abstract

System logs are critical for analyzing the operational status and abnormal behavior of highly distributed and heterogeneous edge computing nodes. In edge environments, logs exhibit cross-event and cross-field structural interactions, making it difficult to uncover potential anomaly patterns from isolated events. Moreover, sparse annotations and varying log formats limit the effectiveness of existing methods. To address these challenges, we propose a graph neural network (GNN) anomaly detection framework with prompt learning. It leverages few-shot prompt learning to automatically extract key fields and constructs a weighted directed graph that jointly models semantic embeddings and temporal dependencies, representing the structural interactions and semantic associations across events and fields. Furthermore, the framework performs graph-level anomaly detection by jointly optimizing the graph representation learning and classification objectives within an enhanced one-class directed graph convolutional network, enabling effective identification of global structural anomaly patterns in log graphs. Experimental results demonstrate that the proposed method achieves an average F1-score of 93.3%, surpassing the current state-of-the-art (SOTA) methods by 6.93%.

Keywords:

edge computing; log anomaly detection; graph neural networks; few-shot prompt learning

1. Introduction

Edge computing has emerged as a critical infrastructure underpinning low-latency and highly reliable services in an era driven by large-scale distributed applications, the Internet of Things, and intelligent sensing systems. It plays an indispensable role in key sectors such as smart manufacturing, intelligent transportation, and mobile healthcare [1]. However, the large-scale distributed deployment of edge nodes and the coexistence of heterogeneous operating environments significantly increase the complexity of system operation and maintenance, exposing these platforms to unprecedented security risks [2]. System logs, the primary source for recording service calls, resource scheduling, and abnormal behaviors, have become the critical basis for perceiving system operational status, conducting in-depth security audits, and tracing faults [3]. Nevertheless, given limited computational resources and the structural complexity of cross-node interactions, achieving accurate and real-time anomaly detection over massive and heterogeneous log streams remains a non-trivial and pressing problem.

In this work, we consider an edge-assisted log monitoring scenario with delay tolerance, where logs generated by distributed nodes are periodically collected and processed at nearby edge servers using sliding-window batch analysis. Such tasks prioritize robust anomaly detection in resource-constrained and heterogeneous environments. To illustrate, in maritime edge systems, multiple unmanned surface vehicles (USVs) operate collaboratively under shore or ship-based edge servers, periodically offloading logs reflecting navigation, communication, and task execution for system monitoring rather than participating in latency-critical control loops. In these circumstances, anomaly detection is performed on aggregated log sequences, and the key requirement is to balance detection accuracy and computational efficiency, enabling practical deployment in resource-constrained edge environments. However, differences in mission types, software stacks, and communication protocols result in significant heterogeneity at the event and field levels. Additionally, logs inherently present semi-structured complexity, temporal dependencies, and rich semantics, which together pose substantial challenges for anomaly detection and call for methods capable of simultaneously modeling structural and semantic patterns in resource-constrained edge environments.

Existing anomaly detection techniques show obvious limitations in handling such complexity. Traditional rule-based methods [4,5] struggle to keep up with varying and heterogeneous log formats and are costly to maintain. Statistical models [6,7] have limited ability to model non-linear cross-service dependencies and fail to capture implicit relationships among semantic fields. In recent years, sequence-based deep learning methods [8,9] have demonstrated notable performance in cloud environments. However, they mainly focus on the linear ordering of logs and often overlook the structural relationships among events and their associated fields, so the complete structural dependencies among events are not accounted for. Therefore, existing research still faces two main challenges:

Topological Dependencies of Event-Fields: In edge computing environments, anomalies often remain concealed within structural interactions spanning multiple events and fields. The field associations between different events form complex topological structures that reveal underlying anomaly patterns. Because such anomalies are not readily apparent within individual events or isolated patterns, methods that overlook the interconnections between them suffer from missed detections or false positives.
Label Deficiency in Heterogeneous Environments: Edge logs are voluminous, yet genuine labeled samples remain extremely scarce. This stems primarily from the fragmentation of edge-side logs and the diversity of devices, which make it exceptionally challenging to acquire large-scale, high-quality labeled log samples. Traditional supervised learning methods rely on extensive manual labeling and struggle to adapt to the diverse log formats and sparse anomaly samples characteristic of edge environments, which severely limits their generalization capabilities.

To address the above issues, we propose a log anomaly detection framework for edge computing scenarios, which uniformly models the structural dependencies and semantic associations between log events and key fields. The anomaly detector is trained in an unsupervised manner on normal log graphs, while prompt-based few-shot learning is employed for field extraction with limited annotated samples. Specifically, we first design a prompt-based learning module for few-shot field extraction, which automatically identifies key log fields under minimal annotation conditions. Second, we transform log events and extracted field features into attribute-enhanced weighted directed graphs to capture structural interactions and semantic relationships across events and fields. Finally, by jointly optimizing graph representation learning and anomaly detection objectives, we achieve graph-level anomaly detection using a one-class directed graph inception convolutional network in an unsupervised manner, effectively identifying abnormal behaviors hidden within complex log interaction patterns. We evaluate the proposed method against state-of-the-art log anomaly detection approaches. Experimental results on three benchmark datasets show that the average F1-score of our method is 93.3%, which is 6.93% higher than the previous state-of-the-art (SOTA) method.

Overall, the main contributions of this work are summarized as follows:

We propose a graph-based anomaly detection model for edge computing. Log events and fields are uniformly incorporated into the attribute-enhanced weighted directed log graph, and the nodes are embedded with contextual semantic information. It significantly improves the ability to identify complex structural anomaly patterns.
We propose a prompt-based few-shot field extraction module that formulates log field identification as a prompt-driven sequence generation task. By leveraging the semantic capabilities of a pre-trained language model, it achieves precise extraction of key fields while substantially reducing reliance on handcrafted rules and large-scale annotated data.
We formalize log anomaly detection as a graph-level anomaly detection problem to achieve end-to-end collaborative optimization of unsupervised graph representation learning and anomaly detection. It not only identifies isolated event anomalies but also effectively captures structural deviations across events and fields, enhancing the accuracy of anomaly detection tasks.

The remainder of this paper is organized as follows. Section 2 reviews the related work in this area. Section 3 and Section 4 present the problem formulation and the design of the proposed log anomaly detection framework, respectively. Section 5 reports the experimental results along with discussions on relevant issues. Finally, Section 6 concludes this study.

2. Related Work

2.1. Probability-Based Log Detection

One common line of research is probabilistic methods [10,11,12], which leverage the statistical characteristics of log events for anomaly detection. They typically identify anomalies by extracting count features, frequency distributions, or statistical indicators from log data. Early research used numerical vectors to represent log sequences, generated by counting the number of various log events in the sequence. Xu et al. [10] employed principal component analysis (PCA) to detect log anomalies; Lin et al. [11] utilized hierarchical clustering to identify log anomalies; He et al. [12] extended their work by identifying the correlation between logs and system metrics. Nonetheless, these works assume that the data distribution conforms to specific statistical laws, resulting in inadequate performance in handling nonlinear and complex relationships, and limited ability to detect novel or unknown anomalies.

2.2. Sequence-Based Log Detection

Beyond probabilistic methods, sequence-based methods [8,9,13,14] treat logs as time-series data, leveraging sequence modeling techniques to capture temporal dependencies between events. DeepLog [8] detects anomalies using LSTM to predict the next log event. Similarly, LogAnomaly [9] combines semantic vectors generated from word embeddings with numerical vectors to predict subsequent event patterns through LSTM. Zhang et al. [13] represent log events using semantic vectors and employ BiLSTM to detect anomalies across entire log sequences in a supervised manner. PLELog [14] extends to a semi-supervised approach that utilizes Positive-Unlabeled Learning to estimate labels for log sequences. They excel at capturing temporal patterns in log events, but have limitations in identifying anomalous patterns involving complex structures and multidimensional dependencies.

2.3. Graph-Based Log Detection

Graph-based methods [15,16,17,18] are applicable not only to detecting anomalies in graph topology and node attributes but also to identifying abnormal sequences, a field that is gaining increasing attention. Leading GNN models include SAGE [19], GAT [20] and Graph Transformer (GT) [21]. These graph-level representation learning methods learn a mapping from graphs to vectors and can be combined with off-the-shelf anomaly detectors such as OCSVM and iForest [22] to perform graph-level anomaly detection. To improve detection performance, Ma et al. [15] performed random distillation on graph and node representations to learn “normal” graph patterns. Additionally, Qiu et al. [16] combined neural transformation learning with a classification approach to learn graph representations for anomaly detection. Although these methods are unsupervised or semi-supervised, they can only handle attributed, undirected, and unweighted graphs. iGAD [17] treats graph-level anomaly detection as a graph classification problem and combines attribute-aware graph convolutions with substructure-aware deep random walks to learn graph representations. However, iGAD is a supervised approach and can likewise only handle attributed, undirected, and unweighted graphs. Nguyen et al. [18] adopted the minimum description length principle to identify anomalous graphs. Yet this method can only process labeled, directed, and edge-weighted graphs, and its computational cost is extremely high. In contrast, we introduce a universal unsupervised graph-level anomaly detection method that can handle attributed, directed, and weighted graphs.

3. Problem Definition

We formulate the problem under an edge-assisted log monitoring setting with delay tolerance, where logs generated by distributed nodes are periodically collected and analyzed at nearby edge servers. The task is defined over fixed-length sliding windows, with each window representing an aggregated sequence of logs rather than individual events. The model focuses on log-level heterogeneity arising from variations in event templates, field structures, and semantic representations. While such heterogeneity may be influenced by deployment conditions in edge environments, this work does not explicitly model its underlying sources, e.g., hardware differences or node-level variations. Instead, these factors are abstracted into the observed log data, allowing the model to focus on capturing structural and semantic variability within log sequences. In this formulation, the detection task is performed at the sequence (window) level, where each log sequence is further represented as a graph. The objective is to identify execution patterns that deviate from normal system behavior without relying on labeled anomalies. Based on this formulation, we first define several key terms relevant to our work.

In this work, a log message is represented as a token sequence $l = \{w_1, \dots, w_{|l|}\}$, where $w_i$ denotes the $i$-th token and $|l|$ represents the log length.

Next, a log sequence is a chronologically ordered sequence of logs within an observation time window, $S_t = \{l_1, \dots, l_{|S_t|}\}$, where $l_i$ denotes the $i$-th log and $|S_t|$ represents the total number of logs within the time window.
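The windowing step behind $S_t$ can be sketched as follows. This is a minimal illustration, assuming logs arrive as `(timestamp, message)` pairs; the function name `sliding_windows` and the `size` and `step` parameters are hypothetical and their values are not taken from the paper.

```python
from typing import List, Tuple


def sliding_windows(
    logs: List[Tuple[float, str]], size: float, step: float
) -> List[List[str]]:
    """Group (timestamp, message) pairs into fixed-length time windows.

    `size` and `step` use the same unit as the timestamps. Both are
    illustrative parameters, not settings reported in the paper.
    """
    if not logs:
        return []
    ordered = sorted(logs, key=lambda item: item[0])  # chronological order
    start, end = ordered[0][0], ordered[-1][0]
    windows = []
    t = start
    while t <= end:
        # S_t: every log whose timestamp falls in [t, t + size)
        window = [msg for ts, msg in ordered if t <= ts < t + size]
        if window:
            windows.append(window)
        t += step
    return windows


logs = [(0.0, "a"), (1.0, "b"), (2.5, "c"), (4.0, "d")]
windows = sliding_windows(logs, size=2.0, step=2.0)
```

With `step` smaller than `size`, consecutive windows overlap; with `step == size` they tile the timeline without overlap.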

For the log sequence $S_t$ within time window $t$, we construct a dynamic graph $G_t = \{V_t, E_t, X_t, Y_t\}$. Specifically, the node set $V_t$ consists of the unique log event templates that occur within that time window, with each node $v_i \in V_t$ representing a specific class of system events. The edge set $E_t \subseteq V_t \times V_t$ describes the temporal correlation between events: if two consecutive log messages $l_k$ and $l_{k+1}$ in sequence $S_t$ are mapped to event templates $v_i$ and $v_j$, respectively, then a directed edge $(v_i, v_j) \in E_t$ is established. In the node attribute matrix $X_t \in \mathbb{R}^{|V_t| \times d}$, the $i$-th row vector $x_i$ is the semantic embedding of node $v_i$, and $d$ is the embedding dimension. The element $y_{ij}$ of the edge weight matrix $Y_t \in \mathbb{N}^{|V_t| \times |V_t|}$ is the number of times event $v_j$ immediately follows event $v_i$ within time window $t$, which quantifies the strength of event transitions. In this manner, we construct a set of log graphs $\{G_t\}_{t=1}^{T}$ from log sequences across different time windows, serving as input for the subsequent graph representation learning and anomaly detection models.

It is important to note that the log graph constructed in this work is inherently heterogeneous. Specifically, the node set consists of two types of nodes: (i) event template nodes, representing abstracted log event patterns, and (ii) field nodes, representing key semantic attributes extracted from log messages. In this section, we first define the event-level graph structure based on temporal dependencies. The integration of field nodes and the full heterogeneous graph construction are introduced in Section 4.2.
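As a minimal sketch of the event-level definition above, the snippet below builds the node set $V_t$ and the edge weight matrix $Y_t$ from one window of template identifiers. The function name `build_event_graph` and the identifiers `E1`, `E2`, `E3` are illustrative assumptions; the node attributes $X_t$ are omitted.

```python
from collections import Counter
from typing import List, Tuple


def build_event_graph(template_ids: List[str]) -> Tuple[List[str], List[List[int]]]:
    """Return (V_t, Y_t) for one window of parsed logs.

    V_t lists the unique templates in first-seen order. Y_t[i][j] counts
    how often template j immediately follows template i in the window.
    """
    nodes = list(dict.fromkeys(template_ids))  # unique, order-preserving
    index = {v: i for i, v in enumerate(nodes)}
    # Count consecutive pairs (l_k, l_{k+1}) mapped to templates (v_i, v_j).
    transitions = Counter(zip(template_ids, template_ids[1:]))
    weights = [[0] * len(nodes) for _ in nodes]
    for (src, dst), count in transitions.items():
        weights[index[src]][index[dst]] = count
    return nodes, weights


nodes, Y = build_event_graph(["E1", "E2", "E1", "E2", "E3"])
```

A directed edge $(v_i, v_j)$ exists exactly where the corresponding entry of `Y` is non-zero, so $E_t$ can be read off the matrix.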

4. Methodology

In this section, we propose a log anomaly detection framework for edge computing scenarios, specifically designed for resource-constrained and delay-tolerant edge-assisted monitoring environments. To capture both semantic and structural dependencies within logs, log events and fields are organized into graph representations. As shown in Figure 1, the overall pipeline consists of log parsing, semantic feature extraction, graph construction, graph representation learning, and anomaly detection. Log event templates are first extracted using the widely adopted Drain [23] parser. Based on the parsed logs, we integrate graph representation learning and anomaly detection in an end-to-end manner, enabling the model to directly learn normal execution patterns from log interaction graphs without requiring labeled anomalies. Notably, the unsupervised setting applies to anomaly detection, whereas few-shot annotations are used exclusively in the prompt-based field extraction stage.

4.1. Prompt-Based Few-Shot Field Extraction

In the edge environment, logs are generated by a large number of edge nodes with heterogeneous configurations, and differences in the software stacks and service loads of these nodes lead to a high degree of heterogeneity in log formats. This decentralization not only increases the difficulty of log parsing and understanding, but also makes manual annotation extremely difficult. Although existing studies [24,25] extract event templates and fields from log messages through rule matching or search strategies, such methods rely on predefined syntax patterns for structurally stable field types such as IP addresses, emails, or URLs, and adapt poorly to the dynamic service interactions, user behavior descriptions, and non-standardized text that are prevalent in edge systems. To overcome these challenges, we cast log field extraction as a Named Entity Recognition (NER)-style semantic extraction task implemented via prompt-based sequence generation. Unlike conventional token-level NER, our method adopts prompt-based generative modeling, enabling more flexible semantic extraction under few-shot supervision. By introducing task-specific prompts, the model is guided to capture the semantic roles within log messages, facilitating effective identification of key fields with only a limited number of annotated samples. Referencing common log entity definitions [26,27,28], we select 15 log field types critical to system monitoring: IP address, email, process ID, user ID, username, timestamp, service, server, file path, URL, port, session, duration, domain, and version.

We formulate field extraction as a seq2seq learning process, as shown in Figure 2. Given a log message $l = \{w_1, \dots, w_{|l|}\}$, we focus on identifying the fields in it that are closely related to the state of the system, with field types drawn from the set of golden field types $C = \{c_1, \dots, c_{|C|}\}$. Specifically, we enumerate the candidate text spans $w_{i:j}$ in the log message and create a prompt sequence $P_{c_k, w_{i:j}} = \{p_1, \dots, p_m\}$ for each span with field type $c_k \in C$. When the text span $w_{i:j}$ is labeled with field type $c_k$, a positive prompt is created, such as “<$w_{i:j}$> is a <$c_k$> field”; otherwise, a negative prompt is created, such as “<$w_{i:j}$> does not belong to any field type”. Through this prompting mechanism, the model can discriminate the semantic roles in the log text during the generation process, thereby reducing the dependence on domain rules and large numbers of annotated samples.
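The prompt construction described above can be sketched as follows. This is an illustration under stated assumptions: whitespace tokenization, a hypothetical `max_len` cap on span length, and gold annotations given as inclusive token index pairs. Every unlabeled span is turned into a negative prompt here; any subsampling of negatives is left out.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # inclusive token indices (i, j)


def enumerate_spans(tokens: List[str], max_len: int) -> List[Span]:
    """List every contiguous span w_{i:j} of at most `max_len` tokens."""
    spans = []
    for i in range(len(tokens)):
        for j in range(i, min(i + max_len, len(tokens))):
            spans.append((i, j))
    return spans


def build_prompts(
    tokens: List[str], gold: Dict[Span, str], max_len: int = 3
) -> List[Tuple[str, bool]]:
    """Return (prompt_text, is_positive) pairs for one log message."""
    prompts = []
    for i, j in enumerate_spans(tokens, max_len):
        text = " ".join(tokens[i : j + 1])
        if (i, j) in gold:
            # Positive template: "<span> is a <type> field"
            prompts.append((f"{text} is a {gold[(i, j)]} field", True))
        else:
            # Negative template for spans without a gold label
            prompts.append((f"{text} does not belong to any field type", False))
    return prompts


tokens = ["session", "opened", "for", "user", "root"]
gold = {(4, 4): "username"}  # hypothetical annotation
prompts = build_prompts(tokens, gold, max_len=2)
```

Because the number of candidate spans grows with both message length and `max_len`, capping the span length keeps the prompt set small, which matters on constrained edge hardware.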

During training, some log messages carry field type annotations for text spans as supervisory signals, without requiring fine-grained annotation. These annotations are used solely for training the field extraction module and are not involved in the anomaly detection process. Following [28,29], we construct prompt sample pairs for each log message. For each log message $l$, we traverse all of its golden fields to create positive samples $(l, P^{+})$ and create negative samples $(l, P^{-})$ by randomly sampling unlabeled text spans. To adapt to the resource constraints in edge scenarios, we limit the number of n-grams spanned to 15, namely creating 5 × n negative prompts for each log message, and keep the number of negative samples at about three times that of positive samples. Given a sequence pair $(l, P)$, we feed the log message $l$ to a BART [29] encoder with a hidden size of $d_h$ to obtain the hidden state $h^{enc} \in \mathbb{R}^{d_h}$:

$$h^{enc} = \mathrm{Encoder}(w_{1:|l|}) \tag{1}$$

At the $c$-th decoding step, the decoder combines the encoded representation with the previously generated prompt tokens $p_{1:c-1}$ to compute the current decoding state through the attention mechanism:

$$h_c^{dec} = \mathrm{Decoder}(h^{enc}, p_{1:c-1}) \tag{2}$$

The conditional probability of the current token $p_c$ is defined as:

$$P(p_c \mid p_{1:c-1}, l) = \mathrm{softmax}(W_{ner} \cdot h_c^{dec} + b_{ner}) \tag{3}$$

where $W_{ner} \in \mathbb{R}^{d_h \times |V|}$, $b_{ner} \in \mathbb{R}^{|V|}$, and $|V|$ is the model vocabulary size. The training objective is to minimize the cross-entropy loss over all positive and negative prompt sequences:

$$Loss_{ner} = -\sum_{c=1}^{m} \log P(p_c \mid p_{1:c-1}, l) \tag{4}$$

In the inference phase, the model traverses all candidate text spans $w_{i:j}$ in the log message and calculates the score of each prompt $P_{c_k, w_{i:j}} = \{p_1, \dots, p_m\}$ as follows:

$$f(P_{c_k, w_{i:j}}) = \sum_{c=1}^{m} \log P(p_c \mid p_{1:c-1}, l) \tag{5}$$

By comparing the prompt scores of different field types, the one with the highest score is selected as the extraction result, as illustrated in Figure 3. This iterative process ensures that all relevant fields are extracted in the edge system. This design reduces the reliance on large-scale annotated data, which is particularly important for edge environments where data collection and annotation are constrained.
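The selection rule can be illustrated with a small sketch. It assumes the per-token log-probabilities $\log P(p_c \mid p_{1:c-1}, l)$ have already been produced by the decoder; the probability values below are made up for illustration, and the `None` key stands for the "does not belong to any field type" prompt.

```python
import math
from typing import Dict, List, Optional


def prompt_score(token_log_probs: List[float]) -> float:
    """Score of one prompt: the sum of its per-token log-probabilities."""
    return sum(token_log_probs)


def best_field(scores_by_type: Dict[Optional[str], List[float]]) -> Optional[str]:
    """Return the field type whose prompt scores highest for one span."""
    return max(scores_by_type, key=lambda c: prompt_score(scores_by_type[c]))


# Hypothetical decoder outputs for the span "root" under three prompts.
candidates = {
    "username": [math.log(0.9), math.log(0.8), math.log(0.9)],
    "server": [math.log(0.9), math.log(0.1), math.log(0.9)],
    None: [math.log(0.5), math.log(0.4), math.log(0.5)],
}
label = best_field(candidates)
```

Summing log-probabilities is equivalent to multiplying token probabilities while avoiding numerical underflow on long prompts.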

4.2. Log Graph Construction

As shown in Figure 3, to construct a graph representation from the log sequence, extracted field nodes are associated with log event nodes obtained from the log parser according to predefined connection rules, forming a log graph structure. Subsequently, pre-trained Sentence-BERT [30] is employed to perform semantic encoding on node content. The encoded vectors serve as the node attributes input to the model, while the connection relationships between nodes are represented by an adjacency matrix. Ultimately, node attributes and graph structural information are jointly utilized in the subsequent graph representation learning and anomaly detection processes.

4.2.1. Graph Structure Configuration

To effectively transform heterogeneous logs into structured representations of system state in the edge environment, we take snapshots of batch log messages with a sliding window at fixed time intervals, and construct an attribute-enhanced directed weighted graph consisting of event nodes and field nodes in each time window.

Specifically, event nodes are the event templates extracted from raw log messages by Drain [23] and characterize the behavior of the system during execution. The field nodes are automatically recognized by the prompt-based few-shot field extraction module and retain their field type information. By introducing field nodes, the fine-grained semantic information obtained is mapped directly onto the graph structure, which enhances the expressiveness of heterogeneous logs in the edge environment at the structural level.

Subsequently, we use event-field dependencies and event-event temporal relationships to construct heterogeneous directed edges. For each log message, we connect the event template with each extracted field, and the weight of each such edge is the frequency with which the field occurs with the corresponding event in the current time window, capturing the semantic association in the log. Between event nodes, we establish directed temporal connections in the order in which the logs arrive. If the event $l_j$ appears immediately after the event $l_i$, we add a directed edge from $l_i$ to $l_j$ and set its weight to 1; if the edge already exists, its weight is incremented by 1. The result is an attribute-enhanced directed weighted graph in which the heterogeneous node types reflect the different components of log semantics and the edge weights quantify the correlation strength between nodes.
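A minimal sketch of these connection rules is given below. The edge direction for event-field links (event to field) and the `type:value` naming of field nodes are assumptions made for illustration; the text specifies only that the two are connected and weighted by co-occurrence frequency.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def build_log_graph(window: List[Tuple[str, List[str]]]) -> Dict[Edge, int]:
    """Build weighted directed edges for one time window.

    `window` holds (event_template, extracted_fields) pairs in arrival
    order. Returns a mapping from (source, target) to edge weight.
    """
    weights: Dict[Edge, int] = defaultdict(int)
    previous = None
    for event, fields in window:
        if previous is not None:
            # Event-event temporal edge: new edges start at 1, then increment.
            weights[(previous, event)] += 1
        for field in fields:
            # Event-field edge weighted by co-occurrence count (assumed
            # direction: event -> field).
            weights[(event, field)] += 1
        previous = event
    return dict(weights)


window = [
    ("E1", ["user:root"]),
    ("E2", ["ip:10.0.0.5"]),
    ("E1", ["user:root"]),
    ("E2", []),
]
graph = build_log_graph(window)
```

A sparse edge dictionary of this kind converts directly into the adjacency and edge weight matrices once nodes are assigned indices.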

4.2.2. Graph Node Attribute Configuration

To make the graph contain both structural information and content semantics, we construct semantic representations for event nodes and field nodes separately. Specifically, we use the pre-trained Sentence-BERT [30] to learn sentence embeddings as node attributes. For log events, we use their templates as the encoder input text, while for log fields, we use the prompts we define as the input text, such as “imap://localhost/ is a server field”, so that both the semantic characteristics and the type information of the fields are captured. The output hidden state of each input text captures the node semantics and is used as the node feature to build an attributed graph.

The encoded embedding vectors constitute the attribute matrix $X_t$ of the graph, and the correlations between nodes constitute the edge weight matrix $Y_t$. Together, they provide structured and semantic graph representations as input for subsequent graph anomaly detection.

Although the constructed log graph has heterogeneous node types (i.e., event nodes and field nodes), we treat it as a homogeneous attributed directed graph in subsequent representation learning. The semantic embeddings generated by Sentence-BERT [30] encode node type information, allowing GNNs to jointly capture interactions between events and fields without introducing additional type-specific parameters. This avoids increasing model complexity while retaining the ability to detect anomalies in resource-constrained environments.

4.3. Graph-Based Anomaly Detection for Event Logs

In edge systems, anomalies manifest as abrupt shifts in event propagation directions, deviations in execution paths, or abnormal reorganization of field relationships. To more effectively capture directional structural changes within graphs, we construct an anomaly detection model based on the Directed Graph Convolutional Network [31] (DiGCN), distinguishing between in-edges and out-edges during message propagation, which is crucial for graph-level anomaly detection.

Specifically, given a graph $G_t$ described by an adjacency matrix $A_t \in \mathbb{R}^{|V_t| \times |V_t|}$, a node attribute matrix $X_t \in \mathbb{R}^{|V_t| \times d}$, and an edge weight matrix $Y_t \in \mathbb{N}^{|V_t| \times |V_t|}$, DiGCN defines the $k$-th order directed graph convolution as:

$$Z^{(k)} = \sigma\left(\hat{A}_{in} X_t \Theta_{in}^{(k)} + \hat{A}_{out} X_t \Theta_{out}^{(k)}\right) \tag{6}$$

where $Z^{(k)} \in \mathbb{R}^{|V_t| \times f}$ denotes the node representations at the $k$-th layer, $\Theta_{in}^{(k)}$ and $\Theta_{out}^{(k)}$ are trainable weight matrices, and $\hat{A}_{in}$ and $\hat{A}_{out}$ are normalized adjacency matrices for incoming and outgoing edges, respectively. The function $\sigma(\cdot)$ denotes a nonlinear activation function.
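To make the shape of Equation (6) concrete, the following sketch computes one directed convolution on a toy graph in pure Python. It is not the DiGCN implementation of [31]: plain row normalization stands in for the normalized adjacency matrices, ReLU is assumed for $\sigma$, and the weight matrices are fixed rather than trained.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Dense matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def row_normalize(a: Matrix) -> Matrix:
    """Divide each row by its sum; all-zero rows stay zero (assumed scheme)."""
    out = []
    for row in a:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def directed_conv(adj: Matrix, x: Matrix, theta_in: Matrix, theta_out: Matrix) -> Matrix:
    """sigma(A_in X Theta_in + A_out X Theta_out) with ReLU as sigma."""
    a_out = row_normalize(adj)            # aggregate over out-neighbours
    a_in = row_normalize(transpose(adj))  # aggregate over in-neighbours
    h_in = matmul(matmul(a_in, x), theta_in)
    h_out = matmul(matmul(a_out, x), theta_out)
    return [
        [max(0.0, p + q) for p, q in zip(r_in, r_out)]
        for r_in, r_out in zip(h_in, h_out)
    ]


adj = [[0, 1], [0, 0]]    # single directed edge 0 -> 1
x = [[1, 0], [0, 1]]      # one-hot node features
identity = [[1, 0], [0, 1]]
Z = directed_conv(adj, x, identity, identity)
```

Keeping separate in-edge and out-edge terms is what lets node 0 and node 1 receive different messages from the same edge, which an undirected convolution would not distinguish.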

After obtaining multi-scale features $Z^{(0)}, \dots, Z^{(k)}$, DiGCN incorporates an Inception structure block to fuse information from different-order neighborhoods:

$$Z = \sigma\left(\Gamma\left(\{Z^{(0)}, \dots, Z^{(k)}\}\right)\right) \tag{7}$$

Here, $\sigma$ denotes the activation function, and $\Gamma(\cdot)$ represents the fusion operation. In practice, we often employ a fusion operation that preserves the output dimension, i.e., $Z \in \mathbb{R}^{|V_t| \times f}$. Thus, $Z_i$ denotes the learned vector representation of node $v_i$ within a given layer. For brevity, we omit further details; for comprehensive information, please refer to [31].

Given that DiGCN was originally designed for node representation learning, we adapt it for graph representation learning as follows:

$$z = \mathrm{Readout}\left(Z_i \mid i \in \{1, 2, \dots, |V_t|\}\right) \tag{8}$$

That is, at the final layer, we apply a Readout(·) function to aggregate the node vector representations into a graph vector representation. Readout(·) can be a simple permutation-invariant function, such as a maximum, sum, or average, or a more advanced graph-level pooling function. This operation unifies the node representations into a single graph vector that describes the system’s running state within a time window.
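The simple permutation-invariant choices named above can be sketched as follows. The function name `readout` and its `mode` argument are illustrative; which variant the framework actually uses is not stated in this section.

```python
from typing import List


def readout(node_vectors: List[List[float]], mode: str = "mean") -> List[float]:
    """Aggregate per-node vectors Z_i into one graph vector z."""
    columns = list(zip(*node_vectors))  # one tuple per feature dimension
    if mode == "sum":
        return [sum(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    raise ValueError(f"unknown readout mode: {mode}")


Z = [[1.0, 4.0], [3.0, 0.0]]  # two nodes, two features
z_mean = readout(Z, "mean")
z_sum = readout(Z, "sum")
z_max = readout(Z, "max")
```

Because each variant reduces over nodes independently per feature, reordering the rows of `Z` leaves the result unchanged, so the graph vector does not depend on node indexing.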

Since DiGCN follows the message passing neural network (MPNN) framework [32], the edge weight matrix $Y_t$ can be incorporated into the computation of $Z^{(k)}$ in a similar way, so that edge weights participate in the node representation update and edge feature learning is realized. Given a set of graphs $\{G_t\}_{t=1}^{T}$, we use the Readout(·) function to obtain an explicit vector representation of each graph, and denote the vector of $G_t$ learned by the DiGCN model as $\mathrm{DiGCN}(G_t, H)$.

In graph anomaly detection, anomalies are identified based on reconstruction or distance loss. We train a one-class classifier by optimizing the One-Class Deep SVDD objective [33]:

$$\min_{H} \frac{1}{T} \sum_{t=1}^{T} \left\| f(G_t; H) - o \right\|_2^2 + \lambda \left\| H \right\|_F^2 \tag{9}$$

where $f(G_t; H) = z_t$ denotes the graph representation learned by the DiGCN model parameterized by $H$, $o \in \mathbb{R}^{f}$ is the center of the hypersphere, initialized as the mean of the training representations, and $\lambda$ is a regularization coefficient.

After training the model on a set of non-anomalous graphs, given a test graph $G_t$, we define its distance from the center of the representation space as its anomaly score:

$$\mathrm{score}(G_t) = \left\| f(G_t; H) - o \right\|_2^2 \tag{10}$$

Anomaly detection follows a two-stage procedure consisting of anomaly scoring and threshold classification. The anomaly score provides a continuous measure of deviation from normal behavior, capturing both structural and semantic discrepancies. The greater the distance, the more the log graph deviates from normal system behavior in terms of structure and semantics, resulting in a higher anomaly score. However, anomaly scores alone do not provide explicit decision boundaries for classification. To obtain binary labels, a threshold $\tau$ is introduced to map scores to normal or anomalous classes. A graph $G_t$ is classified as anomalous if $\mathrm{score}(G_t) > \tau$, and normal otherwise. The threshold $\tau$ is determined on a validation set and will be detailed in Section 5.2. This two-stage design enables both fine-grained anomaly ranking and consistent classification for quantitative evaluation.
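The scoring and thresholding steps can be sketched on toy vectors as follows. The graph representations, the flat list standing in for the parameters $H$, the coefficient `lam`, and the threshold `tau` are all illustrative placeholders; in the framework the representations come from the trained DiGCN and $\tau$ is chosen on a validation set.

```python
from typing import List

Vector = List[float]


def center(representations: List[Vector]) -> Vector:
    """Hypersphere center o: the mean of the training representations."""
    n = len(representations)
    return [sum(col) / n for col in zip(*representations)]


def anomaly_score(z: Vector, o: Vector) -> float:
    """Squared Euclidean distance between a graph vector and the center."""
    return sum((a - b) ** 2 for a, b in zip(z, o))


def svdd_objective(
    representations: List[Vector], o: Vector, parameters: Vector, lam: float
) -> float:
    """Mean squared distance to o plus lam times the squared parameter norm."""
    data_term = sum(anomaly_score(z, o) for z in representations) / len(representations)
    return data_term + lam * sum(p * p for p in parameters)


def is_anomalous(z: Vector, o: Vector, tau: float) -> bool:
    """Binary decision: anomalous when the score exceeds the threshold."""
    return anomaly_score(z, o) > tau


train = [[1.0, 1.0], [3.0, 1.0]]  # toy representations of normal graphs
o = center(train)
tau = 4.0  # placeholder threshold, not a value from the paper
flag_far = is_anomalous([5.0, 5.0], o, tau)
flag_near = is_anomalous([2.5, 1.0], o, tau)
```

Keeping the continuous score separate from the thresholded decision means graphs can be ranked by severity before a single operating point is fixed.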

In summary, the model comprises an L-layer DiGCN architecture for learning node representations and a Readout(·) function for obtaining graph representations. By leveraging DiGCN to learn structured representations under the joint influence of events and fields, and integrating a One-Class SVDD objective to achieve unsupervised end-to-end training, it effectively identifies anomalous log behaviors in edge environments. Furthermore, the graph-based modeling captures structural dependencies while avoiding strict sequential processing, and the shallow DiGCN architecture helps control model complexity. These designs enable a practical balance between detection performance and resource efficiency, making the framework suitable for deployment in resource-constrained edge settings.

5. Experiments and Analysis of Results

5.1. Experimental Setup

5.1.1. Datasets

To evaluate the effectiveness of our method, we conduct experiments on three widely used public log datasets: HDFS [34], BGL [24], and Thunderbird [24]. Although these datasets were originally collected from cloud-based distributed systems, they exhibit key characteristics highly similar to those of edge computing environments, making them suitable for our evaluation. Below we describe the three datasets, whose statistics are shown in Table 1.

The HDFS [34] dataset was collected from a Hadoop Distributed File System running on 200 Amazon EC2 nodes. It contains a total of 11,175,629 log messages, and the log events are grouped by block ID, reflecting the execution status of programs in the HDFS system. Due to its distributed architecture and strong temporal dependencies across nodes, HDFS effectively captures coordinated behaviors in data storage and replication, which are highly compatible with edge storage and data collaboration scenarios.

The BGL [24] dataset contains logs from the Lawrence Livermore National Laboratory (LLNL) supercomputer system. It comprises 4,747,963 annotated log entries collected over 215 days, of which 949,024 entries are marked as anomalous. The anomalous entries cover hardware failures, software anomalies, and operational instability. We group the log messages according to the Node variable to capture interactions among a large number of interconnected components, thereby reflecting the continuous operation and structural complexity of edge-assisted systems.

The Thunderbird [25] dataset was released to the public by Oliner and Stearley in 2007. It was collected from the Thunderbird supercomputer system at Sandia National Laboratories (SNL) and contains more than 200 million log messages. It exhibits diverse failure patterns and noisy system behaviors, which align with the heterogeneity and dynamic conditions commonly observed in edge environments. Because edge scenarios place high demands on real-time performance and processing efficiency, and because the full log volume is very large, we use the first 5 million log messages for evaluation.
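
The grouping step described for HDFS can be sketched as follows. This is a minimal illustration, assuming block IDs appear in each message as tokens of the form `blk_<integer>`; the sample lines are made up in the style of HDFS logs and the function name `group_by_block` is hypothetical. Grouping BGL by the Node field would follow the same pattern with a different key.

```python
import re
from collections import defaultdict

# Assumed block ID format: "blk_" followed by an optionally negative integer.
BLOCK_ID = re.compile(r"blk_-?\d+")

def group_by_block(lines):
    """Map each block ID to the ordered list of log lines that mention it."""
    sessions = defaultdict(list)
    for line in lines:
        # A line may mention several blocks; attach it once to each of them.
        for block_id in dict.fromkeys(BLOCK_ID.findall(line)):
            sessions[block_id].append(line)
    return dict(sessions)

# Made-up lines for illustration only.
sample = [
    "081109 203518 INFO dfs.DataNode: Receiving block blk_-1608 src: /10.0.0.1",
    "081109 203519 INFO dfs.FSNamesystem: allocateBlock blk_7503",
    "081109 203520 INFO dfs.DataNode: PacketResponder 1 for block blk_-1608 terminating",
]
sessions = group_by_block(sample)
print({block_id: len(group) for block_id, group in sessions.items()})
```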

5.1.2. Baselines

To evaluate the performance, we adopt seven representative log anomaly detection methods from three categories as baselines: lightweight models (PCA [35] and OCSVM [36]); sequence methods (DeepLog [8], LogAnomaly [9], and PLELog [14]); and graph methods (DeepTraLog [37] and DSGN [38]). It is worth noting that the goal of this work is to advance an anomaly detection framework that is compatible with edge-assisted deployment settings. Accordingly, the evaluation focuses on comparisons with state-of-the-art log anomaly detection methods, which provide a standard and widely accepted benchmark for assessing detection performance.

5.1.3. Evaluation Metrics

We apply five widely used metrics to evaluate the performance of anomaly detection models. Precision, recall, and F1-score are threshold-dependent classification metrics that measure the accuracy of anomaly identification at a specific threshold, reflecting the model’s ability to balance false positives and false negatives. Additionally, we apply the ranking metrics ROC AUC and PRC AUC to assess the model’s overall performance across different thresholds. ROC AUC measures the trade-off between the true positive rate and the false positive rate, while PRC AUC (also known as average precision) focuses on the relationship between precision and recall, proving particularly effective in scenarios with imbalanced data. For both ROC AUC and PRC AUC, values closer to 1 indicate superior model performance. For the threshold-dependent metrics, the classification results are obtained by applying the threshold $\tau^{*}$ determined on the validation set, as described in Section 5.2. In contrast, the threshold-independent metrics are computed directly from the anomaly scores without requiring a predefined threshold.
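
To make the threshold-independent metrics concrete, the sketch below computes ROC AUC as the probability that a randomly chosen anomaly receives a higher score than a randomly chosen normal sample, and approximates PRC AUC by average precision. The labels and scores are toy values, the pairwise AUC computation is quadratic and meant only for illustration, and the average-precision function assumes at least one anomaly and no tied scores.

```python
def roc_auc(labels, scores):
    """Fraction of (anomaly, normal) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each anomaly."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits  # assumes at least one anomaly is present

# Toy data: 1 marks an anomalous graph, 0 a normal one.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]
auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
print(round(auc, 3), round(ap, 3))
```

Neither function takes a threshold, which is why these metrics can be reported directly from the anomaly scores.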

5.2. Model Implementation and Configuration

We implement all algorithms in Python 3.8 on a workstation equipped with an Intel(R) Core(TM) Ultra 9 185H CPU and an NVIDIA GeForce RTX 4060 Laptop GPU. All GNNs in this work are built on the PyTorch Geometric (PyG) framework. These models are configured with a two-layer structure, with 768 input channels and 1024 output channels. For semantic embedding, we adopt the pre-trained models bert-base-uncased and bart-base provided by the Hugging Face platform. In the field extraction process, we consider two scenarios: one fine-tunes BART [29] for 100 epochs using 10 training samples, and the other directly applies predefined regular expressions. In anomaly detection, we divide the log sequences into training/validation/test sets in a 6:1:3 ratio. We employ an unsupervised learning paradigm, training only with normal log sequences and tuning hyperparameters through grid search on the validation set. In addition to the model hyperparameters, the anomaly score threshold $\tau$ is determined on the validation set. We evaluate a range of candidate thresholds over the anomaly scores and select the optimal threshold $\tau^{*}$ that maximizes the F1-score on the validation set. The selected threshold is applied to the test set for final evaluation. Note that the optimal threshold may vary across datasets due to differences in anomaly score distributions. For training, we use the AdamW optimizer with a learning rate of $1 \times 10^{-3}$, mean $\mu = 0.3$, decay rate $\gamma = 0.5$, global weight $\alpha = 1$, and weight decay $\lambda = 5 \times 10^{-7}$. All analyses apply a 60 s sliding time window.
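
The threshold selection described above can be sketched as a simple search. This is a minimal illustration that assumes the candidate thresholds are the distinct validation scores themselves (the text does not specify the candidate range) and uses toy scores and labels.

```python
def f1_at_threshold(scores, labels, tau):
    """F1-score obtained when graphs with score > tau are flagged as anomalous."""
    tp = sum(1 for s, y in zip(scores, labels) if s > tau and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s > tau and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s <= tau and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def select_threshold(val_scores, val_labels):
    """Return the candidate with the highest validation F1 (lowest one on ties)."""
    best_tau, best_f1 = None, -1.0
    for tau in sorted(set(val_scores)):  # assumed candidate set
        f1 = f1_at_threshold(val_scores, val_labels, tau)
        if f1 > best_f1:
            best_tau, best_f1 = tau, f1
    return best_tau, best_f1

# Toy validation data: 1 marks an anomalous graph.
val_scores = [0.05, 0.10, 0.20, 0.60, 0.90]
val_labels = [0, 0, 0, 1, 1]
best_tau, best_f1 = select_threshold(val_scores, val_labels)
print(best_tau, best_f1)
```

The returned threshold would then be applied unchanged to the test scores, so no test labels influence the decision boundary.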

5.3. Experimental Results and Analysis

5.3.1. Overall Performance

Table 2 presents an overall performance comparison of our method against seven mainstream baselines on the HDFS, BGL, and Thunderbird datasets. Experimental results demonstrate that our method achieves superior performance across these datasets compared to competing methods, with particularly notable advantages over rivals on BGL and Thunderbird, highlighting its robust capabilities on complex datasets.

Sequence models and graph-based methods outperform lightweight models in overall performance, indicating that simple statistical features or shallow kernel functions cannot capture the complex semantic relationships within log sequences. They struggle to handle the high-dimensional and varying log streams generated by edge devices. However, sequence methods still face accuracy limitations when processing logs with complex concurrent relationships (e.g., BGL), due to their lack of deep mining into the internal field structures of logs. In contrast, our approach precisely extracts log fields under small samples through prompt learning, compensating for the feature representation shortcomings of pure sequence models.

Graph-based methods typically outperform sequence-based approaches, validating the superiority of graph structures in modeling inter-log correlations. Compared to the state-of-the-art DSGN, our F1-score improves by approximately 5.7%, 11.8%, and 3.3% in relative terms on HDFS, BGL, and Thunderbird, respectively. This improvement stems from our constructed heterogeneous graph structure, which captures directed, fine-grained semantic changes within log streams, not merely coarse-grained topological relationships.
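
The improvement figures can be checked against the F1 values in Table 2. The short calculation below reads them as relative gains over DSGN, which is the interpretation that reproduces the stated percentages; because it uses the rounded table entries, the average comes out near 6.9% rather than exactly the reported 6.93%.

```python
# F1-scores copied from Table 2.
ours = {"HDFS": 0.92, "BGL": 0.95, "Thunderbird": 0.93}
dsgn = {"HDFS": 0.87, "BGL": 0.85, "Thunderbird": 0.90}

# Relative gain in percent: 100 * (ours - baseline) / baseline.
relative_gain = {
    name: 100 * (ours[name] - dsgn[name]) / dsgn[name] for name in ours
}
for name, gain in relative_gain.items():
    print(f"{name}: {gain:.1f}%")

average_gain = sum(relative_gain.values()) / len(relative_gain)
average_f1 = sum(ours.values()) / len(ours)
print(f"average relative gain: {average_gain:.1f}%")
print(f"average F1 of our method: {100 * average_f1:.1f}%")
```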

Compared to classification metrics based on a single threshold, ROC AUC and PRC AUC more comprehensively reflect the model’s robustness in handling imbalanced data. As shown in Figure 4, our method achieves the highest PRC AUC and ROC AUC values on all 3 datasets and exhibits smaller performance fluctuations across different datasets. More importantly, it exhibits relatively low variance across repeated runs, indicating strong training stability. From a cross-dataset perspective, our method maintains a clear performance margin over competing approaches, especially on the more challenging Thunderbird dataset, where it achieves substantial improvements while maintaining near-zero variance. This demonstrates not only superior ranking capability but also robustness under noisy and heterogeneous log distributions.

5.3.2. Ablation Study

w/o prompt few-shot: To verify the effectiveness of the field extraction method, we compare the proposed method with the traditional rule-based (regex) method and further analyze the impact of different prompt designs and annotation scales (n-shots) on extraction performance. Specifically, we annotate n log messages for each field type using two prompts ($P_{1}$ and $P_{2}$ in Table 3) and train the model under 1-shot, 5-shot, and 10-shot settings. As shown in Table 4, with only 5-shot training, our field extraction model achieves an F1-score comparable to that of the rule-based method. When scaled to 10-shot training, our approach significantly outperforms the rule-based method, demonstrating the practicality of our few-shot approach in low-resource scenarios with limited annotations. Notably, the two prompts differ in performance. Considering the importance of false positive control in anomaly detection tasks, we employ $P_{1}$, which attains the higher precision in the 10-shot setting, for graph construction in this work.
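
The two prompt formats in Table 3 can be illustrated by filling them for candidate spans of a log message. The sketch below only generates the template strings; scoring each filled template with the fine-tuned BART model, as in template-based named entity recognition [29], is omitted. The span enumeration strategy, the function names, the example message, and the entity type "IP address" are illustrative assumptions, and the whitespace around "=" in $P_{2}$ is a guess.

```python
def candidate_spans(tokens, max_len=3):
    """Enumerate contiguous token spans of up to max_len tokens."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(tokens) - n + 1)
    ]

def fill_p1(span, entity_type=None):
    """Prompt P1: natural-language template (positive or negative form)."""
    if entity_type is None:
        return f"{span} is not a named entity"
    return f"{span} is a/an {entity_type} entity"

def fill_p2(span, entity_type=None):
    """Prompt P2: key-value style template (positive or negative form)."""
    if entity_type is None:
        return f"{span}=none"
    return f"{entity_type} = {span}"

# Hypothetical log message and field type, for illustration only.
tokens = "Receiving block blk_123 src 10.0.0.1".split()
spans = candidate_spans(tokens, max_len=2)
print(len(spans))
print(fill_p1("10.0.0.1", "IP address"))
print(fill_p2("10.0.0.1", "IP address"))
print(fill_p1("Receiving block"))
```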

w/o node attributes: To evaluate the effectiveness of log event semantic embeddings as node attributes, we replace node semantic attributes with node labels (one-hot encoding) while keeping all other settings unchanged. As shown in Figure 5, semantic embeddings consistently achieve better performance than node labels. The gains are stable across all datasets, with the most significant improvement observed on the HDFS dataset, indicating that the benefit is intrinsic to the representation itself rather than incidental to particular data characteristics. This result indicates that relying solely on node labels makes it difficult to distinguish fine-grained semantic information in edge computing systems. In contrast, semantic attributes map logs with different expressions but similar meanings into the same representation space, thereby enhancing the model’s generalization capability in resource-constrained, device-diverse edge environments. Since the PRC AUC results exhibit a similar trend to the ROC AUC results, they are not elaborated upon here.

5.3.3. Parameter Sensitivity Analysis

We study the impact of two key hyperparameters, namely the window size and the number of GNN layers. To control variables, only one hyperparameter is adjusted per experiment while the others remain constant. As shown in Figure 6, the model remains stable across different parameter configurations, indicating the proposed method’s robustness to hyperparameter variations. Specifically, Figure 6a demonstrates that longer monitoring periods yield higher accuracy. This occurs because smaller windows (e.g., 0.5 min) may fragment a complete piece of business logic, preventing the model from capturing long-range event dependencies and leading to false positives. Larger windows, conversely, encompass more semantic features per sample, enabling the model to delineate “normal” boundaries more effectively. Thus, adjusting the window size allows a balance between a higher true positive rate and a lower false positive rate.
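
The effect of the window size on how events are grouped can be seen in a small sketch. It assumes non-overlapping windows (stride equal to the window length), since the stride of the sliding window is not stated; the timestamps and event IDs are toy values.

```python
from collections import defaultdict

def window_events(events, window_seconds=60):
    """Group (timestamp_seconds, event_id) pairs into fixed-length time windows."""
    windows = defaultdict(list)
    for ts, event_id in sorted(events):
        # Integer division assigns each event to the window containing it.
        windows[int(ts // window_seconds)].append(event_id)
    return [windows[k] for k in sorted(windows)]

# Toy event stream: E1, E2, E3 form one logical operation within the first minute.
events = [(0, "E1"), (10, "E2"), (45, "E3"), (70, "E4"), (130, "E5")]
one_minute = window_events(events, 60)
half_minute = window_events(events, 30)
print(one_minute)
print(half_minute)
```

With the 30 s window, the first three events are split across two groups, which illustrates how a short window can fragment a sequence that belongs together.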

As observed in Figure 6b, performance improves significantly when the number of GNN layers increases from 1 to 2. From 3 to 5 layers, performance gradually stabilizes, with minor fluctuations or saturation at certain layer counts. This occurs because excessive layers cause each node to aggregate information from increasingly distant neighborhoods, leading to “over-smoothing” that weakens the discriminative capability of node representations. Balancing performance gains against computational overhead, we adopt a 2-layer GNN architecture. It captures sufficient information while maintaining low inference latency, meeting the low-resource constraints of edge computing scenarios.
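
Over-smoothing can be demonstrated on a toy example. The sketch below repeatedly applies a row-normalized neighborhood average on a small undirected path graph and tracks how far apart the node features remain. It is a simplified linear propagation without learned weights or nonlinearities, not the DiGCN used in this work, and the graph and features are illustrative assumptions.

```python
import numpy as np

# Toy path graph 0-1-2-3 with self-loops (illustration only).
A = np.array(
    [[1, 1, 0, 0],
     [1, 1, 1, 0],
     [0, 1, 1, 1],
     [0, 0, 1, 1]],
    dtype=float,
)
P = A / A.sum(axis=1, keepdims=True)  # each row averages a node with its neighbors

X = np.array([[1.0], [0.0], [0.0], [-1.0]])  # one scalar feature per node

def spread(features):
    """Gap between the largest and smallest node feature."""
    return float(features.max() - features.min())

spreads = []
H = X.copy()
for layer in range(1, 6):
    H = P @ H  # one round of neighborhood aggregation
    spreads.append(spread(H))
    print(layer, round(spreads[-1], 4))
```

The gap shrinks with every additional round, so after several layers the node representations become hard to tell apart.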

5.3.4. Efficiency Analysis

We compare the training and testing times of different methods on the BGL dataset. As shown in Table 5, traditional methods such as PCA and OCSVM exhibit lower computational costs due to their simpler structures. Of the two, OCSVM is more expensive because it must construct a feature matrix from normalized data for anomaly detection. Deep learning-based sequence methods (e.g., DeepLog, LogAnomaly, and PLELog) typically incur higher computational costs due to their complex architectures. LogAnomaly entails the highest computational cost due to its complex template vectorization learning process and low parallelizability. Graph-based methods, in contrast, generally exhibit lower computational costs, highlighting the computational efficiency advantages of graph structures. Our method requires processing heterogeneous graphs and computing directed edges for semantic embeddings, resulting in a slight increase in computational overhead. Nevertheless, our approach remains comparable in running time to other graph-based methods and is considerably faster than DeepLog and LogAnomaly. These results indicate that the proposed method achieves a favorable balance between detection performance and computational cost. Given that anomaly detection is performed over sliding windows, the observed inference time is sufficient to support monitoring in delay-tolerant edge scenarios, demonstrating the practical feasibility of deploying the framework in edge-assisted log monitoring systems.

6. Conclusions

We propose a log anomaly detection framework specifically designed for edge computing scenarios. Beyond temporal information and log semantics, it incorporates event-field relationships, addressing the limitations of sequence-based methods in identifying cross-event and cross-field structural anomalies at the edge. Furthermore, a prompt-based learning module is introduced for few-shot field extraction, which achieves efficient extraction from heterogeneous log formats with minimal annotation, significantly reducing labeling costs in edge environments. Building upon this, an enhanced DiGCN jointly optimizes the graph representation learning and unsupervised one-class classification objectives, enabling precise detection of graph-level anomalies. Experimental results demonstrate that our method outperforms existing mainstream approaches on three benchmark datasets, achieving an average F1-score improvement of 6.93%. Future research will further explore cross-domain anomaly detection capabilities and optimize the model’s real-time performance and resource efficiency for edge deployment.

Author Contributions

Conceptualization, X.H. (Xinling Huang); Methodology, X.H. (Xianlang Hu); Validation, X.H. (Xianlang Hu) and X.K.; Investigation, X.H. (Xinling Huang) and X.K.; Writing—original draft, X.H. (Xianlang Hu) and X.H. (Xinling Huang); Writing—review & editing, X.H. (Xianlang Hu) and G.F.; Supervision, G.F. and H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the National Natural Science Foundation of China (No. 62272126) and the Fundamental Research Funds for the Central Universities (No. 3072024LJ0602).

Data Availability Statement

No new data were created in this study. All data used in this work are publicly available from the corresponding referenced datasets.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Wang, Z.; Tian, J.; Fang, H.; Chen, L.; Qin, J. LightLog: A lightweight temporal convolutional network for log anomaly detection on the edge. Comput. Netw. 2022, 203, 108616.
2. Nguyen, T.A.; Le, L.T.; Nguyen, T.D.; Bao, W.; Seneviratne, S.; Hong, C.S.; Tran, N.H. Federated PCA on Grassmann Manifold for IoT Anomaly Detection. IEEE/ACM Trans. Netw. 2024, 32, 4456–4471.
3. Li, Z.; Leeuwen, V.M. Feature selection for fault detection and prediction based on event log analysis. ACM SIGKDD Explor. Newsl. 2022, 24, 96–104.
4. Zhang, W.; Zhang, Q.; Yu, E.; Ren, Y.; Meng, Y.; Qiu, M.; Wang, J. LogRAG: Semi-Supervised Log-based Anomaly Detection with Retrieval-Augmented Generation. In Proceedings of the IEEE International Conference on Web Services (ICWS), Shenzhen, China, 7–13 July 2024; pp. 1100–1102.
5. Gan, W.; Chen, L.; Wan, S.; Chen, J.; Chen, C.M. Anomaly rule detection in sequence data. IEEE Trans. Knowl. Data Eng. 2021, 35, 12095–12108.
6. Liu, J.; Huang, J.; Huo, Y.; Jiang, Z.; Gu, J.; Chen, Z.; Feng, C.; Yan, M.; Lyu, M.R. Log-based Anomaly Detection based on EVT Theory with feedback. arXiv 2023, arXiv:2306.05032.
7. Luo, R.; Krishnamurthy, V. Fréchet-Statistics-Based Change Point Detection in Dynamic Social Networks. IEEE Trans. Comput. Soc. Syst. 2024, 11, 2863–2871.
8. Du, M.; Li, F.; Zheng, G.; Srikumar, V. Deeplog: Anomaly detection and diagnosis from system logs through deep learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA, 30 October–3 November 2017; pp. 1285–1298.
9. Meng, W.; Liu, Y.; Zhu, S.; Zhang, S.; Pei, D.; Liu, Y.; Chen, Y.; Zhang, R.; Tao, S.; Sun, P.; et al. Loganomaly: Unsupervised detection of sequential and quantitative anomalies in unstructured logs. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Macao, China, 10–16 August 2019; Volume 19, pp. 4739–4745.
10. Xu, W.; Huang, L.; Fox, A.; Patterson, D.; Jordan, M. Large-scale system problem detection by mining console logs. In Proceedings of the SOSP, Big Sky, MT, USA, 11–14 October 2009; Volume 9, pp. 1–17.
11. Lin, Q.; Zhang, H.; Lou, J.G.; Zhang, Y.; Chen, X. Log clustering based problem identification for online service systems. In Proceedings of the 38th International Conference on Software Engineering Companion, Austin, TX, USA, 14–22 May 2016; pp. 102–111.
12. He, S.; Lin, Q.; Lou, J.G.; Zhang, H.; Lyu, M.R.; Zhang, D. Identifying impactful service system problems via log analysis. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Lake Buena Vista, FL, USA, 4–9 November 2018; pp. 60–70.
13. Zhang, X.; Xu, Y.; Lin, Q.; Qiao, B.; Dang, Y.; Xie, C.; Cheng, Q.; Li, Z.; Chen, J.; He, X.; et al. Robust log-based anomaly detection on unstable log data. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Tallinn, Estonia, 26–30 August 2019; pp. 807–817.
14. Yang, L.; Chen, J.; Wang, Z.; Wang, W.; Jiang, J.; Dong, X.; Zhang, W. Semi-supervised log-based anomaly detection via probabilistic label estimation. In Proceedings of the IEEE/ACM 43rd International Conference on Software Engineering (ICSE), Madrid, Spain, 25–28 May 2021; pp. 1448–1460.
15. Ma, R.; Pang, G.; Chen, L.; Van Den Hengel, A. Deep graph-level anomaly detection by glocal knowledge distillation. In Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining, Tempe, AZ, USA, 21–25 February 2022; pp. 704–714.
16. Qiu, C.; Kloft, M.; Mandt, S.; Rudolph, M. Raising the bar in graph-level anomaly detection. arXiv 2022, arXiv:2205.13845.
17. Zhang, G.; Yang, Z.; Wu, J.; Yang, J.; Xue, S.; Peng, H.; Su, J.; Zhou, C.; Sheng, Q.Z.; Akoglu, L.; et al. Dual-discriminative graph neural network for imbalanced graph-level anomaly detection. Adv. Neural Inf. Process. Syst. 2022, 35, 24144–24157.
18. Nguyen, H.T.; Liang, P.J.; Akoglu, L. Detecting anomalous graphs in labeled multi-graph databases. ACM Trans. Knowl. Discov. Data 2023, 17, 1–25.
19. Hamilton, W.L.; Ying, Z.; Leskovec, J. Inductive representation learning on large graphs. Adv. Neural Inf. Process. Syst. 2017, 30.
20. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
21. Shi, Y.; Huang, Z.; Feng, S.; Zhong, H.; Wang, W.; Sun, Y. Masked label prediction: Unified message passing model for semi-supervised classification. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI), Montreal, QC, Canada, 19–27 August 2021.
22. Liu, F.T.; Ting, K.M.; Zhou, Z.H. Isolation-based anomaly detection. ACM Trans. Knowl. Discov. Data (TKDD) 2012, 6, 1–39.
23. He, P.; Zhu, J.; Zheng, Z.; Lyu, M.R. Drain: An online log parsing approach with fixed depth tree. In Proceedings of the IEEE International Conference on Web Services (ICWS), Honolulu, HI, USA, 25–30 June 2017; pp. 33–40.
24. Oliner, A.; Stearley, J. What supercomputers say: A study of five system logs. In Proceedings of the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), Edinburgh, UK, 25–28 June 2007; pp. 575–584.
25. Zhu, J.; He, S.; Liu, J.; He, P.; Xie, Q.; Zheng, Z.; Lyu, M.R. Tools and benchmarks for automated log parsing. In Proceedings of the 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), Montreal, QC, Canada, 25–31 May 2019; pp. 121–130.
26. Wang, F.; Bundy, A.; Li, X.; Zhu, R.; Mauceri, S.; Xu, L.; Wang, F.; Pan, Z.J. LEKG: A system for constructing knowledge graphs from log extraction. In Proceedings of the 10th International Joint Conference on Knowledge Graphs; ACM: New York, NY, USA, 2022; pp. 181–185.
27. Ekelhart, A.; Ekaputra, F.J.; Kiesling, E. The slogert framework for automated log knowledge graph construction. In European Semantic Web Conference; Springer International Publishing: Cham, Switzerland, 2021; pp. 631–646.
28. Kurniawan, K.; Ekelhart, A.; Kiesling, E.; Winkler, D.; Quirchmayr, G.; Tjoa, A.M. Virtual knowledge graphs for federated log analysis. In Proceedings of the 16th International Conference on Availability, Reliability and Security, Virtually, 17–20 August 2021; pp. 1–11.
29. Cui, L.; Wu, Y.; Liu, J.; Yang, S.; Zhang, Y. Template-based named entity recognition using BART. arXiv 2021, arXiv:2106.01760.
30. Reimers, N.; Gurevych, I. Sentence-bert: Sentence embeddings using siamese bert-networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP); Association for Computational Linguistics: Kerrville, TX, USA, 2019; pp. 3982–3992.
31. Li, Y.; Yu, X.; Liu, Y.; Chen, H.; Liu, C. Uncertainty-aware bootstrap learning for joint extraction on distantly-supervised data. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Toronto, ON, Canada, 9–14 July 2023; pp. 1349–1358.
32. Tong, Z.; Liang, Y.; Sun, C.; Li, X.; Rosenblum, D.; Lim, A. Digraph inception convolutional networks. In Proceedings of the Advances in Neural Information Processing Systems, Virtually, 6–12 December 2020; pp. 17907–17918.
33. Gilmer, J.; Schoenholz, S.S.; Riley, P.F.; Vinyals, O.; Dahl, G.E. Neural message passing for quantum chemistry. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 1263–1272.
34. Xu, W.; Huang, L.; Fox, A.; Patterson, D.; Jordan, M.I. Detecting large-scale system problems by mining console logs. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, Big Sky, MT, USA, 11–14 October 2009; pp. 117–132.
35. Zhang, C.; Wang, X.; Zhang, H.; Zhang, H.; Han, P. Log sequence anomaly detection based on local information extraction and globally sparse transformer model. IEEE Trans. Netw. Serv. Manag. 2021, 18, 4119–4133.
36. Miao, X.; Liu, Y.; Zhao, H.; Li, C. Distributed online one-class support vector machine for anomaly detection over networks. IEEE Trans. Cybern. 2018, 49, 1475–1488.
37. Zhang, C.; Peng, X.; Sha, C.; Zhang, K.; Fu, Z.; Wu, X.; Lin, Q.; Zhang, D. Deeptralog: Trace-log combined microservice anomaly detection through graph-based deep learning. In Proceedings of the 44th International Conference on Software Engineering, Pittsburgh, PA, USA, 22–24 May 2022; pp. 623–634.
38. Yang, H.; Sun, D.; Wang, Y.; Huang, W. DSGN: Log-based anomaly diagnosis with dynamic semantic gate networks. Inf. Sci. 2024, 680, 121174.

Figure 1. Overview of the model architecture.

Figure 2. Prompt-based few-shot field extraction.

Figure 3. Construction of an attribute-enhanced directed weighted log graph.

Figure 4. ROC AUC and PRC AUC comparison on 3 datasets.

Figure 5. Performance comparison using node labels vs. semantic embeddings.

Figure 6. Model performance under different parameters.

Table 1. Dataset statistics.

Name	#Events	#Graphs	#Anomalies	#Nodes	#Edges
HDFS	48	575,061	16,838	7	20
BGL	1848	69,251	31,374	10	30
Thunderbird	1013	52,160	6,814	16	52

Note: #Events refers to the number of log event templates obtained using the log parser Drain [23]; #Graphs indicates the number of generated graphs; #Anomalies denotes the number of anomalous graphs; #Nodes represents the average number of nodes in the generated graphs; #Edges signifies the average number of edges in the generated graphs.

Table 2. Performance comparison of different models.

Method	HDFS			BGL			Thunderbird
Method	Precision	Recall	F1	Precision	Recall	F1	Precision	Recall	F1
PCA	0.74	0.82	0.78	0.81	0.94	0.87	0.34	0.91	0.49
OCSVM	0.63	0.79	0.70	0.63	0.73	0.68	0.44	0.87	0.58
DeepLog	0.83	0.87	0.85	0.89	0.80	0.84	0.48	0.89	0.62
LogAnomaly	0.86	0.89	0.87	0.91	0.79	0.84	0.51	0.87	0.64
PLELog	0.88	0.93	0.90	0.92	0.96	0.94	0.85	0.94	0.89
DeepTraLog	0.89	0.91	0.90	0.86	0.89	0.87	0.87	0.87	0.87
DSGN	0.88	0.87	0.87	0.79	0.92	0.85	0.86	0.94	0.90
Ours	0.90	0.95	0.92	0.93	0.97	0.95	0.92	0.96	0.93

Table 3. Extracting the two suggested fields.

Prompt Type	$P^{+}$	$P^{-}$
Prompt $P_{1}$	<candidate_span> is a/an <entity_type> entity	<candidate_span> is not a named entity
Prompt $P_{2}$	<entity_type> = <candidate_span>	<candidate_span>=none

Table 4. Performance comparison of rule-based vs. prompt-based n-shot field extraction (%).

Technique	Setting	Precision	Recall	F1-Score
regex		36.48	44.28	40.00
$P_{1}$	1-shot	16.53	59.34	25.86
	5-shot	28.33	74.38	41.03
	10-shot	66.28	85.22	74.57
$P_{2}$	1-shot	17.89	58.14	27.36
	5-shot	28.00	73.76	40.59
	10-shot	64.68	87.82	74.49

Table 5. Time consumption of different models.

	PCA	OCSVM	DeepLog	LogAnomaly	PLELog	DeepTraLog	DSGN	Ours
Training Time	87.36 s	235.79 s	2321.21 s	4420.02 s	1648.37 s	1261.55 s	1753.82 s	1578.52 s
Testing Time	0.61 s	107.13 s	1595.18 s	2625.36 s	894.56 s	796.81 s	1050.07 s	986.33 s

