1. Introduction
Distributed acoustic sensing (DAS) transforms standard optical fiber cables into dense arrays of acoustic sensors [1]. DAS can monitor dynamic events over long distances using backscattered light in fiber-optic cables. Its applicability via vibration-based pattern recognition is well documented across various domains, such as seismic activity detection [2], CO₂ storage monitoring [3], ship trajectory tracking [4], oceanographic observation [5], and pipeline safety monitoring [6,7].
As DAS data become more available, learning-based approaches have increasingly been adopted for classifying events and detecting anomalies. The one-class support vector machine (OCSVM) [5] and convolutional neural network (CNN)-based autoencoder methods [8,9] have demonstrated strong performance on these tasks. Transfer learning [10], few-shot learning [11], and zero-shot learning [12] approaches have been employed to address the limited generalizability caused by unseen event types and the scarcity of labeled data.
Despite these advancements, DAS-based anomaly detection suffers from some critical problems. First, most models treat the DAS signals in one-dimensional (1D) flat form or 2D spectrogram-like formats, but this treatment leads to the loss of the spatial structure of the DAS system, where each channel is mapped to a sensor along the optical fiber. The representations ignore the interchannel relationships, which are crucial for the detection of spatially distributed events.
Second, CNN-based temporal models, including combinations of multiscale CNNs with hidden Markov models [
13] or CNNs with recurrent neural networks (RNNs) [
14], exhibit limitations in anomaly detection with DAS signals. The CNN relies on fixed and translation-invariant filters under the stationarity assumption [
15] and is therefore poorly suited to dynamic spectral shifts or propagating vibrations. Although inherently sequential, RNNs suffer from vanishing gradients and short memory, which limits their ability to model the long-range and nonlinear dependencies common in spatially distributed acoustic signals [
16].
These limitations call for more expressive and robust generative models that cope better with the complex temporal patterns in DAS data. They also highlight the need for a more structured method that accounts for the spatial layout of DAS sensor arrays. This paper proposes a novel method, GraphDiffusion, which integrates a graph neural network (GNN) with a conditional denoising diffusion probabilistic model (DDPM) to address the spatial and temporal modeling challenges in DAS-based anomaly detection.
Recently, GNNs have performed anomaly detection well [
17]. The GNN, notably the graph convolutional network (GCN) [
18,
19,
20], is suitable for representing the spatial topology of DAS systems, where the channels map to physical locations along the fiber. The fiber layout of the DAS system is explicitly represented as a graph, where channels are modeled as nodes and edges are determined by physical proximity. In installations (e.g., perimeter fences) where the optical fibers are looped along the bottom and top, this representation connects neighboring channels, including diagonally or symmetrically aligned ones, that respond together to the same events. The proposed approach embeds 2D coordinates for each channel and forms edges based on the Euclidean distance, including horizontal, vertical, and diagonal connections. Edge weights are calculated as the inverse of the distance, enabling the model to focus on spatially relevant dependencies. This graph construction enables GCNs to discover local and distributed vibrational patterns that are crucial for detecting spatially extended or propagating anomalies.
Although the GNN component extracts spatial correlations within the DAS array, the conditional DDPM offers a mechanism to model the temporal progression of signals. The conditional DDPM discovers the distribution of normal temporal patterns by learning an iterative denoising process, which is trained on a generative task by incrementally corrupting clean signals and learning to invert noise [
21]. During inference, reconstruction after denoising serves as the baseline for comparison, and anomalies can be detected via significant reconstruction errors. The probabilistic and generative properties of DDPMs facilitate the handling of nonstationary and nonlinear dynamics, even those with low signal-to-noise ratios. Anomaly detection via diffusion-based models has performed well in diverse fields (e.g., medical and industrial monitoring) because it detects subtle irregularities and remains robust even with noisy or complex data [
21,
22,
23].
GraphDiffusion addresses the limitations of past approaches by modeling the spatial dependence and temporal dynamics simultaneously. GraphDiffusion obtains the spatial context from the DAS layout via the GNN and employs the resulting embeddings as the spatial condition of the conditional DDPM, so that the temporal denoising procedure can consider the spatial signal patterns observed throughout the DAS array. Experiments on DAS datasets compare GraphDiffusion with other models using two metrics: the area under the curve (AUC) of the F1-score at K different levels (F1K-AUC) and the AUC of the receiver operating characteristic (ROC) at K different levels (ROCK-AUC).
The remainder of this paper is organized as follows.
Section 2 presents a brief review of the anomaly-detection method applied to DAS and multivariate time series (MTS).
Section 3 explains the fundamental principles of DAS, and
Section 4 describes the site configuration, DAS system parameters, and data-collection method. Then,
Section 5 proposes the GraphDiffusion, comprising the GNN and conditional DDPM, and
Section 6 outlines the dataset, implementation details, and evaluation protocol used to validate the anomaly-detection performance of GraphDiffusion. Finally,
Section 7 presents the quantitative results and ablation studies confirming the effectiveness of GraphDiffusion, and
Section 8 concludes the paper.
3. Principles of the DAS System
Using standard optical fiber, DAS detects acoustic disturbances and external vibrations in real time with high resolution. The DAS system can localize and characterize events with fine spatial and temporal granularity by measuring phase changes in Rayleigh backscattered signals along the fiber. This section describes the fundamental operating principles of DAS, including critical system parameters that support robust sensing capabilities for security and infrastructure monitoring, the trade-off of pulse width selection, and the physical interpretation of phase measurements.
3.1. Fundamental DAS Principles
The DAS system operates on the principle of phase-sensitive optical time-domain reflectometry (Φ-OTDR). A narrow-linewidth laser pulse is launched into an optical fiber, and the Rayleigh backscattered light returning from each location along the fiber is continuously monitored. When an external acoustic or vibrational disturbance is applied to the fiber, it causes local strain or deformation that modulates the phase and intensity of the backscattered signal. Analyzing these backscattered signals in the time domain permits detecting and localizing external events along the entire length of the fiber in real time.
3.2. System Parameters and Mathematical Relationships
The performance characteristics of DAS systems are governed by several parameters that determine the spatial and temporal capabilities of the system. The spatial sampling distance $\Delta z$ along the fiber is fundamentally determined by the sampling rate of the data-acquisition system and is expressed as follows:
$$\Delta z = \frac{c_0\,\Delta t}{2n}, \tag{1}$$
where $c_0$ denotes the speed of light in a vacuum, $\Delta t$ represents the sampling interval, and $n$ indicates the refractive index of the optical fiber core (typically about 1.468 for standard single-mode fiber). The factor of 2 accounts for the round-trip propagation of light in the fiber. The temporal sampling distance, $\Delta T$, is directly related to the pulse repetition rate $f_p$ of the interrogating laser.
However, the spatial resolution of the system is primarily determined by the gauge length $L_G$, corresponding to the physical length of fiber over which the acoustic signal is integrated. The gauge length is intrinsically linked to the pulse width $\tau$ of the interrogating laser pulse:
$$L_G = \frac{c\,\tau}{2}, \tag{2}$$
where $c$ denotes the speed of light in optical fiber (typically about $2 \times 10^{8}$ m/s).
3.3. Pulse-Width Trade-Offs
This relationship establishes a fundamental trade-off between spatial resolution and system sensitivity. Shorter pulse widths result in improved spatial resolution by reducing the gauge length, enabling more precise localization of acoustic events. However, because the shorter gauge length integrates acoustic signals over a smaller fiber section, potentially lowering the signal-to-noise ratio, this improvement comes at the cost of decreased sensitivity. On the other hand, longer pulse widths increase the gauge length, improving sensitivity through signal integration across a larger fiber segment but lowering spatial resolution.
The application requirements must be carefully considered when choosing the ideal pulse width, striking a balance between the need for accurate event localization and sufficient detection sensitivity. For most security and monitoring applications, typical DAS systems use pulse widths between 10 and 100 ns, which correspond to gauge lengths of roughly 1 to 10 m. This offers a workable compromise between resolution and sensitivity.
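As a quick numerical check of the relationships in Section 3.2, the following Python sketch evaluates the spatial sampling distance (vacuum speed of light times sampling interval, divided by twice the refractive index) and the gauge length (in-fiber speed of light times pulse width, divided by two). The function names and the 100 Msps sampling rate in the usage note are illustrative assumptions, not values taken from Table 1.

```python
C_VACUUM = 299_792_458.0  # speed of light in a vacuum, m/s
N_CORE = 1.468            # refractive index quoted in the text for single-mode fiber


def spatial_sampling_distance(sampling_rate_hz, n=N_CORE):
    """Distance along the fiber between consecutive samples, in meters."""
    sampling_interval = 1.0 / sampling_rate_hz
    # The factor of 2 accounts for the round trip of the backscattered light.
    return C_VACUUM * sampling_interval / (2.0 * n)


def gauge_length(pulse_width_s, n=N_CORE):
    """Gauge length in meters for a given pulse width in seconds."""
    c_fiber = C_VACUUM / n  # speed of light inside the fiber core
    return c_fiber * pulse_width_s / 2.0


if __name__ == "__main__":
    for width_ns in (10, 50, 100):
        print(width_ns, "ns ->", round(gauge_length(width_ns * 1e-9), 2), "m")
```

Under these assumptions, a hypothetical 100 Msps digitizer gives a channel spacing of about 1 m, and pulse widths of 10 and 100 ns give gauge lengths of about 1 and 10 m, which matches the range stated above.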
3.4. Phase Measurement and Physical Interpretation
The phase ($\varphi$), which is measured in radians and represents the optical path difference caused by external disturbances such as vibration waves and acoustic sound pressure, is the basic quantity measured in DAS systems. This phase measurement contains important physical information about the mechanical deformation of the optical fiber and the acoustic field. The relationship between phase change and fiber strain is governed by the photo-elastic effect, and the phase change is expressed as follows:
$$\Delta\varphi(x, t) = \frac{4\pi n L_G}{\lambda}\,\varepsilon(x, t), \tag{3}$$
where $\lambda$ represents the interrogating laser wavelength, $n$ indicates the refractive index of the optical fiber core, and $\varepsilon(x, t)$ represents the longitudinal strain along the fiber at spatial position $x$ and time $t$. The gauge length $L_G$ is computed using Equation (2). This relationship shows that phase measurements directly correlate with nano-strain levels, which makes it possible to detect mechanical deformations as small as one nano-strain ($1 \times 10^{-9}$).
The conversion from phase measurements to acoustic pressure relies on the relationship between the fiber strain and the applied acoustic field. When acoustic waves reach a fiber-optic cable, the induced strain is linked to the acoustic pressure through the mechanical properties of the cable and its coupling efficiency. The acoustic pressure $P$ can be derived from the measured phase as follows:
$$P(x, t) = K\,\varepsilon(x, t), \tag{4}$$
where $K$ represents the elastic coefficient of the fiber, and $\varepsilon(x, t)$ denotes the longitudinal strain as described in Equation (3). This dual relationship among the phase, strain, and acoustic pressure lets DAS systems work as both mechanical strain sensors and acoustic pressure detectors. Because of this connection, these systems can be used for a wide range of purposes, such as structural health monitoring, earthquake detection, and acoustic surveillance.
4. Data Collection
This work established a large DAS-based experimental setup to simulate real-world security threats and gather data to test the proposed anomaly-detection framework. This section describes the site configuration, the DAS system settings, and the data-collection procedure. To improve detection coverage, a dedicated test environment was set up using a dual-height fiber-optic installation strategy. Different physical activities were then performed in a planned way to create representative datasets.
4.1. Site Configuration and Installation
The experimental deployment was conducted at a designated test site with a perimeter security fence system integrated with DAS technology.
Figure 1 depicts the experimental environment, optical fiber, and DAS interrogator installation. The security fence was constructed in a U-shaped configuration about 2 m in height, offering a controlled environment for intrusion-detection testing. The optical fiber cable was routed along the perimeter of the fence to form a distributed sensing network that could detect and localize intrusion attempts. The DAS interrogator unit was housed in a weatherproof container at the entrance to the site, which protected the sensitive optical equipment while still allowing easy access for system monitoring and maintenance.
The fiber-optic cable used in this study is a standard single-mode telecommunications fiber with improved mechanical protection suitable for outdoor installation. To improve detection reliability and offer redundant sensing coverage, a dual-fiber installation strategy was used; 310 sensing channels are supported throughout the installation. The optical fibers were installed at two height levels: 30 cm and 160 cm above the ground. The fiber cable ascends along one side of the fence posts to the upper mounting point and then descends along the opposite side, creating a continuous sensing loop that covers both the lower and upper fence sections. This dual-height configuration was designed using a round-trip topology.
Cable ties were employed to securely attach the cables to the fence structure, ensuring contact between the fiber and the fence framework to maximize vibration transmission. This round-trip installation gives the system better spatial coverage and allows the two height levels to be compared, which improves its ability to distinguish between types of intrusion attempts, such as climbing and cutting. The total monitored perimeter length was about 300 m, and each fiber strand covered the entire distance, which doubled the sensing density along the fence line.
4.2. DAS Operational Parameters
The phase-sensitive DAS unit used in this study was produced by the Korea Photonics Technology Institute.
Table 1 summarizes the operational parameters, where Msps denotes mega samples per second. The parameters were chosen to optimize the detection performance for perimeter security applications, providing sufficient spatial resolution for event localization while maintaining sufficient temporal resolution for prompt detection.
4.3. Scenario Design and Data Acquisition
To simulate realistic security threats, a variety of physical activities were methodically carried out during several sessions from 2023 to 2024. The dataset used in this paper was gathered between 4 September and 6 September 2023. To evaluate the detection sensitivity and pattern discriminability of the DAS system, the experiments were designed to produce distinct vibrational and acoustic signatures. The dataset contains recordings of human movement, digging, ladder intrusions, and fence vibration.
To simulate physical tampering, the fence was deliberately shaken and struck to produce a fence vibration scenario.
Table 2 describes each event scenario and the data associated with the scenario during data collection. In the digging scenario, digging actions were performed using a shovel near the fence line. Ladder intrusion events were simulated using a ladder to climb up the fence and capture signals related to ladder placement, elevation, and descent. Human movement activities included walking and running experiments conducted under various conditions, and sensitivity tests were performed at distances of between 1 and 5 m around the fence to evaluate detection performance under various proximity conditions.
In this study, “walking” and “running” are defined as normal events, whereas “fence impact,” “fence shaking,” “digging,” and “ladder intrusions” are treated as anomalies. From the normal portion of the data, 398,900 traces were employed as the training dataset, and the testing and validation datasets each comprise 22,300 normal traces and 24,000 anomalous traces, ensuring a balanced and realistic performance evaluation. The DAS dataset is represented as an MTS $\mathcal{X} \in \mathbb{R}^{T \times C}$, where $T$ denotes the number of traces and $C = 310$ is the number of distributed sensing channels.
5. GraphDiffusion Methods
GraphDiffusion combines a GNN with a conditional DDPM to jointly learn spatial and temporal patterns in DAS signals.
Figure 2 shows the overall structure of the proposed GraphDiffusion. In Figure 2, the GNN extracts a spatial embedding $H$ among DAS channels, encoding how vibrations propagate across the fiber layout. In parallel, a diffusion process perturbs the input $X_0$ with Gaussian noise to learn the temporal distribution of normal signals. The spatial embeddings from the GNN and the temporally corrupted signals $X_n$ are concatenated and passed into a denoising U-Net, which progressively reconstructs a clean signal, $\hat{X}_0$. The hierarchical structure of the U-Net allows it to combine local details with the global context while being conditioned on the spatial information from the GNN. This architecture enables GraphDiffusion to capture complex spatiotemporal dependencies for generative anomaly detection in DAS systems.
5.1. DAS Signal Preprocessing
For raw DAS data $\mathcal{X} \in \mathbb{R}^{T \times C}$, $T$ denotes the number of traces, and $C$ is the number of channels (sensors) along the fiber. These sequences are divided into fixed-length windows of size $W$ with a stride $S$. During training, a sliding-window technique with $S < W$ creates overlapping segments, which increases the diversity of training samples. During validation and testing, a nonoverlapping windowing strategy is employed by setting the stride equal to the window size (i.e., $S = W$) to ensure a consistent and nonredundant evaluation protocol. Thus, each windowed segment forms an input tensor $X \in \mathbb{R}^{W \times C}$.
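The windowing step can be sketched in a few lines of Python. This is a minimal illustration rather than the preprocessing code used in this work: the function names are hypothetical, the series is a plain list of per-trace channel vectors, and trailing traces that do not fill a complete window are assumed to be dropped.

```python
def window_starts(num_traces, window, stride):
    """Start indices of all complete windows of length `window`."""
    if num_traces < window:
        return []
    return list(range(0, num_traces - window + 1, stride))


def make_windows(series, window, stride):
    """Split a list of traces (each a list of C channel values) into windows."""
    return [series[s:s + window]
            for s in window_starts(len(series), window, stride)]


if __name__ == "__main__":
    # Toy series with T = 10 traces and C = 3 channels.
    series = [[float(t)] * 3 for t in range(10)]
    train = make_windows(series, window=4, stride=2)  # S < W: overlapping
    test = make_windows(series, window=4, stride=4)   # S = W: nonoverlapping
    print(len(train), len(test))
```

With S < W, consecutive windows share W − S traces; with S = W, every trace appears in at most one window, so evaluation scores are not counted twice.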
5.2. GNN-Based Spatial Representation Learning
This section details the graph-construction process that encodes the physical topology of the fiber-optic sensor placement and the design of the GNN employed in the proposed GraphDiffusion to model the spatial structure of DAS signals. Figure 3 illustrates the structure of the GNN, where a windowed segment $X$ is processed by a two-layer GCN with the rectified linear unit (ReLU) activation function [37] applied between layers. The resulting output is the spatial feature map $S$. These components allow the proposed method to capture neighborhood dependencies and spatial correlations that are essential for robust anomaly detection.
5.2.1. Graph Construction
To model the spatial dependencies in the DAS system, each window of DAS data is converted into an undirected graph $G = (V, E)$, where each node $v_i \in V$ represents a DAS sensor. The graph structure was explicitly derived from the actual dual-fiber installation described in Section 4.1, where two parallel optical fibers were installed at different heights along the perimeter fence. Figure 4 depicts the graph structure as a two-row topology that imitates the DAS installation, enabling spatial dependency modeling, including horizontal, vertical, and diagonal signal propagation. Each sensor is assigned a virtual 2D coordinate to represent the two-row topology. As shown in Figure 4a, the DAS sensors were deployed in a dual-fiber configuration along the bottom and top of the fence. As shown in Figure 4b, the top-half channels, indexed from 0 to $N - 1$, are assigned uniformly spaced horizontal positions at a vertical level of $y = 0$ to mimic the upper cable run. The bottom-half channels, indexed from $N$ to $2N - 1$, are assigned to $y = 1$ in a horizontally mirrored order to replicate the inverted orientation of the lower cable run. This configuration is horizontally mirrored but vertically aligned. Every channel $i$ has a coordinate $p_i = (x_i, y_i)$, where $y_i$ indicates the vertical position and $x_i$ indicates the horizontal position. An adjacency matrix $A$ with a fixed spatial distance threshold $d$ is created to specify the graph topology. Edges in this adjacency matrix are created between any pair of nodes $(i, j)$ if their Euclidean distance is less than or equal to $d$:
$$A_{ij} = \begin{cases} 1, & \lVert p_i - p_j \rVert_2 \le d, \\ 0, & \text{otherwise}. \end{cases} \tag{5}$$
Each edge is assigned a weight inversely proportional to the Euclidean distance, $w_{ij} = 1 / (\lVert p_i - p_j \rVert_2 + \delta)$, where $\delta$ is a small constant to prevent division by zero. The resulting adjacency matrix encodes the physical geometry of DAS channels, enabling the GCN layers to learn symmetric and local spatial patterns efficiently, including distributed or subtle propagations. Self-loops are added to ensure that each node’s unique features are included during message passing and to preserve per-channel information. This structure is implemented as a matrix that specifies the source–target edge pairs and is encoded using a static edge index that is constant across batches.
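The two-row graph construction can be illustrated with the following Python sketch. It is not the implementation used in this work: the function names, the unit spacing between neighboring channels, the value of the small constant, and the self-loop weight of 1.0 are all assumptions made for illustration.

```python
import math


def two_row_coordinates(n_per_row):
    """Virtual (x, y) coordinates: top row at y = 0, mirrored bottom row at y = 1."""
    top = [(float(i), 0.0) for i in range(n_per_row)]
    # The bottom-half channels run back in the opposite direction (round trip).
    bottom = [(float(n_per_row - 1 - i), 1.0) for i in range(n_per_row)]
    return top + bottom


def build_edges(coords, d, delta=1e-6):
    """Map (source, target) pairs to weights for all node pairs within distance d."""
    edges = {}
    for i, (xi, yi) in enumerate(coords):
        for j, (xj, yj) in enumerate(coords):
            if i == j:
                edges[(i, j)] = 1.0  # assumed self-loop weight
                continue
            dist = math.hypot(xi - xj, yi - yj)
            if dist <= d:
                edges[(i, j)] = 1.0 / (dist + delta)  # inverse-distance weight
    return edges


if __name__ == "__main__":
    coords = two_row_coordinates(3)   # toy layout with 6 channels
    edges = build_edges(coords, d=1.5)
    print(sorted(j for (i, j) in edges if i == 0))
```

In this toy layout, channel 0 (top left) connects to its horizontal neighbor, to the channel directly below it, and to the diagonal neighbor in the bottom row, but not to channels more than 1.5 units away. The keys of the dictionary correspond to the source–target pairs of a static edge index.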
5.2.2. GNN Architecture in the Proposed GraphDiffusion
A two-layer GCN is employed for graph-based spatial feature extraction [19]. In the first layer, the input $X$ is passed through a GCN layer followed by a ReLU activation function [37], projecting it into a hidden representation. The GCN applies a convolution over the shared graph structure for every batch sample, and the output features are reshaped to match the initial input dimensions. Formally, the spatial feature map $S$ is computed as follows:
$$S = \mathrm{GCN}_2\big(\mathrm{ReLU}(\mathrm{GCN}_1(X, A)),\, A\big). \tag{6}$$
The GCN output $S$ is combined with the original input $X$ via an elementwise sum operation to create the final spatial embedding $H$:
$$H = S + X. \tag{7}$$
Interchannel dependencies resulting from the actual sensor arrangement and detected signal correlations are encoded in these spatial embeddings. The temporal denoising procedure can consider spatial signal patterns observed throughout the DAS segment using these embeddings as conditions for the conditional DDPM.
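For reference, one common way to instantiate the two GCN layers is the propagation rule of Kipf and Welling. The sketch below assumes that rule, node features given by the transposed window $X^{\top} \in \mathbb{R}^{C \times W}$ (one row per channel), and hypothetical weight matrices $\Theta^{(0)} \in \mathbb{R}^{W \times W/2}$ and $\Theta^{(1)} \in \mathbb{R}^{W/2 \times W}$; whether this exact normalization is used in GraphDiffusion is an assumption.

```latex
\tilde{A} = A + I, \qquad
\tilde{D}_{ii} = \sum_{j} \tilde{A}_{ij}, \qquad
\hat{A} = \tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}
\\[4pt]
S^{\top} = \hat{A}\,\mathrm{ReLU}\!\left(\hat{A}\,X^{\top}\,\Theta^{(0)}\right)\Theta^{(1)}, \qquad
H = S + X
```

Because the second layer maps the hidden features back to $W$ columns, $S$ has the same shape as $X$, which is what makes the elementwise residual sum well defined.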
5.3. Diffusion-Based Temporal Modeling
We used a conditional DDPM [38] that incorporates a U-Net-based denoising network $\epsilon_\theta$ [39] to model the temporal structure of DAS signals. This conditional design allows the spatial context obtained from the DAS layout to guide the denoising process. We applied the graph-based spatial embeddings produced by the GNN to condition the diffusion model. This module design can simultaneously encode high-level contextual information across the signal and capture local temporal dependencies [21,40,41].
5.3.1. Denoising U-Net Architecture
The stepwise denoising procedure, which is essential for DDPMs, is learned using U-Net.
Figure 5 illustrates the architecture of the U-Net, which applies a hierarchical structure of downsampling and upsampling layers with feature multipliers of 1, 2, and 4, enhanced with residual blocks and attention mechanisms, to operate on the noisy input and recover the original clean signal. Reshape-based, size-aware convolutional layers implement the downsampling and upsampling operations to guarantee compatibility with nonsquare DAS inputs $X \in \mathbb{R}^{W \times C}$. Features from matching downsampling blocks are concatenated in the upsampling path using skip connections. Timestep conditioning is achieved by applying a sinusoidal positional embedding followed by a two-layer multilayer perceptron, whose output modulates the residual blocks via feature-wise linear modulation-style scale and shift parameters. Additionally, the model allows for self-conditioning, promoting improved temporal consistency in predictions by concatenating a previously denoised output to the input during training.
5.3.2. Diffusion Process
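Gaussian noise is added to a clean window $X_0$ over $N$ time steps during training via the diffusion process:
$$X_n = \sqrt{\bar{\alpha}_n}\,X_0 + \sqrt{1 - \bar{\alpha}_n}\,\epsilon, \qquad \epsilon \sim \mathcal{N}(0, I), \tag{8}$$
where $\bar{\alpha}_n$ represents the cumulative product of noise schedule coefficients, defined as $\bar{\alpha}_n = \prod_{i=1}^{n} \alpha_i$, where $\alpha_i = 1 - \beta_i$, and $\beta_i$ controls the noise level at each timestep. Intuitively, $\sqrt{\bar{\alpha}_n}$ indicates the proportion of the original signal $X_0$ preserved at step $n$, and $\sqrt{1 - \bar{\alpha}_n}$ reflects the amount of added noise. As $n$ increases, $\bar{\alpha}_n$ decreases, gradually replacing the signal with pure noise.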
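The forward (noising) process can be sketched in plain Python as follows. The schedule endpoints used here (1e-4 and 0.02) are commonly used DDPM defaults and are assumptions for illustration only, as are the function names; a flat list of values stands in for a window.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start, beta_end):
    """Noise levels that increase linearly from beta_start to beta_end."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alpha_bar(betas):
    """Running product of (1 - beta_i), one value per diffusion step."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def q_sample(x0, alpha_bar_n, rng):
    """Noisy version of x0: sqrt(a_bar) * x0 + sqrt(1 - a_bar) * Gaussian noise."""
    keep = math.sqrt(alpha_bar_n)
    noise_scale = math.sqrt(1.0 - alpha_bar_n)
    return [keep * v + noise_scale * rng.gauss(0.0, 1.0) for v in x0]


if __name__ == "__main__":
    betas = linear_beta_schedule(100, 1e-4, 0.02)  # assumed endpoints
    alpha_bar = cumulative_alpha_bar(betas)
    rng = random.Random(0)
    print(alpha_bar[0], alpha_bar[-1])
    print(q_sample([1.0, -1.0, 0.5], alpha_bar[-1], rng))
```

Because each factor $1 - \beta_i$ is smaller than one, the cumulative product decreases monotonically, so later steps keep less of the original signal and add more noise.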
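Given the noisy input $X_n$ in Equation (8), the diffusion timestep $n$, and the GNN-derived condition $H$ in Equation (7), the model is trained to reverse this process and predict the added noise $\epsilon$ while minimizing the following loss:
$$\mathcal{L} = \mathbb{E}_{X_0,\,\epsilon,\,n}\left[\lVert \epsilon - \epsilon_\theta(X_n, n, H) \rVert_2^2\right], \tag{9}$$
where $\epsilon_\theta$ denotes the denoising network.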
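At inference, we added noise to an input $X_0$ and then denoised the input back to $\hat{X}_0$. We iteratively denoised the previous input $N$ times, starting from $n = N$. The denoised $X_{n-1}$ is calculated as follows:
$$X_{n-1} = \frac{1}{\sqrt{\alpha_n}}\left(X_n - \frac{1 - \alpha_n}{\sqrt{1 - \bar{\alpha}_n}}\,\epsilon_\theta(X_n, n, H)\right) + \sigma_n z, \tag{10}$$
where $\sigma_n$ is the standard deviation of the noise injected at step $n$, and $z \sim \mathcal{N}(0, I)$ if $n > 1$, otherwise $z = 0$. The anomaly score at trace $w$ is calculated using the reconstruction error between the original input signal $X_0$ and the denoised signal $\hat{X}_0 = [\hat{x}_1, \ldots, \hat{x}_W]$, where each trace element $x_w$ and $\hat{x}_w$ is a feature vector:
$$s_w = \lVert x_w - \hat{x}_w \rVert_2^2. \tag{11}$$
This anomaly score is used to evaluate the anomaly-detection models with the evaluation metrics in Section 6.4.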
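A per-trace reconstruction-error score can be computed as in the following sketch. It assumes the squared Euclidean distance between each original trace and its reconstruction as the error measure; the function name and the toy values are hypothetical, and the denoised window would come from the trained model rather than being typed in by hand.

```python
def trace_anomaly_scores(window, reconstruction):
    """Squared L2 reconstruction error for each trace (row) of a window."""
    return [
        sum((a - b) ** 2 for a, b in zip(trace, rec_trace))
        for trace, rec_trace in zip(window, reconstruction)
    ]


if __name__ == "__main__":
    window = [[0.0, 0.0], [1.0, 1.0]]          # W = 2 traces, C = 2 channels
    reconstruction = [[0.0, 0.0], [0.0, 3.0]]  # stand-in for the denoised window
    print(trace_anomaly_scores(window, reconstruction))
```

A trace that the model reconstructs well receives a score near zero, whereas a trace that deviates from the learned normal patterns receives a large score and is flagged once it exceeds the chosen threshold.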
This design allows the model to learn fine-grained temporal patterns and global contextual dependencies via the resolution hierarchy of the U-Net. By conditioning the denoising process on the spatial embeddings encoded by the GNN, the model can better generalize to complex and subtle anomalies in DAS data.
6. Experimental Setup
This section presents the experimental setup, which uses a proprietary DAS dataset collected in realistic infrastructure monitoring scenarios to verify the effectiveness of the GraphDiffusion method. The purpose of the experiments is to assess the model’s capacity to identify subtle, spatially correlated anomalies. This section describes the dataset, implementation settings, comparative models, and evaluation metrics to ensure reproducibility and a fair comparison.
6.1. Dataset
A proprietary DAS dataset for anomaly detection in MTS is employed for the experiments. The dataset comprises a time series of signals with $C = 310$ channels (i.e., fiber-sensing channels), with $T_{\text{train}}$ = 398,900 traces in the training set and $T_{\text{val}} = T_{\text{test}}$ = 46,300 traces in each of the validation and test sets. There is no overlap among the training, validation, and test sets.
Let $\mathcal{X} \in \mathbb{R}^{T \times C}$ represent an MTS with $C$ dimensions and $T$ traces to formalize the data. The training set is divided into overlapping windows of length $W = 300$ using a stride of $S = 150$. Nonoverlapping windows are employed for the test and validation sets, and the majority vote of the included traces determines each window label. The 24,000 anomalous and 22,300 normal traces in each set allow a balanced and realistic performance evaluation.
6.2. Implementation Details
The GNN in GraphDiffusion has a hidden-layer size that is half of the window size. A linear noise schedule increases from  to , with a diffusion step count of 100. Only normal DAS data are used for training, and the Adam optimizer is employed with a batch size of 32 and a learning rate of $1 \times 10^{-3}$. A learning rate scheduler increases the learning rate linearly for the first 10% of training steps and then decreases it linearly afterward. All experiments were run on three Nvidia A5000 GPUs (NVIDIA Corporation, Santa Clara, CA, USA) with PyTorch version 1.13.1. The average training time per epoch of the proposed model is 204 s, and the model was trained for 40 epochs. At inference, the model generates 300 traces in 0.0004 s (≈1.3 µs per trace).
6.3. Comparative Models
This work compares the proposed model to a set of comparative models to assess its effectiveness. The pipelines from previous studies on anomaly detection in DAS were reimplemented, including an anomaly-detection model using OCSVM [41] and a generative learning model with an autoencoder [9].
Furthermore, six recent generative models were included: AnomalyTransformer [28], TranAD [29], iTransformer [30], MAAT [31], DiffusionAE [21], and the graph deviation network (GDN) [32]. For anomaly scoring, TranAD uses a two-stage transformer architecture that combines reconstruction and prediction goals. AnomalyTransformer measures association discrepancies in temporal attention to quantify anomaly likelihoods. For long-term series, the iTransformer uses an inductive bias and a shifted windowing mechanism to improve scalability and efficiency. Mamba-based state space modeling is integrated into MAAT to enhance anomaly localization and temporal representation. DiffusionAE learns the manifold of normal signals for reconstruction-based anomaly detection by combining an autoencoder structure with a denoising diffusion probabilistic model. The GDN provides a structure-aware method that is well suited to high-dimensional correlated data by modeling MTS as graph structures and identifying anomalies by learning to measure deviations from graph-based normal patterns.
6.4. Evaluation Metrics
This work employs two robust metrics recently proposed for time series anomaly detection to assess performance: the area under the curve (AUC) of the F1-score at $K$ different levels (F1K-AUC) and the AUC of the receiver operating characteristic (ROC) at $K$ different levels (ROCK-AUC) [21]. Prior methods [42,43] often overestimate performance by considering a segment as detected if even a single anomalous point is correctly predicted. In contrast, F1K-AUC is obtained by computing the F1-score across $K \in \{0, 1, \ldots, 100\}$ and calculating the area under the resulting curve, where a segment is considered detected only if at least $K$% of its anomalous points are identified. This work evaluates the anomaly score threshold $\delta$ at 50 values in $[0, s_{\max}]$, where $s_{\max}$ denotes the maximum anomaly score across all traces in the validation set. The threshold $\delta$ that results in the highest F1K-AUC on the validation set is applied for the evaluation on the test set. This work also reports ROCK-AUC to remove the reliance on a threshold $\delta$. ROCK-AUC is calculated by measuring the true positive rates and false positive rates across thresholds $\delta$ and $K$ values. Reporting the resulting area under this 2D surface as ROCK-AUC enables a threshold-independent comparison of the models.
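The K%-based segment adjustment behind F1K-AUC can be sketched as follows. This is an illustrative implementation under stated assumptions, not the evaluation code used in this work: the function names are hypothetical, K ranges over the integers 0 to 100, a segment needs at least one detected point to count even when K = 0, and the area is computed with the trapezoidal rule and normalized to [0, 1].

```python
def anomalous_segments(labels):
    """(start, end) pairs (end exclusive) of contiguous runs of label 1."""
    segments, start = [], None
    for i, y in enumerate(labels):
        if y == 1 and start is None:
            start = i
        elif y != 1 and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(labels)))
    return segments


def adjust_predictions(labels, preds, k_percent):
    """Mark a whole segment as detected if at least K% of its points were flagged."""
    adjusted = list(preds)
    for start, end in anomalous_segments(labels):
        hits = sum(preds[start:end])
        if hits > 0 and 100.0 * hits / (end - start) >= k_percent:
            for i in range(start, end):
                adjusted[i] = 1
    return adjusted


def f1_score(labels, preds):
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1k_auc(labels, scores, threshold, k_values=tuple(range(101))):
    """Normalized area under the F1-versus-K curve for one score threshold."""
    preds = [1 if s > threshold else 0 for s in scores]
    f1s = [f1_score(labels, adjust_predictions(labels, preds, k)) for k in k_values]
    area = sum((f1s[i] + f1s[i + 1]) / 2 * (k_values[i + 1] - k_values[i])
               for i in range(len(k_values) - 1))
    return area / (k_values[-1] - k_values[0])


if __name__ == "__main__":
    labels = [0, 1, 1, 1, 1, 0]
    scores = [0.0, 0.9, 0.1, 0.1, 0.1, 0.0]
    print(f1k_auc(labels, scores, threshold=0.5))
```

In the toy example, only one of the four anomalous points is flagged, so the segment counts as detected for K up to 25 and not for larger K. The resulting value therefore lies between the unadjusted point-wise F1-score and the fully adjusted one, which is the overestimation the metric is designed to temper.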
7. Performance Evaluation
This section presents the quantitative results of the evaluation, highlighting how GraphDiffusion performs against traditional machine learning, convolutional autoencoders, and transformer- and diffusion-based models. The ablation study evaluates the contribution of the graph topology selection and sliding window parameters. These results demonstrate the superiority of the proposed framework in capturing the spatial and temporal patterns that are critical for DAS anomaly detection.
7.1. Comparison with Conventional Anomaly-Detection Models
This work compares the proposed GraphDiffusion with a wide range of anomaly-detection models, including diffusion-based generative models, convolutional neural networks, transformer-based architectures, and conventional machine learning methods, to assess its effectiveness. Two evaluation metrics, F1K-AUC and ROCK-AUC, evaluate the robustness of anomaly detection under various thresholds and adjustment conditions. The real-time factor (RTF) is also reported to evaluate computational efficiency relative to real-time data rates.
Table 3 summarizes the performance of all comparative models. With an F1K-AUC of 70.2%, an ROCK-AUC of 73.8%, and an RTF of 0.307, OCSVM performs moderately well. Its comparatively limited performance on this DAS dataset suggests that it cannot accurately model the intricate temporal and spatial dependencies in distributed sensing signals. The CNN-based autoencoder reports an F1K-AUC of 79.3%, an ROCK-AUC of 62.7%, and an RTF of 0.171. Although it reconstructs temporal patterns well, its comparatively poor performance might be due to the difficulty of handling distributed or spatially correlated anomalies across DAS channels.
Graph-based deep learning techniques yield measurable improvements. Using spatiotemporal GNNs, GDN maintains an RTF of 0.776 while producing an F1K-AUC of 86.9% and an ROCK-AUC of 73.9%. The GDN achieves the second-best performance after the proposed GraphDiffusion, highlighting the strength of GNNs in learning spatial and temporal representations in DAS. Transformer-based models yield mixed outcomes. With an F1K-AUC of 77.8% and an ROCK-AUC of 68.3%, TranAD leads this category, outperforming iTransformer, MAAT, and AnomalyTransformer. However, all these transformer variants still lag behind the diffusion-driven and graph-based approaches, presumably due to their limited ability to generalize to spatially distributed anomalies and low signal-to-noise conditions, which are typical of DAS.
With an F1K-AUC of 82.0% and an ROCK-AUC of 70.7%, DiffusionAE, which combines a DDPM with an autoencoder backbone, outperforms all transformer-based models in terms of detection capability. This finding illustrates how denoising-based generative modeling can be applied to detect anomalous deviations and capture the temporal distribution of normal signals. However, its RTF of 1.382 is the highest among all evaluated models, making it the least computationally efficient.
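As an aid to reading the RTF values, the following minimal sketch assumes the common definition of the real-time factor: processing time divided by the duration of the signal being processed. This definition, the function names, and the timing values are assumptions for illustration and are not taken from this section. Under that assumption, an RTF below 1 means the model keeps up with data acquisition, and an RTF above 1 means it falls behind.

```python
def real_time_factor(processing_seconds: float, signal_seconds: float) -> float:
    """Assumed definition: RTF = processing time / duration of the input signal."""
    if signal_seconds <= 0:
        raise ValueError("signal_seconds must be positive")
    return processing_seconds / signal_seconds


def keeps_up_with_acquisition(rtf: float) -> bool:
    """Under the assumed definition, RTF < 1 means faster than real time."""
    return rtf < 1.0


# Hypothetical timing: 8.9 s of computation for 10 s of recorded DAS data.
example_rtf = real_time_factor(8.9, 10.0)
example_ok = keeps_up_with_acquisition(example_rtf)
```

If the assumed definition holds, the reported RTF of 1.382 would correspond to roughly 1.4 s of computation per second of recorded signal.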
With an F1K-AUC of 98.2%, an ROCK-AUC of 98.0%, and an RTF of 0.890, the proposed GraphDiffusion model provides the best performance. Compared with DiffusionAE, GraphDiffusion achieves a 35.6% lower RTF and improves the F1K-AUC and ROCK-AUC by 16.2 and 27.3 percentage points, respectively. The significant performance margin emphasizes the importance of including the spatial structure in temporal generative models. In contrast to DiffusionAE, GraphDiffusion applies a graph structure to encode the physical layout of DAS channels and employs GCNs to learn spatial dependencies. Because it combines diffusion-based temporal modeling with spatially aware representation learning, the proposed model can identify subtle and spatially distributed anomalies that are often missed by models that process each time series separately.
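The comparison with DiffusionAE can be reproduced from the values quoted in the text. The short sketch below shows that the two accuracy gains are absolute differences (percentage points), whereas the RTF figure is a relative reduction. The helper names are illustrative only.

```python
# Values quoted in the text for DiffusionAE and GraphDiffusion.
diffusion_ae = {"f1k_auc": 82.0, "rock_auc": 70.7, "rtf": 1.382}
graph_diffusion = {"f1k_auc": 98.2, "rock_auc": 98.0, "rtf": 0.890}


def point_gain(new: float, old: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return round(new - old, 1)


def relative_reduction(new: float, old: float) -> float:
    """Relative decrease of `new` with respect to `old`, in percent."""
    return round(100.0 * (old - new) / old, 1)


f1_gain = point_gain(graph_diffusion["f1k_auc"], diffusion_ae["f1k_auc"])
roc_gain = point_gain(graph_diffusion["rock_auc"], diffusion_ae["rock_auc"])
rtf_cut = relative_reduction(graph_diffusion["rtf"], diffusion_ae["rtf"])
print(f1_gain, roc_gain, rtf_cut)  # 16.2 27.3 35.6
```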
7.2. Effect of the Graph Topology and Distance Threshold in the Proposed GraphDiffusion
We carried out an ablation study across several configurations to evaluate the influence of the graph topology and the spatial distance threshold, d, used in edge construction on the performance of GraphDiffusion. Table 4 summarizes the results of the ablation study under different thresholds and graph topologies. We evaluated two graph construction strategies with different values of d: an index-based topology (called “normal”) and a 2D coordinate-based topology (called the “two-row topology”).
In the index-based topology, an edge is created between two channels if the absolute difference between their indices is less than or equal to the threshold d. As shown in Table 4, both F1K-AUC and ROCK-AUC increase as d rises from 1 to 3, peaking at 97.2% and 89.8%, respectively, at d = 3. However, F1K-AUC decreases slightly to 96.8% at d = 4 and continues to decrease at d = 5, indicating that oversmoothing may result from excessive neighborhood aggregation.
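The index-based edge rule can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function name `index_adjacency` is hypothetical, and self-loops are excluded here as an assumption because the text does not state how they are handled.

```python
def index_adjacency(num_channels: int, d: int) -> list[list[int]]:
    """Connect distinct channels i and j when |i - j| <= d (no self-loops, by assumption)."""
    adj = [[0] * num_channels for _ in range(num_channels)]
    for i in range(num_channels):
        # Only indices within d of i can satisfy the rule.
        for j in range(max(0, i - d), min(num_channels, i + d + 1)):
            if i != j:
                adj[i][j] = 1
    return adj


adj = index_adjacency(num_channels=6, d=2)
neighbors_of_2 = [j for j, connected in enumerate(adj[2]) if connected]
print(neighbors_of_2)  # [0, 1, 3, 4]
```

Each interior channel has 2d neighbors under this rule, so the neighborhood grows linearly with d.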
In the two-row topology, channels are placed symmetrically along the top and bottom fence edges in a 2D coordinate system, resembling the physical configuration of actual DAS installations. Edges are defined according to the Euclidean distance between channel coordinates, which allows horizontal, vertical, and diagonal connectivity. Because this 2D adjacency includes vertical and diagonal connections, even a small distance threshold (e.g., d = 3) encompasses a larger set of neighbors, so oversmoothing occurs at lower d values than in the index-based topology. As shown in Table 4, the two-row topology produces the best overall results, with an F1K-AUC of 98.2% and an ROCK-AUC of 98.0% at d = 1.5.
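The effect of the distance threshold in the two-row topology can be illustrated with a small sketch. The coordinates are an assumption: the two rows are placed at y = 0 and y = 1 with unit spacing along x, which the text does not specify, and the function names are hypothetical. Under this assumed spacing, d = 1.5 admits horizontal, vertical, and diagonal neighbors (distances 1, 1, and about 1.414) and nothing farther.

```python
import math


def two_row_coords(channels_per_row: int) -> list[tuple[float, float]]:
    """Hypothetical layout: bottom row at y = 0, top row at y = 1, unit spacing in x."""
    bottom = [(float(k), 0.0) for k in range(channels_per_row)]
    top = [(float(k), 1.0) for k in range(channels_per_row)]
    return bottom + top


def distance_adjacency(coords, d: float) -> list[list[int]]:
    """Connect two distinct channels when their Euclidean distance is <= d."""
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= d:
                adj[i][j] = adj[j][i] = 1
    return adj


coords = two_row_coords(channels_per_row=9)
center = 4  # interior bottom-row channel at (4, 0)
degree_d15 = sum(distance_adjacency(coords, d=1.5)[center])
degree_d30 = sum(distance_adjacency(coords, d=3.0)[center])
print(degree_d15, degree_d30)  # 5 11
```

In this assumed layout, an interior channel has 11 neighbors at d = 3, compared with 6 for the index-based rule at the same threshold, which is consistent with the observation that the two-row graph becomes dense at smaller d.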
Additionally, this work includes a DDPM baseline without any graph-based spatial modeling to isolate the contribution of the GNN. This configuration performed worse than GraphDiffusion, achieving an F1K-AUC of 81.6% and an ROCK-AUC of 68.8%.
These findings reveal that anomaly-detection performance is significantly influenced by the graph topology design and distance threshold d. In addition, by incorporating the spatial structure through the graph topology based on physical information and GNN, more precise modeling of the acoustic signals propagating in DAS systems is possible.
7.3. Effect of Window Size in the Proposed GraphDiffusion
We experimented with different window sizes while maintaining a stride of 50% of the window length to assess how the window size affects anomaly-detection performance. This 50% overlap is commonly used in time series anomaly detection because it ensures that potentially significant events close to window boundaries are not ignored while balancing the computational cost and information preservation. By overlapping consecutive windows by half, the technique keeps the amount of redundant data manageable while more reliably capturing continuous or slowly evolving anomalies.
Table 5 summarizes the resulting performance metrics, which show how the window size influences detection accuracy. Detection performance improves steadily as the window size increases. Due to the limited temporal context, a small window of 50 with a stride of 25 (50% overlap) exhibits moderate performance, with an F1K-AUC of 93.8% and an ROCK-AUC of 68.9%. When the window size was increased to 100 with the same 50% stride, the ROCK-AUC improved markedly to 81.9%, whereas the F1K-AUC remained nearly unchanged at 93.9%. Further increasing the window size to 200 yields notable gains, reaching 96.5% for F1K-AUC and 96.1% for ROCK-AUC, which demonstrates the benefit of a broader temporal view. A window size of 300 with a stride of 150 yields the best results of all configurations, with an F1K-AUC of 98.2% and an ROCK-AUC of 98.0%. These results indicate that, among the tested configurations, a window size of 300 with a 50% stride achieves the highest detection accuracy in the DAS anomaly-detection task while maintaining efficiency.
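The windowing scheme used in these experiments, a stride equal to half the window length, can be sketched as follows. The function names and the 1000-sample example are illustrative assumptions; how the trailing samples that do not fill a complete window are handled is not stated in the text, and this sketch simply drops them.

```python
def stride_for(window: int, overlap: float = 0.5) -> int:
    """Stride that yields the requested fractional overlap between consecutive windows."""
    return max(1, int(window * (1.0 - overlap)))


def window_spans(num_samples: int, window: int, overlap: float = 0.5):
    """Return (start, end) index pairs of complete windows; a trailing partial window is dropped."""
    stride = stride_for(window, overlap)
    return [(s, s + window) for s in range(0, num_samples - window + 1, stride)]


strides = {w: stride_for(w) for w in (50, 100, 200, 300)}
spans = window_spans(num_samples=1000, window=300)
print(strides)  # {50: 25, 100: 50, 200: 100, 300: 150}
print(spans)    # [(0, 300), (150, 450), (300, 600), (450, 750), (600, 900)]
```

Every interior sample falls in two windows, so an event that straddles one window boundary lies well inside the neighboring window.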
7.4. Hidden-Layer-Size Effects in the GNN of GraphDiffusion
We performed an ablation study in GraphDiffusion, fixing the window size to 300 and varying the size of the hidden layer in the GNN. We evaluated four configurations with sizes of 100, 150, 300, and 600 for the hidden layer and reported the performance and model size.
Table 6 summarizes these outcomes. Moving from 100 to 150 slightly increases the trainable parameters from 9.95 M to 9.98 M but delivers a considerable increase in performance, with the F1K-AUC rising from 89.3 to 98.2 and the ROCK-AUC increasing from 89.9 to 98.0. Increasing the hidden size from 150 to 300 raises the parameters to 10.07 M, yielding only a modest F1K-AUC gain from 98.2 to 98.9, whereas the ROCK-AUC decreases from 98.0 to 93.3. Increasing it to 600 raises the parameters to 10.25 M and reduces the F1K-AUC and ROCK-AUC to 97.3 and 92.0, respectively.
The model with a hidden-layer size of 100 underfits the data and cannot completely capture the spatial dependencies. In contrast, the larger hidden-layer sizes introduce extra capacity that increases the computational cost without consistent benefits and can reduce robustness, likely due to oversmoothing or overfitting. The configuration with a hidden-layer size of 150 offers the best balance, retaining high F1K-AUC and ROCK-AUC values. Based on this evidence, we adopted 150 as the default hidden-layer size to balance performance and stability.
7.5. Performance Comparison of the GraphDiffusion and DDPM Based on a Different DAS Dataset
We employed a different DAS dataset to assess performance under different site configurations and installations, fiber-optic cable layouts, and noise environments. Figure 6 presents the layout of the testbed where the new data collection was conducted. The testbed includes fiber-optic cables laid in a squared S-shaped loop, with three strategically placed stimulation zones: a soil bay, a concrete bay, and an asphalt bay. The DAS interrogators and patch panels were installed in waterproof containers at the site entrance, and the fiber-optic cables were buried about 50 cm to 1 m underground. Along the buried section, the soil bay experiments included compaction, blasting, and scaffolding effects. The concrete bay and asphalt bay experiments included basket effects, surface scraping, and hydraulic breaker impacts on the concrete and asphalt. The DAS data collection testbed comprises 970 sensors, corresponding to 970 DAS data channels.
Using the DAS data, we constructed a training set with 357,100 traces, a validation set with 19,800 normal traces and 20,100 anomalous traces, and a test set with 19,900 normal traces and 20,100 anomalous traces. The window size was set to 100, and the stride size was set to 50 to accommodate the high channel count. After training GraphDiffusion and the general DDPM using the training and validation datasets, performance was measured using the test dataset. The GraphDiffusion model generated a graph topology that mimics the layout of a fiber-optic cable, and the spatial distance threshold was set to d = 1.5.
Table 7 contrasts the GraphDiffusion model with a general DDPM. The GraphDiffusion model delivers substantially higher F1K-AUC and ROCK-AUC scores than the baseline DDPM. This performance improvement underscores the benefit of encoding the fiber-optic cable layout via the GNN, leading to more accurate anomaly detection across diverse DAS data. Although the absolute scores on the new dataset decreased due to the increased layout complexity, underground burial, and anomalous events in various bays, GraphDiffusion still outperformed the baseline DDPM. These results suggest that encoding the physical cable layout using GNNs yields better performance across different installations and noise environments.
7.6. GNN Embedding Analysis Using DAS Signals
Spatially structured anomaly patterns highlight the need to model intersensor relationships in DAS anomaly detection. Because acoustic energy propagates along the cable, anomalous signals rarely remain isolated and instead affect adjacent channels. The proposed approach captures these dependencies with a GNN that encodes the DAS sensor layout as a graph and learns spatially coherent features by aggregating information from neighboring sensors via adjacency-based message passing.
Figure 7 compares the raw DAS input (a) with the GNN output (b) for the same segment, with sensors on the x-axis and time on the y-axis. In normal intervals, scattered noise in the input is suppressed, producing a cleaner background in the GNN output. In anomalous intervals, banded patterns spanning neighboring sensors become clearer and more continuous after the GNN. These sharper spatial features are provided to the conditional DDPM, enabling temporal denoising with explicit spatial context and indicating that the observed gains stem from the GNN-derived spatial abstraction rather than from a plain DDPM.
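The adjacency-based message passing described in this subsection can be illustrated with a deliberately simplified sketch. It performs one mean-aggregation step over each channel and its neighbors, with no learned weights or nonlinearity, so it is not the GNN used in the paper. The function name and the toy signal are assumptions. The sketch only shows why an isolated spike on one channel is attenuated, whereas a pattern shared by adjacent channels is preserved.

```python
def aggregate_neighbors(features, adj):
    """One mean-aggregation step: each channel averages itself and its graph neighbors."""
    n = len(features)
    dim = len(features[0])
    out = []
    for i in range(n):
        group = [i] + [j for j in range(n) if adj[i][j] and j != i]
        out.append([sum(features[j][k] for j in group) / len(group) for k in range(dim)])
    return out


# Chain of 5 channels (index-based topology with d = 1), one feature per channel.
adj = [[1 if abs(i - j) == 1 else 0 for j in range(5)] for i in range(5)]
isolated_spike = [[0.0], [0.0], [3.0], [0.0], [0.0]]  # noise-like: one channel only
shared_band = [[0.0], [3.0], [3.0], [3.0], [0.0]]     # event-like: adjacent channels

print(aggregate_neighbors(isolated_spike, adj))  # [[0.0], [1.0], [1.0], [1.0], [0.0]]
print(aggregate_neighbors(shared_band, adj))     # [[1.5], [2.0], [3.0], [2.0], [1.5]]
```

The isolated spike drops from 3.0 to 1.0, while the center of the shared band keeps its full amplitude. This is a toy analogue of the noise suppression and band sharpening described for Figure 7.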
7.7. Discussion
The experimental results demonstrate the effectiveness of the proposed GraphDiffusion for generative anomaly detection in DAS data. The proposed method overcomes the primary limitations of earlier approaches by combining temporal generative modeling via a conditional DDPM with spatial modeling via a GNN. GraphDiffusion consistently outperforms comparative models, such as GDN, TranAD, and DiffusionAE, in terms of the F1K-AUC and ROCK-AUC metrics. This improvement highlights how crucial it is for DAS anomaly detection to capture intricate temporal dynamics and interchannel dependencies simultaneously.
Additionally, the ablation study reveals that model performance is strongly affected by the choice of graph topology. Although index-based adjacency graphs perform well, graphs with a two-row topology yield further benefits. By constructing edges according to Euclidean distances, this topology improves the ability of the model to represent acoustic propagation across the sensor array. Thus, the model can capture richer spatial relationships, including horizontal, vertical, and diagonal connectivity. The findings imply that a spatially aware graph structure that more accurately reflects the actual arrangement of sensors is advantageous for real-world DAS configurations.
Furthermore, the analysis of the distance threshold, d, indicates a nontrivial trade-off. Larger values of d boost connectivity and might provide each node with more context, but excessively dense graphs can weaken local structural cues and increase the computational load. A moderate threshold (e.g., d = 1.5 in the two-row topology-based graph) strikes the best balance in the experiments, producing the highest ROCK-AUC and F1K-AUC scores.
The comparison between the plain DDPM and GraphDiffusion further highlights the contribution of spatial awareness. Lacking information about the spatial structure, the plain DDPM has limited ability to detect anomalies that are scattered or that propagate across channels. By incorporating GNN-based encoding, GraphDiffusion can contextualize the signal of each channel within its neighborhood, improving anomaly localization and robustness in complex or noisy environments.
8. Conclusions
This study presents GraphDiffusion, a novel method combining the conditional DDPM and GNN for generative anomaly detection in DAS data. The proposed method overcomes the primary limitations of earlier approaches that either ignore the spatial structure or rely heavily on labeled data by modeling the spatial layout of DAS channels as a graph and learning the temporal dynamics via a diffusion-based generative process.
This work demonstrates that a two-row topology-based graph, representing physical relationships, such as horizontal, vertical, and diagonal proximity, significantly improves performance. Experimental results show that the proposed GraphDiffusion achieved the highest F1K-AUC and ROCK-AUC scores, corresponding to 98.2% and 98.0%, respectively, outperforming the comparative models.
Additionally, the ablation study reveals that practical trade-offs between locality and connectivity are possible when the spatial distance threshold is tuned during edge construction. The two-row topology achieved optimal performance when the spatial distance threshold d was set to 1.5. Reducing d from 1.5 to 1 resulted in a 5.2%p drop in the F1K-AUC and a 14.4%p drop in the ROCK-AUC, indicating insufficient connectivity. Conversely, increasing d to 3 resulted in a marginal 0.1%p increase in the F1K-AUC but a 7.1%p decrease in the ROCK-AUC, suggesting that excessive neighborhood aggregation may lead to oversmoothing.
Furthermore, an ablation study that removed the GNN component from the proposed GraphDiffusion revealed that the DDPM without spatial modeling scored 16.6%p and 29.2%p lower in the F1K-AUC and ROCK-AUC, respectively. This finding highlights the crucial role of spatial modeling in capturing interchannel dependencies of anomalies in DAS signals.
However, this study has two limitations. First, because the model relies on a predetermined graph structure based on the physical layout of the sensors, it may be difficult to generalize to deployments with irregular or unknown topologies. Second, the computational overhead caused by the repeated denoising steps in the diffusion model makes real-time deployment difficult.
In future work, we aim to explore an adaptive graph-construction method that can dynamically reflect spatial relationships in the data itself and a lightweight diffusion transformation method to reduce the inference latency and improve the scalability of real-world monitoring systems.