1. Introduction
Residential fire safety is a cornerstone of urban sustainability, touching on issues of human safety, infrastructure durability, and economic stability [
1,
2]. The impacts of residential fires are far-reaching, including loss of life, damage to infrastructure, and long-term financial strain [
3,
4]. Central to mitigating these consequences is the early detection of ignition and early combustion stages. Early detection governs evacuation timing, suppression system engagement, and emergency service mobilization [
5]. A shorter detection latency substantially increases the likelihood of containment and life preservation. As such, recent efforts in fire protection are directed toward minimizing the time lag between fire onset and alarm activation [
6,
7,
8,
9].
Residential fire detection generally relies on devices designed to identify early combustion indicators based on heat or smoke [
10,
11]. While their performance is validated and their usage widespread, they operate on threshold-based activation models that may not pick up the very earliest signs of combustion, sometimes delaying an alarm or triggering a false alarm [
12,
13]. Moreover, despite being a foundational component of fire protection, fire detection infrastructure remains alarmingly deficient across residential properties. A considerable proportion of residences either lacks detection systems or contains devices that are inoperative or fail to meet minimum standards, posing significant risks to occupants and responders alike [
4]. Data from the U.S. Fire Administration indicate that smoke alarms were present in just 33% of fatal residential fires, absent in 25%, and unconfirmed in 42% of cases. Within the subset of homes where alarms were confirmed, only 16% functioned correctly during the fire event. Alarm failure occurred in 7% of those cases, while 11% had unknown operational status. Notably, incidents where fires were too minor to activate the alarm comprised less than 1% of failures [
14]. These figures underscore systemic limitations in both the deployment and performance fidelity of residential fire alarm systems.
Although fire codes stipulate alarm installations across residential types from single-family units to multi-dwelling structures, real-world enforcement and device operability are highly variable and difficult to ensure. For larger complexes, requirements extend to centralized, interconnected systems [
15]. However, enforcement is especially inconsistent in aging or non-retrofitted dwellings. Older residential buildings, which constitute a substantial fraction of global housing, often face architectural constraints that hinder effective fire detection [
16]. Narrow corridors, closed ceilings, and preservation laws make alarm installation disruptive or expensive. Moreover, features like high ceilings and enclosed layouts distort smoke pathways, delaying detector activation [
17].
Advancements in video-based fire detection have expanded the field of contactless fire monitoring [
18,
19]. Vision-based solutions, often integrated into security cameras, utilize deep learning to detect visual signatures of fire, such as smoke trails or flame oscillations [
19,
20,
21,
22]. These are particularly effective in expansive or tall spaces where conventional detectors may be delayed. Remote visual confirmation also enhances situational awareness for emergency response. However, vision models demand continuous illumination, unobstructed views, and significant computational resources, often processed on edge or cloud platforms [
23,
24,
25].
Modern frameworks emphasize interoperability, integrating fire detection with existing systems, including HVAC management and automated suppression systems [26, 27, 28]. These frameworks employ internet connectivity to link multiple detectors (smoke, heat, or multi-sensor units), each uniquely identified within the network [6, 29, 30]. Ming [31] introduced a smart fire intervention suite with optical sensors, all coordinated via Wi-Fi. Ehsan et al. and Lakshmi et al. [9, 32] developed wireless fire alarm networks with mobile notification capabilities.
A novel direction in fire detection utilizes ambient Wi-Fi signals as non-contact sensors. The application of Wi-Fi signal analysis has seen significant growth in various fields beyond communications, most notably in security, healthcare, human–computer interaction, and occupancy monitoring. Channel State Information (CSI), derived from pervasive Wi-Fi signals, has demonstrated capabilities in motion tracking [
33], human identification [
34], and respiratory monitoring [
34,
35,
36]. Its widespread applicability stems from the ubiquity of Wi-Fi and its cost-efficient infrastructure.
In construction, Wi-Fi-enabled systems have addressed safety through unauthorized access monitoring [37] and contactless fall detection via deep learning [38]. Material analysis has similarly benefited, with Convolutional Neural Networks (CNNs) processing CSI magnitude and phase to identify material thickness, compaction, and moisture content with high accuracy [39, 40]. Studies have confirmed the capabilities of Wi-Fi detection systems for structural health monitoring [41]. Advances in hardware and embedded computing have enabled the deployment of low-cost structural monitoring solutions based on single-board computers such as the Raspberry Pi. These platforms support continuous, high-resolution monitoring without degrading data fidelity, representing a practical substitute for high-cost proprietary systems [42].
In parallel, Wi-Fi sensing is being incorporated into fire detection systems to provide efficient, infrastructure-wide hazard alerts. Wi-Fi-based fire detection employs Radio Frequency (RF) emissions from Wi-Fi transmitters as probes of the surrounding environment [
43,
44]. It leverages changes in CSI, which encapsulates amplitude and phase variations in RF propagation, to infer environmental perturbations caused by fire. These systems offer rapid, non-invasive installation, an advantage particularly relevant for complex or old structures where physical rewiring may be impractical or prohibited. Integration with smart home ecosystems adds convenience. The inherent modularity of Wi-Fi-based platforms allows for seamless expansion and relocation of detection units, avoiding structural disruption. Battery-powered configurations further ensure operational continuity during electrical outages, addressing a critical reliability criterion. Moreover, such systems are well-suited to cost-constrained or spatially restricted environments [
45,
46], broadening access to high-quality fire monitoring.
Despite the ubiquity of Wi-Fi, research on using its infrastructure for fire detection in residential settings remains at an early stage. The Wi-Fire system [43] marked one of the earliest demonstrations of passive, device-free fire detection. Leveraging 802.11n hardware, the system extracted thermal signal features from PCA-reduced, denoised CSI streams and employed a Random Forest classifier to detect fire events. Subsequent work [47] demonstrated temperature profiling through CSI acquired from Raspberry Pi 4B devices, revealing that CSI amplitude correlates closely with air temperature. Later studies extended this approach to real combustion scenarios: using Raspberry Pi platforms to sample CSI at up to 1350 fps on both the 2.4 and 5 GHz bands, the authors recorded RF behavior during flame exposure. By analyzing changes in CSI amplitude and variance, they identified distinct amplitude patterns for the fire and smoke stages. Their findings support the potential of CSI as a low-cost fire detection tool [44]. In another track [48], ESP32 transceivers were used in a constrained metal-box configuration to localize heat sources, with CSI captured at 20 Hz over 64 subcarriers. The 40 highest-variance subcarriers were vectorized and classified using a linear support vector machine to determine active heater positions.
Although current research has contributed to developing Wi-Fi CSI-based fire detection systems, key challenges must be addressed for effective real-world implementation. Most studies have relied on narrowly defined ignition sources, typically flame sources with limited smoke production or thermal variability. These fail to represent the RF-interaction complexity posed by diverse fuel classes such as synthetic polymers or biomass, all of which emit distinct combustion by-products that influence CSI differently. Moreover, many of the current detection systems rely on supervised learning frameworks or fixed-threshold classifiers, which introduce rigidity. These models exhibit lower resilience to untrained fire scenarios and are not fully equipped for dynamic environments. Addressing the challenges of current fire detection systems [
49], this study proposes a more robust and scalable solution via unsupervised anomaly detection using autoencoder networks operating on CSI streams from ESP32 Wi-Fi modules. Four deep learning architectures, namely Variational Autoencoder (VAE), Convolutional Autoencoder (CNN-AE), Long Short-Term Memory Autoencoder (LSTM-AE), and a hybrid CNN–LSTM model, are explored due to their proven capabilities in encoding temporal, spatial, and statistical aspects of time-series data [
50,
51,
52,
53]. Each model learns to compress normal signal behavior into a latent space and reconstruct input data with minimal loss [
54].
The VAE adopts a probabilistic generative framework that learns latent variable distributions, enabling principled anomaly detection through reconstruction likelihoods [55]. The CNN-AE, built on one-dimensional convolutions, extracts localized spatial regularities within individual time frames, capturing granular sub-patterns. The LSTM-AE, with its recurrent design, captures inter-window temporal structure and long-range dependencies [55, 56]. The hybrid CNN–LSTM structure unifies local feature extraction and sequence modeling into a layered hierarchy [53, 57], which is intended to capture anomalies during the ignition phase, where perturbations are minimal but meaningful. Given the subtle distortions induced by thermal gradients and scattering effects in the ignition phase [58], the performance of these four architectures is compared to select the most responsive model for early-stage fire detection in indoor environments. The end-to-end pipeline for the CSI-based early fire detection system is illustrated in
Figure 1.
As shown in
Figure 1, the proposed pipeline for unsupervised anomaly detection begins with CSI signal ingestion, wherein raw CSI sequences are captured under both normal and fire-affected conditions, the latter across three distinct scenarios. These high-dimensional sequences are processed using deep autoencoder networks with bottleneck layers to extract compressed latent patterns. During training, reconstruction loss is minimized exclusively on fire-free data, so that the networks learn the characteristics of the signal under normal conditions. Dropout, smoothing, and adaptive thresholding are incorporated to regularize the training process, suppress transient noise, and dynamically adjust detection sensitivity. Using the CSI data collected under combustion conditions, subtle differences in signal behavior are identified and used to calibrate thresholds. These thresholds are adaptively adjusted to reflect the degree of deviation from the learned normal pattern, thereby improving detection precision under different ignition scenarios. When deployed, each reconstructed sequence is compared with its original counterpart, and the resulting error becomes the key signal for anomaly detection [59, 60]. If this reconstruction loss surpasses a threshold across multiple time windows, the input is flagged as anomalous, signaling a combustion event. Based on this analysis, the model demonstrating optimal sensitivity to early fire signatures and superior robustness is selected for the early fire detection alarm system.
2. Methodology
The methodology adopted in this study followed a three-stage pipeline: CSI data acquisition, signal preprocessing, and anomaly-based fire detection using unsupervised deep learning models, as illustrated in
Figure 2.
In the first stage, CSI data were collected using two ESP32 modules configured as transmitter (Tx) and receiver (Rx), positioned approximately 2.0 m apart under line-of-sight conditions. Both nodes were elevated 0.5–1.0 m above the floor, and the combustion source was placed near the midpoint of the direct propagation path between them. CSI streams were recorded under two settings:
(1) Baseline conditions: collecting CSI under normal conditions with no fire activity, allowing models to learn the normal propagation characteristics of indoor wireless signals.
(2) Fire scenarios: recording RF behavior during three controlled combustion events involving gasoline, wood, and plastic that were introduced between the devices to simulate real-world ignition phases.
In the second stage, the raw CSI signals were preprocessed to reduce noise and enhance interpretability. A Hampel filter was used to eliminate outliers caused by hardware or environmental jitter. The cleaned signals were segmented into overlapping time windows, and statistical features, such as mean, standard deviation, and skewness, were computed for each segment. This enabled localized analysis of signal behavior and prepared the data for input into the detection models.
In the final stage, four deep learning models, namely VAE, CNN-AE, LSTM-AE, and a hybrid CNN-LSTM, were trained on fire-free baseline data. These models learned the normal patterns of wireless signal propagation and were tested on fire data to identify anomalies through reconstruction errors. Detection thresholds were dynamically set based on baseline error distributions, enabling the system to flag deviations likely caused by combustion. Each model’s performance was evaluated in terms of detection accuracy, false alarm rate, and robustness across different fire types.
2.1. Data Acquisition and Experimental Setup
The CSI sensing system was configured using two ESP32 microcontrollers and a LattePanda operating on Windows 11. The ESP32 was chosen for its built-in CSI acquisition capabilities, low cost, dual-core 240 MHz processor, and integrated Wi-Fi/BLE functionality, which support both data transmission and on-board inference. It exposes subcarrier-level amplitude and phase information without specialized hardware and allows real-time data processing or transmission [61]. This helps capture the subtle variations in RF propagation induced by thermal gradients, soot particles, and refractive-index fluctuations during combustion events.
The experimental setup, as shown in
Figure 3, consists of a Wi-Fi transmitter and receiver placed within a room such that a potential fire source lies in the propagation path between them. Both devices were raised to approximately 0.5–1 m above the ground to minimize ground-induced multipath. One ESP32 device operates as a Wi-Fi transmitter while another ESP32 serves as the CSI receiver. The ESP32 receiver captures CSI for each received Wi-Fi frame by measuring the complex-valued Channel Frequency Response (CFR) across all Orthogonal Frequency Division Multiplexing (OFDM) subcarriers [
62]. Formally, the CFR at frequency f and time t, denoted as H(f,t), can be expressed as Equation (1) [
44]:
$$H(f,t) = H_s(f) + \sum_{k \in P} a_k(f,t)\, e^{-j 2\pi \left(d_k(0) + v_k t\right)/\lambda} \tag{1}$$
Here, $H(f,t)$ denotes the complex CFR at frequency $f$ and time $t$, and $H_s(f)$ denotes its static component due to stationary reflectors. $P$ is the set of dynamic propagation paths. For each path $k \in P$, $a_k(f,t)$ denotes the complex amplitude, $d_k(0)$ the initial propagation-path length at $t = 0$, and $v_k$ the effective rate of path-length change. Finally, $\lambda$ denotes the carrier wavelength and $j$ the imaginary unit. Combustion perturbs the propagation channel through heat, smoke, and refractive-index fluctuations, which alter the dynamic multipath terms and hence the measured CSI.
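As a rough numerical illustration of the channel model in Equation (1), the sketch below evaluates a CFR made of one static term and one dynamic path whose length grows linearly in time. Every numeric value here (static gain, path amplitude, initial path length, drift rate, carrier frequency) is an illustrative assumption and not a measurement from this study; the function name `cfr` is likewise hypothetical.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def cfr(t, freq_hz, h_static, paths):
    """Static component plus a sum of dynamic-path phasors.

    paths: iterable of (complex_amplitude, initial_length_m, rate_m_per_s);
    all values passed in below are assumed, illustrative numbers.
    """
    wavelength = C / freq_hz
    total = h_static
    for amplitude, d0, rate in paths:
        path_length = d0 + rate * t  # path length drifts linearly in time
        total += amplitude * cmath.exp(-2j * math.pi * path_length / wavelength)
    return total


freq = 2.437e9                   # assumed carrier: 2.4 GHz band, channel 6
wavelength = C / freq            # roughly 12.3 cm
rate = 0.01                      # assumed path-length drift of 1 cm/s
paths = [(0.3 + 0j, 2.5, rate)]  # a single dynamic path

# CSI amplitude at a few instants: it oscillates as the path length drifts.
amplitudes = [abs(cfr(t, freq, 1.0 + 0j, paths)) for t in (0.0, 3.0, 6.0, 9.0)]

# The response repeats each time the path length changes by one wavelength.
period = wavelength / rate
```

Even a slow drift of the dynamic path produces a periodic swing in CSI amplitude between the static gain minus and plus the path amplitude, which is the kind of variation the amplitude-based analysis relies on.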
Fire affects CSI by modifying the RF propagation environment through several coupled physical mechanisms. The heat released during combustion creates strong local temperature and density gradients in air, which alter its refractive index and perturb the propagation paths between the transmitter and receiver. At the same time, smoke particles and combustion by-products introduce additional attenuation and scattering, while the resulting turbulent plume continuously reshapes the multipath structure of the channel. These effects cause measurable changes in CSI amplitude and phase by altering path loss, reflection strength, and path stability [
43,
63]. Different fuels generate distinct signatures due to variations in heat release and smoke properties, supporting fuel-dependent anomaly detection.
Each CSI sample, per the 802.11n specification, comprises amplitude and phase readings for each OFDM subcarrier [62]. The analysis in this work prioritizes amplitude data because it provides a more stable and direct indication of fire-induced scattering and attenuation effects in the wireless channel, while avoiding the additional calibration complexity associated with phase data [43, 64]. The system incorporated no additional sensing units, relying exclusively on CSI readings from existing Wi-Fi signals. CSI data collection was divided into two phases: baseline and fire event recordings. For baseline data, the environment was monitored under normal conditions (no flame or smoke) for 3 h to capture the typical CSI variability from benign environmental changes and hardware noise. This baseline established the “normal” CSI profile for the area.
Subsequently, three controlled fire events were introduced between the ESP32 transmitter and receiver: (1) a gasoline flame for 10 min, (2) a smoldering wood fire for 5 min, and (3) a plastic fire for 10 min (
Figure 4). The differing combustion types were selected to induce varying environmental effects such as thermal intensity, particulates, and smoke, and to capture a diverse range of fire-induced signal disturbances. The experimental model assumed the absence of human occupants, representing conditions under which the fire alarm system must operate autonomously to detect an active flame. Each experiment provided a time series of CSI amplitude and phase across subcarriers.
Figure 5 provides a dense visual encoding of combustion-induced RF perturbations, contrasting baseline CSI profiles with those measured during different fire conditions. Unprocessed CSI amplitude readings offered detailed insight into the physical effects of fire on wireless propagation. These raw packets preserved the true per-packet fading signature of the environment. Each vertically offset trace illustrates the subcarrier profile of a single packet, highlighting deviations in amplitude patterns attributable to flame propagation and smoke dynamics.
Across all tested fire types, raw CSI traces revealed how RF propagation responded to combustion properties. The rapid and localized gasoline flame primarily reduced the amplitude of higher-index subcarriers, with minimal overall impact. Wood combustion caused more consistent amplitude suppression across the spectrum and introduced jitter at lower frequencies, likely due to sustained smoke and heat. The plastic fire produced the broadest and most severe signal degradation, consistent with heavy soot and material off-gassing. These unprocessed overlays served as visual indicators of the combustion-induced channel transformations, validating that wood and plastic fires produce clearer spectral signatures within CSI data. The results emphasized that distinct fire types impose materially different propagation dynamics. They underscored the advantage of CSI in identifying broadband fires like plastic and wood, making it promising for early fire detection in residential buildings.
2.2. Preprocessing
The CSI time series underwent preprocessing to reduce noise while maintaining subtle signal deviations associated with combustion events. The initial phase involved application of the Hampel filter, a non-parametric technique recognized for its robustness in outlier detection within Wi-Fi sensing literature [
43,
44,
47]. The Hampel filter is particularly suited for denoising CSI amplitude signals, where short-duration spikes often arise from hardware jitters or RF noise, rather than environmental changes. This method preserves sustained trends while removing impulsive outliers [
65]. This distinction is vital in fire detection tasks, where combustion-induced anomalies appear as gradual, low-frequency signal drifts rather than instantaneous transitions.
The Hampel filter operates over a sliding temporal window, where each data point is evaluated against the local median and flagged as an outlier if it deviates by more than a predefined threshold, typically three times the scaled median absolute deviation (MAD). These outliers are substituted with the median value of the corresponding window, thereby mitigating high-frequency noise without disrupting the long-term signal structure [66, 67]. Hence, Hampel filtering was used to obtain an interpretable CSI amplitude trace. Once denoised, the CSI data stream, sampled at 1 Hz, was partitioned into fixed-duration windows. This segmentation captured the temporal dynamics of the signal in discrete, manageable segments and standardized the input dimensions for the deep autoencoder models. The obtained mean CSI amplitudes after denoising are shown in
Figure 6.
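The Hampel filtering step described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the implementation used in this study: the window half-width of 3 samples and the sample trace are assumptions, while the three-scaled-MAD rule follows the description in the text. The constant 1.4826 is the usual factor that scales the MAD to a standard deviation for Gaussian data.

```python
import statistics


def hampel_filter(values, half_window=3, n_sigmas=3.0):
    """Return a copy of `values` with impulsive outliers replaced by the local median."""
    k = 1.4826  # scales the MAD to a standard deviation under Gaussian noise
    cleaned = list(values)
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        med = statistics.median(window)
        mad = statistics.median([abs(v - med) for v in window])
        # Flag the point if it sits more than n_sigmas scaled MADs from the median.
        if abs(values[i] - med) > n_sigmas * k * mad:
            cleaned[i] = med
    return cleaned


# Hypothetical CSI amplitude trace with a single impulsive spike at index 4.
trace = [10.0, 10.1, 9.9, 10.0, 25.0, 10.1, 9.9, 10.0, 10.2]
cleaned = hampel_filter(trace)
```

In this example only the spike is replaced; the small fluctuations around 10 are left untouched, which is the behaviour that lets slow combustion-induced drifts survive the denoising step.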
To capture statistical irregularities induced by combustion, the probability mass function (PMF) was calculated for CSI amplitude values in each temporal segment. The core premise is that fire-related activity causes measurable shifts in these distributions, driven by changes in the physical channel, especially from thermal scattering, absorption, and dynamic multipath behavior. In higher-temperature fires, ionization may occur, forming localized plasma that disturbs the RF channel further [
43].
Figure 7 illustrates the environmental dynamics encoded by the shift and spread of the CSI amplitude PMFs across subcarriers.
In baseline conditions, the CSI amplitude distributions exhibited compact, unimodal PMFs with minimal variance, indicative of a stable, undisturbed multipath field. However, combustion events introduced substantial spectral deviations. Gasoline broadened the PMF at higher-index subcarriers, reflecting signal degradation from short bursts of combustion gases. Wood fire distributions broadened gradually, demonstrating slow-forming thermal turbulence. Dense smoke and high thermal loads from plastic resulted in more widespread, chaotic PMF structures, which can serve as strong indicators for anomaly detection.
Figure 8 shows four subcarriers whose amplitude PMFs undergo notable shifts under combustion. These subcarriers were selected based on the degree of amplitude distribution divergence between the no-fire and three fire conditions, independent of time order. PMF analysis was used for exploratory interpretation of combustion-induced CSI distribution shifts and was not directly used as model input.
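To make the PMF comparison concrete, the sketch below builds an empirical PMF per condition for one subcarrier and scores the shift between them. The text does not state which divergence measure was used for subcarrier selection, so the Jensen-Shannon divergence here is an assumption chosen because it is symmetric and bounded; the bin edges and amplitude samples are also hypothetical.

```python
import math


def empirical_pmf(samples, edges):
    """Histogram `samples` into the bins given by `edges` and normalise to sum to 1."""
    counts = [0] * (len(edges) - 1)
    for s in samples:
        for b in range(len(counts)):
            is_last = b == len(counts) - 1
            # Bins are half-open [lo, hi), except the last one, which includes hi.
            if edges[b] <= s < edges[b + 1] or (is_last and s == edges[-1]):
                counts[b] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical PMFs, 1 for disjoint ones."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


edges = [0, 5, 10, 15, 20]            # assumed amplitude bin edges
baseline = [11, 12, 12, 13, 11, 12]   # compact, unimodal baseline amplitudes
fire = [6, 9, 12, 14, 17, 3]          # broadened amplitudes under combustion

pmf_baseline = empirical_pmf(baseline, edges)
pmf_fire = empirical_pmf(fire, edges)
shift = js_divergence(pmf_baseline, pmf_fire)
```

Ranking subcarriers by such a score, computed between the no-fire PMF and each fire PMF, is one way to pick out the subcarriers with the most visible distribution shifts.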
In addition, a set of summary statistical features was extracted from each window of CSI data. These included the mean (capturing overall attenuation), standard deviation and interquartile range (IQR) (reflecting spread and signal volatility), skewness and kurtosis (encoding distribution asymmetry and tail behavior), as well as the median and range. Each of these metrics maps to a structural feature of the RF signal, enabling interpretable, low-dimensional embeddings of complex channel dynamics [68]. To encode a stable and informative representation of the wireless channel, CSI amplitudes were averaged across all subcarriers at each timestamp. This leveraged the correlation across tightly spaced subcarriers in 802.11n, reducing dimensionality while retaining global multipath behavior.
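The subcarrier averaging and per-window feature extraction can be sketched as follows. The window length of 5 matches the value mentioned later for the CNN–LSTM model, but the step of 1 sample, the population (biased) moment estimators, and the toy packets are assumptions made for illustration only.

```python
import statistics


def sliding_windows(series, length, step):
    """Overlapping fixed-length windows over a one-dimensional series."""
    return [series[i:i + length] for i in range(0, len(series) - length + 1, step)]


def window_features(window):
    """Summary statistics for one window of mean CSI amplitude."""
    n = len(window)
    mean = statistics.fmean(window)
    std = statistics.pstdev(window)
    q1, _, q3 = statistics.quantiles(window, n=4)
    if std > 0:
        # Standardised third and fourth central moments (population form).
        skew = sum((v - mean) ** 3 for v in window) / (n * std ** 3)
        kurt = sum((v - mean) ** 4 for v in window) / (n * std ** 4) - 3.0
    else:
        skew = kurt = 0.0
    return {
        "mean": mean,
        "std": std,
        "iqr": q3 - q1,
        "skewness": skew,
        "kurtosis": kurt,
        "median": statistics.median(window),
        "range": max(window) - min(window),
    }


# Hypothetical packets: one row per timestamp, one column per subcarrier.
csi_packets = [
    [10.0, 12.0, 14.0],
    [11.0, 13.0, 15.0],
    [12.0, 14.0, 16.0],
    [13.0, 15.0, 17.0],
    [14.0, 16.0, 18.0],
    [15.0, 17.0, 19.0],
]
# Average across subcarriers at each timestamp, then window the result.
mean_amplitude = [statistics.fmean(packet) for packet in csi_packets]
windows = sliding_windows(mean_amplitude, length=5, step=1)
features = [window_features(w) for w in windows]
```

Each window is reduced to seven interpretable numbers, so a long CSI recording becomes a compact feature sequence of fixed width.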
2.3. Deep Unsupervised Learning Models
To detect fire-induced anomalies from indoor wireless signals, four unsupervised deep learning architectures, namely VAE, CNN-AE, LSTM-AE, and CNN–LSTM, were implemented. A key technique underlying these models is autoencoder-based anomaly detection, where the model is trained exclusively on normal (non-fire) CSI inputs to accurately reconstruct typical signal patterns. For an input sample $x$, the encoder $f_\theta$ maps the input to a latent representation $z$, and the decoder $g_\phi$ reconstructs the input as $\hat{x}$:
$$z = f_\theta(x), \qquad \hat{x} = g_\phi(z) \tag{2}$$
The reconstruction loss used for training the deterministic autoencoder models is defined as Equation (3) [69]:
$$\mathcal{L}(\theta, \phi) = \frac{1}{N} \sum_{i=1}^{N} \lVert x_i - \hat{x}_i \rVert_2^2 \tag{3}$$
where $N$ is the number of training samples. The encoder and decoder parameters are optimized by minimizing $\mathcal{L}$ on fire-free baseline data only.
In practical deployment, when CSI patterns are perturbed by combustion, the reconstruction loss increases. To find the best configuration, a 20-epoch pilot grid search explored combinations of hidden sizes, latent dimensions, and learning rates. During inference, the mean squared error (MSE) between each input and its reconstruction was computed and smoothed using a moving average. The smoothed errors were then compared against a statistical threshold of μ + 1.96 σ, where μ and σ are the mean and standard deviation of the training reconstruction MSE; the 1.96 multiplier corresponds to a 95% confidence level under a normal approximation. Inputs exceeding this threshold were classified as fire-related anomalies.
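The inference-time decision rule can be sketched as below: smooth the reconstruction errors, compare them with μ + 1.96 σ fitted on training errors, and raise an alarm only when the threshold is exceeded over several consecutive windows. The smoothing width of 3, the persistence requirement of 2 windows, and all error values are assumptions for illustration; the text specifies only the moving average, the 1.96 multiplier, and detection across multiple time windows.

```python
import statistics


def moving_average(values, width):
    """Trailing moving average; early samples use however many points exist."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - width + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def fit_threshold(train_errors, z=1.96):
    """Threshold = mean + z * standard deviation of the training reconstruction MSE."""
    return statistics.fmean(train_errors) + z * statistics.pstdev(train_errors)


def flag_anomalies(errors, threshold, width=3, min_consecutive=2):
    """Flag a window once the smoothed error has exceeded the threshold
    for `min_consecutive` windows in a row."""
    smoothed = moving_average(errors, width)
    flags, run = [], 0
    for e in smoothed:
        run = run + 1 if e > threshold else 0
        flags.append(run >= min_consecutive)
    return flags


# Hypothetical per-window reconstruction MSE values.
train_errors = [0.30, 0.32, 0.28, 0.31, 0.29, 0.30]
test_errors = [0.30, 0.31, 0.29, 0.60, 0.70, 0.80, 0.75]

tau = fit_threshold(train_errors)
flags = flag_anomalies(test_errors, tau)
```

Requiring persistence trades a short detection delay for fewer alarms caused by isolated error spikes, which is the same trade-off the smoothing step makes.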
The VAE model was designed to capture probabilistic latent structures in CSI time series. During encoding, input windows were flattened and passed through dense layers that output μ and log σ². The model leveraged the reparameterization trick, which keeps latent sampling differentiable during training, while the regularized latent space supports smooth latent interpolation and limits overfitting. The decoder mirrored this architecture, expanding the latent code through fully connected layers back into the original CSI shape. All input features were standardized based on statistics computed from the initial 80% of the baseline dataset, while the remaining 20% was allocated for validation. Additionally, three independent traces were withheld for final testing. A two-stage grid search was conducted across hidden layer sizes {16, 32, 64}, latent dimensionalities {2, 3, …, 32}, and learning rates {1 × 10⁻³, 1 × 10⁻⁴}. The final model utilized 64 hidden units, a 17-dimensional latent space, and a learning rate of 0.001. This configuration was trained for 200 epochs with a constant KL-divergence weight of 1.0 and 20% dropout.
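For reference, a VAE objective consistent with this configuration can be written as follows. This is the standard formulation and rests on assumptions: a Gaussian encoder with diagonal covariance, a standard normal prior, and a squared-error reconstruction term. The text states only the KL weight (β = 1.0) and the latent size (d = 17); the remaining notation is introduced here for illustration.

```latex
\mathcal{L}_{\mathrm{VAE}}(x)
  = \lVert x - \hat{x} \rVert_2^2
  + \beta \, D_{\mathrm{KL}}\!\left( q_\phi(z \mid x) \,\Vert\, \mathcal{N}(0, I) \right),
  \qquad \beta = 1.0

D_{\mathrm{KL}}\!\left( q_\phi(z \mid x) \,\Vert\, \mathcal{N}(0, I) \right)
  = -\frac{1}{2} \sum_{j=1}^{d} \left( 1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2 \right),
  \qquad d = 17

z = \mu + \sigma \odot \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I)
```

The last line is the reparameterization step: sampling is moved into the noise term ε, so gradients can flow through μ and σ. With β fixed at 1.0, the KL term pulls reconstructions toward a smoother latent code, which is consistent with the higher reconstruction error plateau reported for the VAE.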
The CNN-AE consisted of convolutional layers that act as an encoder, compressing the input into a lower-dimensional latent feature map, followed by deconvolution layers as a decoder that reconstructs the input matrix. The encoder consisted of two convolutional layers (kernel size 3, padding 1), reducing the input down to latent feature map depths of 16, 32, or 64. These layers were interleaved with ReLU activations and 20% dropout to encourage robustness. A mirrored decoder architecture, using transpose convolutions, reconstructed the original input sequence. Sliding windows were constructed identically to prior models. A grid search over parameters similar to VAE guided architectural selection. The final configuration included 64 hidden units, a latent space dimensionality of 17, and a 0.001 learning rate.
The LSTM-AE model was structured around a recurrent architecture. The encoder processed each temporal window through an LSTM layer, after which the final hidden state was projected into a latent vector via a fully connected layer. Consistent with the grid search parameters utilized for the VAE, the latent representation was optimized over combinations of hidden layer sizes, latent dimensionalities, and learning rates. During decoding, this latent vector was replicated across time steps and passed through a second LSTM decoder, culminating in a final linear layer that reconstructed the feature values for each time step. Training followed the same 80/20 data partitioning, normalization on training data, and windowing strategy as previous models. The final model was trained using a 17-dimensional latent space, 64 hidden units, and a learning rate of 0.001.
The CNN–LSTM autoencoder integrated a convolutional encoder for subcarrier feature extraction and an LSTM encoder for sequential modeling. Two convolutional layers with a kernel size of 3 and padding of 1, each followed by a ReLU, acted as the spatial encoder. The resulting map was transposed and fed into an LSTM encoder to capture the sequential dynamics across windows of length 5. The LSTM decoder outputs were then reshaped and passed through the convolutional decoder, a mirrored architecture of the encoder. The model, optimized using Adam with a weight decay of 1 × 10⁻³, achieved stable training with 64 hidden units, 32 latent dimensions, and a learning rate of 0.001.
3. Results and Discussion
Post-training evaluation aimed to assess how well the models generalized to unseen sequences and captured fire-induced signal deviations. The analysis included reconstruction-error patterns (
Figure 9), MSE over 200 epochs (
Figure 10), and formal anomaly detection metrics, indicative of model utility under deployment conditions, as compared in
Table 1.
Figure 9 illustrates the smoothed reconstruction error signals for each model, contrasting their behavior under baseline (training/validation) conditions with test sequences involving different fire conditions. The horizontal dashed line represents the data-driven threshold τ defined as Equation (4):
$$\tau = \mu + 1.96\,\sigma \tag{4}$$
where μ and σ are the mean and standard deviation of the training reconstruction MSE.
As demonstrated in
Figure 9a, the VAE effectively captured wood and plastic fires but exhibited insufficient sensitivity to gasoline-based perturbations. The higher reconstruction noise floor and smoothed latent representation likely contributed to this limitation: only transient error peaks were observed for gasoline, resulting in a 14.3% detection rate. The CNN-AE maintained a tightly bounded baseline error well below its threshold τ, and all fire scenarios produced clear and sustained error excursions above this threshold. This consistent pattern enabled perfect detection across all fire types. The LSTM-AE was similarly effective for wood and plastic sequences but missed part of the gasoline-related anomalies. This can be attributed to its recurrent structure, which smoothed rapid changes and reduced responsiveness to short-lived spikes.
The configured CNN–LSTM attempted to balance spatial and sequential processing, but the model exhibited the highest and most variable baseline error. Combustion events from wood and plastic were detectable, but gasoline signals lacked sufficient error separation from the baseline. Although the CNN component localized spatial channel perturbations, the LSTM module appears to have over-smoothed these signals during temporal modeling, causing transient anomalies to be suppressed.
Figure 10 depicts training and validation losses across the full training cycle, highlighting generalization, error minimization, and overfitting risk for all models.
As shown in
Figure 10, the CNN-AE exhibited rapid convergence, reducing training loss from approximately 0.90 to 0.30 and validation loss to 0.35 within the first 50 epochs. Both curves maintained a consistent, narrow gap, indicating rapid feature learning and minimal overfitting. The ability of the CNN-AE to extract local spatial and temporal features with minimal parameter overhead allowed for both fast convergence and high reconstruction accuracy, confirming its suitability for the task. In contrast, the VAE plateaued at a higher error (0.94 train/0.96 val), which is expected due to its prioritization of latent structure over raw reconstruction fidelity. This yielded smooth but less precise reconstructions, reducing sensitivity to short fire bursts. The LSTM-AE showed gradual convergence, with loss curves descending to ~0.80 (train) and ~0.85 (val). The widened gap is indicative of emerging overfitting to specific sequences. The CNN–LSTM exhibited suboptimal learning behavior, with both training and validation losses remaining nearly static throughout the optimization process, indicating that the model failed to meaningfully reduce reconstruction errors.
Model performance in detecting fire-related anomalies and suppressing false positives was evaluated using three indicators commonly used in unsupervised anomaly detection: False Alarm Rate (FAR), Overall Alarm Rate (OAR), and Average Run Length (ARL). FAR is the proportion of baseline (non-fire) windows that are incorrectly classified as anomalies, quantifying the frequency of false positives during normal system operation [
70]. OAR denotes the percentage of fire windows correctly identified as anomalies [
71]. This metric, which is equivalent to recall, reflects the detection sensitivity. ARL is the mean number of normal windows between two consecutive false alarms [
72]. ARL serves as a reliability index and indicates how long the system can run without triggering false alerts. These metrics jointly measured how well each model detected fire-induced anomalies while avoiding false positives during normal conditions. The detection performance of the models is compared in
Table 1.
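As a minimal sketch of how the three indicators defined above could be computed from per-window alarm flags, consider the following. The function names and the example flags are hypothetical. The ARL is computed here as the mean spacing, in windows, between consecutive false alarms, which is one possible reading of the definition; counting only the unflagged windows strictly between two alarms would give a value one lower.

```python
def false_alarm_rate(baseline_flags):
    """FAR: fraction of baseline (non-fire) windows flagged as anomalies."""
    return sum(baseline_flags) / len(baseline_flags)


def overall_alarm_rate(fire_flags):
    """OAR: fraction of fire windows flagged as anomalies (recall)."""
    return sum(fire_flags) / len(fire_flags)


def average_run_length(baseline_flags):
    """ARL: mean spacing (in windows) between consecutive false alarms.

    Returns None when fewer than two false alarms occur, because no
    spacing can be measured in that case.
    """
    alarm_idx = [i for i, flagged in enumerate(baseline_flags) if flagged]
    if len(alarm_idx) < 2:
        return None
    gaps = [b - a for a, b in zip(alarm_idx, alarm_idx[1:])]
    return sum(gaps) / len(gaps)


# Hypothetical flags: 20 baseline windows with false alarms at indices 4 and 14,
# and 7 fire windows of which 6 are detected.
baseline_flags = [False] * 4 + [True] + [False] * 9 + [True] + [False] * 5
fire_flags = [True, True, False, True, True, True, True]

far = false_alarm_rate(baseline_flags)
oar = overall_alarm_rate(fire_flags)
arl = average_run_length(baseline_flags)
print(f"FAR={far:.3f}, OAR={oar:.3f}, ARL={arl}")
```

For false alarms that occur independently at a constant rate, the ARL is roughly the reciprocal of the FAR, so a lower FAR generally corresponds to a longer ARL.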
As shown in
Table 1, detection performance varied substantially across the four tested models when exposed to fire events from gasoline, wood, and plastic. The CNN-AE demonstrated perfect detection (100%) in all test cases, confirming its sensitivity to flame-induced signal disruptions. It also recorded the lowest FAR (3.6%), reflecting excellent stability under normal conditions. With an ARL of 24.3, the model maintained extended intervals between false positives. This performance was likely driven by the ability of the convolutional architecture to encode localized temporal–spectral structures with low reconstruction noise, creating sharp contrasts when anomalies occurred. The LSTM-AE captured persistent anomalies (wood 100%, plastic 96.2%) but missed transients (gasoline 71.4%), revealing limited responsiveness to brief, low-intensity events. Its higher FAR (4.9%) and shorter ARL (20.8) further indicated increased baseline noise and reduced stability. Nevertheless, its recurrent design allowed it to effectively capture prolonged anomalies through temporal modeling. In contrast, the CNN–LSTM and VAE performed conservatively, excelling on wood and plastic but underperforming on gasoline anomalies (28.6% and 14.3%, respectively).
Among all scenarios, gasoline fires presented the most challenging classification task due to their transient and subtle CSI distortions. Of the evaluated architectures, only the CNN-AE reliably captured these anomalies. Its robust spatial filtering mechanisms yielded clear separation between normal and anomalous reconstructions, unlike the VAE and CNN–LSTM models, which failed to differentiate weak combustion signatures. In scenarios involving wood and plastic, the intensity and duration of CSI disturbances were sufficiently pronounced for all architectures to achieve successful detection. This is physically consistent with combustion behavior: hot gases, smoke aerosols, localized ionization, and turbulence modify the refractive properties of the propagation medium and disrupt the multipath field between the transmitter and receiver. These effects cause measurable CSI changes through attenuation, scattering, and path instability. Their magnitude and duration vary with fuel type, producing distinct CSI signatures. In the present study, wood and plastic showed more persistent anomalies, likely due to greater smoke production, whereas gasoline caused weaker and shorter-lived channel disturbances.
Although these findings support the sensitivity of CSI to combustion-induced thermal and particulate effects, the individual physical contributions were not isolated in this study. Moreover, the experiments were performed under controlled line-of-sight conditions, with the fire source placed directly along the propagation path and without human activity, obstacles, or external RF interference. Although this configuration was suitable for validating the proposed concept, it represents an idealized environment and therefore limits direct extrapolation to practical residential settings. Future investigations should broaden the framework to include non-line-of-sight and multi-room configurations and incorporate human activity and environmental variability.
4. Conclusions
This study introduced a Wi-Fi-based early fire detection architecture that operated solely on CSI data, eliminating the need for thermal or gas sensors. By leveraging the ubiquity of wireless infrastructure, the framework reduced deployment complexity, expanded spatial coverage, and enabled early detection of combustion events. Exploiting ambient Wi-Fi signals, four unsupervised deep learning models were configured as fire detection systems to identify various combustion types based on perturbations in CSI signals induced by fire dynamics. The detection capabilities of these models were evaluated under three distinct combustion conditions: gasoline, wood, and plastic, each exhibiting different smoke and flame characteristics.
The CNN-AE was the most responsive architecture across all fire conditions. It achieved the best overall detection accuracy, maintained the lowest false alarm rate (3.6%), and posted the highest ARL among the evaluated models. Its 1D convolutional architecture effectively captured localized distortions in CSI signals with high precision. LSTM-AE was effective on slow-developing fires but less responsive to rapid anomalies. The VAE, while stable in normal operations, showed limited responsiveness to abrupt combustion signatures.
The CNN-AE model established a robust, low-latency framework for unsupervised fire detection using CSI from ambient Wi-Fi signals. It performed consistently across combustion sources with varying thermal and spectral dynamics and demonstrated anomaly-resilient encoding properties, while its unsupervised formulation enhanced the scalability of RF-based anomaly systems. The results supported its potential application in residential safety systems and building automation frameworks.
This study was conducted under controlled conditions, focusing on autonomous fire detection in unoccupied environments. While this setup supports the intended use case, it limits applicability in more complex scenarios. Future work should extend the baseline dataset to include occupant motion and routine activities, thereby improving the system’s ability to differentiate fire-related anomalies from other environmental disturbances. Additional directions include evaluation across diverse living environments and long-term deployment in real residential buildings to assess robustness under varying structural and interference conditions.