A Wi-Fi Fingerprinting Indoor Localization Framework Using Feature-Level Augmentation via Variational Graph Auto-Encoder

Kim, Dongdeok; Park, Jae-Hyeon; Suh, Young-Joo

doi:10.3390/electronics14142807


by Dongdeok Kim ¹, Jae-Hyeon Park ² and Young-Joo Suh ³,*

¹ Department of Computer Science and Engineering, Pohang University of Science and Technology (POSTECH), 77 Cheongam-ro, Nam-gu, Pohang 37673, Republic of Korea

² POSTECH Institute of Artificial Intelligence (PIAI), Pohang University of Science and Technology (POSTECH), 77 Cheongam-ro, Nam-gu, Pohang 37673, Republic of Korea

³ Graduate School of Artificial Intelligence, Pohang University of Science and Technology (POSTECH), 77 Cheongam-ro, Nam-gu, Pohang 37673, Republic of Korea

* Author to whom correspondence should be addressed.

Electronics 2025, 14(14), 2807; https://doi.org/10.3390/electronics14142807

Submission received: 29 May 2025 / Revised: 8 July 2025 / Accepted: 11 July 2025 / Published: 12 July 2025

(This article belongs to the Special Issue Wireless Sensor Network: Latest Advances and Prospects)


Abstract

Wi-Fi fingerprinting is a widely adopted technique for indoor localization in location-based services (LBS) due to its cost-effectiveness and ease of deployment using existing infrastructure. However, the performance of these systems often suffers due to missing received signal strength indicator (RSSI) measurements, which can arise from complex indoor structures, device limitations, or user mobility, leading to incomplete and unreliable fingerprint data. To address this critical issue, we propose Feature-level Augmentation for Localization (FALoc), a novel framework that enhances Wi-Fi fingerprinting-based localization through targeted feature-level data augmentation. FALoc uniquely models the observation probabilities of RSSI signals by constructing a bipartite graph between reference points and access points, which is then processed by a variational graph auto-encoder (VGAE). Based on these learned probabilities, FALoc intelligently imputes likely missing RSSI values or removes unreliable ones, effectively enriching the training data. We evaluated FALoc using an MLP (Multi-Layer Perceptron)-based localization model on the UJIIndoorLoc and UTSIndoorLoc datasets. The experimental results demonstrate that FALoc significantly improves localization accuracy, achieving mean localization errors of 7.137 m on UJIIndoorLoc and 7.138 m on UTSIndoorLoc, which represent improvements of approximately 12.9% and 8.6% over the respective MLP baselines (8.191 m and 7.808 m), highlighting the efficacy of our approach in handling missing data.

Keywords:

indoor localization; Wi-Fi fingerprinting; data augmentation; variational graph auto-encoder; missing data imputation

1. Introduction

The proliferation of location-based services (LBSs) and the rise of smart environments have fueled a significant demand for accurate and reliable indoor localization systems. Among various indoor positioning technologies, Wi-Fi-based localization has become a widely adopted solution due to its low deployment cost and ease of implementation by leveraging existing Wi-Fi infrastructure in wireless local area networks (WLANs). In particular, Wi-Fi fingerprinting, which utilizes standard signal measurements such as the received signal strength indicator (RSSI) from each Access Point (AP), has gained considerable popularity for its practicality, cost-effectiveness, and relatively straightforward implementation [1].

Wi-Fi fingerprinting typically consists of two phases: an offline phase and an online phase. In the offline phase, RSSI measurements are collected from wireless access points (WAPs) at predefined reference locations, known as reference points (RPs). The collected RSSI values, in combination with the location information of the RPs, are utilized to construct a radio map that characterizes the signal environment of the target area. In the online phase, when a user attempts to estimate their location, their device measures current RSSI values from the surrounding WAPs. These measurements are then compared with the pre-constructed radio map to determine the user’s location.

Machine learning (ML) techniques have been extensively employed in the online phase for location determination. Support vector machine (SVM)-based localization methods [2,3,4,5] learn signal patterns from RSSI data and classify them using hyperplanes; however, they can suffer from slow training and inference times, particularly with large-scale or high-dimensional datasets. Random forest (RF) has been utilized in studies [6,7,8,9] due to its proficiency in handling large-scale fingerprint datasets by leveraging multiple decision trees, though it still requires significant computational resources during training. Weighted k-nearest neighbor (WKNN)-based methods [10,11,12] offer advantages in terms of computational efficiency and implementation simplicity. However, they are sensitive to environmental changes and are often limited in their ability to capture the complex, non-linear relationship between RSSI values and physical locations [13]. Recent advancements in computational capabilities, coupled with improvements in data availability, have paved the way for deep learning (DL) applications in Wi-Fi fingerprinting. Techniques employing autoencoders, convolutional neural networks (CNNs), and recurrent neural networks (RNNs) have been effectively applied to learn intricate and sequential patterns in RSSI data, thereby significantly enhancing localization accuracy and flexibility [14,15,16,17,18].

Effective application of ML and DL techniques requires both sufficient quantity and high quality of training data. However, in real-world deployment scenarios, maintaining a sufficient volume of up-to-date fingerprint data is challenging, as the periodic recollection process is labor-intensive and time-consuming. This makes it difficult to sustain high-quality signal maps in dynamic environments, especially under real-time or batch processing constraints. To address this, prior research has primarily focused on two strategies. The first involves employing learning algorithms adept at handling data scarcity, such as semi-supervised [19] and unsupervised learning techniques [20]. The second strategy leverages data augmentation to synthetically expand the dataset by increasing the number of RPs. Common approaches include interpolation [21], path-loss model calibration [22], and generative models such as generative adversarial networks (GANs) [23]. For example, Njima et al. [23] proposed a GAN-based method that generates synthetic RSSI data under limited labeled scenarios by combining semi-supervised learning with a filtering mechanism to remove unrealistic samples. Sugasaki et al. [24] introduced a Between-Location augmentation method, adapted from between-class learning, which linearly mixes two samples and applies normalization; to better suit Wi-Fi fingerprinting, they incorporated a neural network-based generative model reflecting the physical continuity of labels and RSSI characteristics. Tang et al. [25] developed an RNN-based localization model enhanced with multi-output Gaussian process (MOGP)-based augmentation to improve interpretability and robustness in complex indoor environments. While these approaches contribute to dataset expansion and model generalization, they often assume relatively complete seed data and do not explicitly address the issue of missing data within individual fingerprint vectors.

A critical aspect of Wi-Fi fingerprint data quality is the pervasive problem of missing data. These missing entries can arise from numerous non-human factors, such as temporary sensor or communication module failures, signal interference, packet loss during transmission, and intermittent system outages [26]. Furthermore, environmental dynamics, including unstable power supplies to WAPs, alterations in indoor structures or obstacles, and even user mobility patterns, can significantly compromise the stability of RSSI collection, frequently leading to incomplete fingerprint vectors where expected signals from certain WAPs are not recorded. In addition, transient signal fluctuations and other unknown factors can cause individual RSSI readings to be lost, leading to inconsistent sets of observed WAPs even at the same RP across different data collection instances. Such inconsistencies and incompleteness in the fingerprint vectors significantly impair a model’s ability to accurately learn the underlying mapping between RSSI patterns and physical locations. While some general data augmentation techniques might incidentally model or smooth over missingness (e.g., through learned representations in GANs), or rely on simple pre-imputation steps, they often fail to directly and probabilistically model the missing data mechanism itself. This can lead to the propagation of errors, introduction of bias, or failure to capture the true signal characteristics necessary for robust localization.

To illustrate the prevalence of this missing data problem, we analyzed the UJIIndoorLoc dataset [27], a widely used benchmark collected in a multi-building, multi-floor environment with multiple devices over an extended period. Let $C_{i}$ denote the set of RPs within a 5 m radius of the $i$-th RP, and let $O_{i}$ and $O_{j}$ represent the sets of WAPs observed at the $i$-th and $j$-th RPs, respectively. We denote each RP as $\mathrm{RP}_{j}$, where $j$ indicates the index of the $j$-th reference point. In an ideal scenario with no missing data, one would expect $|O_{i}|$ and $|O_{j}|$ to be highly similar for $\mathrm{RP}_{j} \in C_{i}$. To quantify the discrepancy, we define the missing rate $m_{i}$ for each RP as follows:

$$m_{i} = \frac{1}{|C_{i}|} \sum_{\mathrm{RP}_{j} \in C_{i}} \frac{|O_{i} \cup O_{j}| - |O_{i}|}{|O_{i} \cup O_{j}|} \tag{1}$$
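As an illustration, Equation (1) can be computed with a few lines of NumPy. This is a minimal sketch, not the authors' code: the function name `missing_rates`, the toy coordinates, and the choice to exclude the $i$-th RP from its own neighborhood $C_{i}$ are assumptions, and RPs with an empty neighborhood are reported as NaN.

```python
import numpy as np

def missing_rates(coords, mask, radius=5.0):
    """Missing rate m_i of Eq. (1) for every RP.

    coords: (N_rp, 2) RP positions in metres (assumed planar).
    mask:   (N_rp, N_ap) observation mask, 1 where a WAP was observed.
    Assumption: C_i excludes the i-th RP itself.
    """
    coords = np.asarray(coords, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    n = len(coords)
    rates = np.full(n, np.nan)  # NaN marks RPs with an empty C_i
    for i in range(n):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        neighbors = np.flatnonzero((dist <= radius) & (np.arange(n) != i))
        if neighbors.size == 0:
            continue
        union = np.count_nonzero(mask[neighbors] | mask[i], axis=1)  # |O_i ∪ O_j|
        own = np.count_nonzero(mask[i])                              # |O_i|
        rates[i] = np.mean((union - own) / np.maximum(union, 1))
    return rates

# Toy example: RP 0 and RP 1 are 3 m apart, RP 2 is isolated.
coords = [[0.0, 0.0], [3.0, 0.0], [50.0, 50.0]]
mask = [[1, 1, 0, 0],
        [1, 1, 1, 1],
        [0, 0, 0, 1]]
rates = missing_rates(coords, mask)
```

In the toy example, RP 0 sees two of the four WAPs that RP 0 and RP 1 observe jointly, so its missing rate is 0.5, whereas RP 1 sees all of them and has a missing rate of 0.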

Figure 1 demonstrates that a significant number of RPs exhibit non-zero missing rates $m_{i}$, indicating the prevalence of incomplete RSSI observations. Only approximately 18.2% of the dataset shows a low missing rate (below 0.2). The majority of RPs, accounting for 59.1% of the dataset, fall within the moderate missing rate range of 0.2 to 0.4. Although the proportion decreases in the higher missing rate range (0.4 to 0.8), it still constitutes a non-negligible 22.4%. In rare cases, RPs exhibit extremely high missing rates, with values reaching up to the 0.9–1.0 range.

These findings underscore that substantial discrepancies exist in the observed WAP sets even among spatially proximate RPs. This highlights a fundamental limitation in real-world fingerprint datasets, especially those collected over extended durations and under diverse conditions. The missing components might be naively ignored, poorly imputed, or their absence misinterpreted by a generative process that is not specifically designed for data completion. This can diminish the quality and realism of the augmented data, potentially reducing the generalization performance and robustness of the localization model, especially in environments characterized by high signal variability or sparse WAP coverage. Therefore, a significant gap exists for augmentation strategies that shift the primary focus from mere dataset expansion to the fundamental task of feature-level data completion and quality enhancement. Such strategies would explicitly recognize and address the missing data problem at its core, aiming to create more complete and veridical training samples as a prerequisite or integral part of any subsequent augmentation or model training.

To directly tackle this specific challenge of data incompleteness, we introduce FALoc, a novel framework designed to train robust localization models. Critically, FALoc re-envisions data augmentation for Wi-Fi fingerprinting not merely as a means of dataset expansion, but as a targeted process of feature-level data conditioning and completion. The core innovation of FALoc lies in its mechanism for understanding and rectifying data incompleteness before or as an integral part of the augmentation process. It begins by constructing a bipartite graph representing the relationships between RPs and WAPs. A variational graph auto-encoder (VGAE) [28] is then employed to model not only the RSSI values themselves but, crucially, the observation probabilities of these RSSIs. Based on these learned probabilities, FALoc decides whether to impute missing features with values generated by the VGAE or, in some cases, to remove features that are deemed unreliable or uninformative. This explicit modeling of missingness and the subsequent principled imputation or deletion at the feature level distinguishes FALoc from conventional augmentation methods that primarily generate new data points or transform existing ones without such focused handling of missing entries. These augmented data variants, which offer a more complete and consistent representation of the signal landscape, are then used to enrich the training data for the downstream localization model. We evaluated the performance of FALoc using the UJIIndoorLoc [27] and UTSIndoorLoc [16] datasets, two large-scale benchmarks for Wi-Fi fingerprinting indoor localization. Our experimental results demonstrate that FALoc effectively handles missing data, leading to a significant improvement in localization accuracy.

The main contributions of this paper are as follows:

The identification and emphasis of missing data as a critical, yet often inadequately addressed, quality defect in Wi-Fi fingerprinting, distinct from general data scarcity.
The proposal of FALoc, a novel framework that leverages a bipartite graph representation and a VGAE to perform a distinct form of feature-level data augmentation, primarily focused on the principled imputation and handling of missing RSSI values, rather than solely on dataset expansion.
A comprehensive evaluation on real-world datasets (UJIIndoorLoc and UTSIndoorLoc) demonstrating the effectiveness of FALoc in enhancing localization performance specifically in the presence of prevalent missing data.

The remainder of this paper is organized as follows: Section 2 details the proposed FALoc framework. Section 3 presents the experimental setup and performance evaluation. Section 4 discusses the findings and suggests future work. Finally, Section 5 concludes the paper.

2. Proposed Framework

This section details the architecture and components of our proposed framework, FALoc (Feature-level Augmentation for Localization), specifically designed to address the challenge of missing RSSI values in Wi-Fi fingerprint data and thereby enhance indoor localization accuracy. FALoc leverages a graph-based representation of fingerprints, a VGAE to model RSSI observations and values, and a novel augmentation module to generate enriched training data for a neural network-based localization model. An overview of the FALoc framework is presented in Figure 2.

The process begins with the input fingerprint data, consisting of a feature matrix $X$ (containing normalized RSSI values) and a corresponding binary mask matrix $M$ (indicating observed RSSI measurements). Due to the inherent sparsity and dynamic nature of real-world Wi-Fi environments, many entries in $X$ are typically missing, as reflected by zeros in $M$. To capture the structural relationships within these data, a bipartite graph is constructed where RPs and WAPs form two distinct sets of nodes. An edge exists between an RP and a WAP if an RSSI measurement is observed (i.e., $M_{ik} = 1$), with the normalized RSSI value $X_{ik}$ serving as an edge feature.

This graph is then processed by a VGAE, which learns latent embeddings for both RP and WAP nodes. The VGAE aims to reconstruct both the graph structure (i.e., the mask matrix $\hat{M}$, representing predicted RSSI observation probabilities) and the edge features (i.e., the RSSI values $\hat{e}_{ik}$). The encoder part of the VGAE utilizes graph attention mechanisms to aggregate neighborhood information effectively. The augmentation module subsequently leverages the outputs of the VGAE’s decoder ($\hat{M}$ and $\hat{e}_{ik}$) to stochastically generate augmented fingerprint data $\tilde{X}$. This involves deciding whether to preserve an existing RSSI value, impute a missing one using $\hat{e}_{ik}$ if $\hat{M}_{ik}$ suggests a high likelihood of existence, or potentially remove an existing but seemingly unreliable observation. Finally, the augmented fingerprint data $\tilde{X}$ serves as input to a localization module, typically a neural network, which is trained to estimate the location labels $\hat{Y}^{l}$. By training on these enriched and more complete fingerprints, the localization model is expected to achieve improved robustness and accuracy, particularly in scenarios with sparse or missing measurements.

2.1. Bipartite Graph Representation of Wi-Fi Fingerprints

The foundation of FALoc is the representation of Wi-Fi fingerprint data as a bipartite graph, which naturally models the interactions between two distinct types of entities: RPs and WAPs. The bipartite graph structure is known to effectively capture the underlying relationships in highly sparse and irregular wireless signal environments, as it enables expressive graph-based latent representations [29]. The dataset, comprising RSSI measurements from $N_{ap}$ WAPs at $N_{rp}$ RPs, is initially represented by a feature matrix $X \in \mathbb{R}^{N_{rp} \times N_{ap}}$ and an observation mask matrix $M \in \{0, 1\}^{N_{rp} \times N_{ap}}$. Each element $M_{ik}$ in $M$ indicates the presence (1) or absence (0) of an RSSI measurement from the $k$-th WAP at the $i$-th RP. The corresponding normalized RSSI value $X_{ik}$ is defined as follows:

$$X_{ik} = \begin{cases} \dfrac{r_{ik} - r_{\min}}{r_{\max} - r_{\min}} & \text{if } M_{ik} = 1, \\ 0 & \text{if } M_{ik} = 0 \end{cases} \tag{2}$$

In Equation (2), $r_{ik}$ is the raw RSSI value from the $k$-th WAP at the $i$-th RP, and $r_{\min}$ and $r_{\max}$ are the minimum and maximum RSSI values observed in the entire dataset, used for scaling. Each RP is associated with three types of location labels: its building identifier, floor level, and specific coordinates (longitude, latitude). These are encoded as one-hot vectors $Y_{i}^{b}$ for building and $Y_{i}^{f}$ for floor, and as min–max normalized coordinate pairs $Y_{i}^{l}$ for location.
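The construction of $M$ and $X$ in Equation (2) can be sketched as follows. This is an illustrative sketch under stated assumptions: the helper name `normalize_rssi` is hypothetical, and the sentinel value `NOT_DETECTED = 100` for undetected WAPs is an assumed raw-data convention, not something specified in this section.

```python
import numpy as np

NOT_DETECTED = 100  # assumed sentinel marking "WAP not observed" in the raw data

def normalize_rssi(raw, mask):
    """Min-max scale observed RSSI values as in Eq. (2); missing entries become 0."""
    raw = np.asarray(raw, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    r_min = raw[mask].min()  # weakest observed RSSI in the dataset
    r_max = raw[mask].max()  # strongest observed RSSI in the dataset
    X = np.zeros_like(raw)
    X[mask] = (raw[mask] - r_min) / (r_max - r_min)
    return X, r_min, r_max

raw = np.array([[-60.0, 100.0, -90.0],
                [100.0, -75.0, -30.0]])   # dBm, toy values
M = raw != NOT_DETECTED                    # observation mask
X, r_min, r_max = normalize_rssi(raw, M)
```

Note that the weakest observed reading also maps to 0 under this scaling, the same value used for missing entries, so $X$ alone cannot distinguish the two cases and the mask $M$ has to be carried alongside it.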

The bipartite graph $G = (U, V, E)$ is then constructed, where $U = \{u_{i}\}_{i=1}^{N_{rp}}$ represents the set of RP nodes and $V = \{v_{k}\}_{k=1}^{N_{ap}}$ represents the set of WAP nodes. An edge $(u_{i}, v_{k}) \in E$ exists if and only if $M_{ik} = 1$. The initial features for RP nodes, $U_{i}^{(0)} \in \mathbb{R}^{D}$, are derived from their location labels, aiming to provide the model with prior spatial information as follows:

$$U_{i}^{(0)} = (Y_{i}^{b})^{T} B + (Y_{i}^{f})^{T} F + (Y_{i}^{l})^{T} W^{l} \tag{3}$$

In Equation (3), $B \in \mathbb{R}^{N_{b} \times D}$, $F \in \mathbb{R}^{N_{f} \times D}$, and $W^{l} \in \mathbb{R}^{2 \times D}$ are trainable embedding matrices for building, floor, and location coordinates, respectively, and $D$ is the dimension of the initial node features. WAP nodes $v_{k}$ are initialized with trainable feature vectors $V_{k}^{(0)} \in \mathbb{R}^{D}$, forming an embedding matrix $V^{(0)} \in \mathbb{R}^{N_{ap} \times D}$. Each edge $(u_{i}, v_{k})$ in $E$ carries a feature vector $e_{ik} = [X_{ik}]$, where the scalar $X_{ik}$ is converted into a 1D vector, representing the normalized RSSI value connecting the RP and WAP.
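A minimal NumPy sketch of Equation (3) and of the edge list is given below. The sizes `N_b`, `N_f`, and `D`, the random initialization, and the helper name `rp_init_features` are illustrative assumptions; in the framework these matrices are trainable parameters rather than fixed random values.

```python
import numpy as np

rng = np.random.default_rng(0)
N_b, N_f, D = 3, 5, 8                       # assumed sizes, for illustration only

# Stand-ins for the trainable embedding matrices of Eq. (3).
B = rng.normal(scale=0.1, size=(N_b, D))    # building embeddings
F = rng.normal(scale=0.1, size=(N_f, D))    # floor embeddings
W_l = rng.normal(scale=0.1, size=(2, D))    # projection of normalized coordinates

def rp_init_features(building, floor, xy_norm):
    """U^(0) for a batch of RPs: one-hot building/floor lookups plus a coordinate projection."""
    Y_b = np.eye(N_b)[building]             # (n, N_b) one-hot rows
    Y_f = np.eye(N_f)[floor]                # (n, N_f) one-hot rows
    Y_l = np.asarray(xy_norm, dtype=float)  # (n, 2) min-max normalized coordinates
    return Y_b @ B + Y_f @ F + Y_l @ W_l

U0 = rp_init_features(building=[0, 2], floor=[1, 4],
                      xy_norm=[[0.2, 0.7], [0.9, 0.1]])

# Edge list of the bipartite graph: one edge per observed (RP, WAP) pair.
M = np.array([[1, 0, 1], [0, 1, 1]])
X = np.array([[0.5, 0.0, 0.0], [0.0, 0.25, 1.0]])
rp_idx, wap_idx = np.nonzero(M)
edge_feat = X[rp_idx, wap_idx][:, None]     # e_ik = [X_ik], shape (|E|, 1)
```

Because the building and floor labels are one-hot, the first two products in Equation (3) reduce to row lookups in $B$ and $F$.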

2.2. Variational Graph Auto-Encoder

To learn meaningful representations from the potentially sparse and incomplete fingerprint graph, we employ a VGAE [28]. The VGAE is well-suited for this task due to its ability to learn probabilistic latent embeddings for nodes and its generative nature, which allows for the reconstruction of graph structure (RSSI observation likelihoods) and features (RSSI values). VAE-based models have been shown to be effective in similar scenarios involving fluctuating communication data, demonstrating their suitability for such modeling tasks [30]. Our VGAE is trained using a self-supervised learning paradigm: it takes the bipartite graph $G$ as input and outputs a reconstructed graph $\hat{G}$, comprising reconstructed edge existence probabilities and edge features.

2.2.1. Node Embedding Modules and Modified GAT

The core of our VGAE’s encoder lies in its node embedding modules, which update RP and WAP node representations iteratively using a modified graph attention network (GAT) layer [31]. This modified GAT layer is designed to incorporate information from both node features and edge features effectively. We describe the layer using the RP node embedding module as an example. For each layer $l$, the modified GAT layer processes the node embeddings from the previous layer ($U^{(l-1)}$ for RPs, $V^{(l-1)}$ for WAPs) along with the edge features $e_{ik}$ for each connected pair $(u_{i}, v_{k}) \in E$. This module produces the current layer’s RP embeddings ($U^{(l)}$). The attention mechanism utilizes trainable weight matrices ($W_{u}^{(l)}$, $W_{v}^{(l)}$, $W_{e}^{(l)}$) and attention vectors ($a_{u}^{(l)}$, $a_{v}^{(l)}$, $a_{e}^{(l)}$).

The attention coefficient $\alpha_{i,k}^{(l)}$ between the $i$-th RP node $u_{i}$ and its neighboring $k$-th WAP node $v_{k}$ in layer $l$ is computed as follows:

$$\hat{\alpha}_{i,k}^{(l)} = (a_{u}^{(l)})^{T} W_{u}^{(l)} U_{i}^{(l-1)} + (a_{v}^{(l)})^{T} W_{v}^{(l)} V_{k}^{(l-1)} + (a_{e}^{(l)})^{T} W_{e}^{(l)} e_{ik} \tag{4}$$

$$\alpha_{i,k}^{(l)} = \frac{\exp(\sigma'(\hat{\alpha}_{i,k}^{(l)}))}{\sum_{k' \in N_{i}} \exp(\sigma'(\hat{\alpha}_{i,k'}^{(l)})) + \exp(\sigma'(\hat{\alpha}_{i,i}^{(l)}))} \tag{5}$$

In Equations (4) and (5), $N_{i}$ denotes the set of WAP neighbors of the $i$-th RP node, $\sigma'$ is the LeakyReLU activation function (with a negative slope of 0.2), and $\hat{\alpha}_{i,i}^{(l)}$ represents a self-attention component (implicitly defined by the formulation, often involving only the RP node’s own features). The updated embedding $U_{i}^{(l)}$ for the $i$-th RP node is then computed as a weighted sum of transformations of its own features and its neighbors’ features as follows:

$$U_{i}^{(l)} = \alpha_{i,i}^{(l)} W_{u}^{(l)} U_{i}^{(l-1)} + \sum_{k \in N_{i}} \alpha_{i,k}^{(l)} W_{v}^{(l)} V_{k}^{(l-1)} \tag{6}$$

The WAP node embeddings $V^{(l)}$ are updated through a symmetric process, attending to their neighboring RP nodes.
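The update of a single RP node in Equations (4)–(6) can be sketched in NumPy as follows. This is an illustration rather than the authors' implementation: the function name `update_rp_node`, the shapes, and the random parameters are assumptions, and the self-attention logit is assumed here to use only the RP node's own term, since the text leaves it implicit.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def update_rp_node(U_i, V_nbrs, e_nbrs, W_u, W_v, W_e, a_u, a_v, a_e):
    """One modified-GAT update for a single RP node.

    U_i: (D_in,) RP embedding; V_nbrs: (K, D_in) neighbor WAP embeddings;
    e_nbrs: (K, 1) edge features; W_u, W_v: (D_out, D_in); W_e: (D_out, 1);
    a_u, a_v, a_e: (D_out,) attention vectors.
    """
    h_u = W_u @ U_i                 # transformed RP features, (D_out,)
    h_v = V_nbrs @ W_v.T            # transformed WAP features, (K, D_out)
    h_e = e_nbrs @ W_e.T            # transformed edge features, (K, D_out)
    # Eq. (4): unnormalized attention logits towards each neighboring WAP.
    logits_nbr = a_u @ h_u + h_v @ a_v + h_e @ a_e
    # Assumption: the self logit keeps only the RP node's own term.
    logit_self = a_u @ h_u
    # Eq. (5): softmax over neighbors plus self, after LeakyReLU.
    logits = leaky_relu(np.append(logits_nbr, logit_self))
    w = np.exp(logits - logits.max())   # shift for numerical stability
    alpha = w / w.sum()
    # Eq. (6): attention-weighted sum of self and neighbor messages.
    U_new = alpha[-1] * h_u + alpha[:-1] @ h_v
    return U_new, alpha

rng = np.random.default_rng(0)
D_in, D_out, K = 4, 3, 5
U_new, alpha = update_rp_node(
    rng.normal(size=D_in), rng.normal(size=(K, D_in)), rng.uniform(size=(K, 1)),
    rng.normal(size=(D_out, D_in)), rng.normal(size=(D_out, D_in)),
    rng.normal(size=(D_out, 1)),
    rng.normal(size=D_out), rng.normal(size=D_out), rng.normal(size=D_out))
```

In this formulation the edge feature $e_{ik}$ influences only the attention weights; the aggregated messages in Equation (6) are built from node features alone.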

2.2.2. Encoder Architecture

The encoder of our VGAE consists of a two-layer Graph Neural Network (GNN) architecture. The first GNN layer comprises separate embedding modules (as described above) for RP and WAP nodes. These modules take the initial node features ($U^{(0)}$, $V^{(0)}$) and the original set of edge features ($\{e_{ik}\}$) as input, producing updated first-layer embeddings $U^{(1)}$ and $V^{(1)}$.

The second GNN layer processes the activated embeddings from the first layer (e.g., after applying a ReLU activation function, $\sigma(\cdot)$). Similarly to the first layer, it contains distinct embedding modules for RP and WAP nodes. These modules operate on $\sigma(U^{(1)})$, $\sigma(V^{(1)})$, and the set of edge features $\{e_{ik}\}$. This layer outputs parameters for the probabilistic latent embeddings: the means ($\mu^{rp}$, $\mu^{wap}$) and the logarithms of the standard deviations ($\log \sigma^{rp}$, $\log \sigma^{wap}$) for RP and WAP nodes, respectively.

Finally, the encoder utilizes the reparameterization trick to sample the latent node embeddings $Z^{rp}$ and $Z^{wap}$ from the learned distributions:

$$Z^{rp} = \mu^{rp} + \epsilon \odot \sigma^{rp}, \quad Z^{wap} = \mu^{wap} + \epsilon \odot \sigma^{wap} \tag{7}$$

In Equation (7), $\epsilon$ is a vector of random samples drawn from a standard normal distribution $\mathcal{N}(0, I)$, and $\odot$ denotes element-wise multiplication. These latent embeddings capture the essential characteristics of RPs and WAPs in a lower-dimensional space.
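The reparameterization step of Equation (7) amounts to a single line of NumPy. The sketch below assumes that the encoder outputs $\log \sigma$ (as stated above) and that an independent noise tensor is drawn for each call; the function name `reparameterize` is hypothetical.

```python
import numpy as np

def reparameterize(mu, log_sigma, rng):
    """Sample Z = mu + eps * sigma with eps ~ N(0, I), as in Eq. (7)."""
    eps = rng.standard_normal(mu.shape)
    return mu + eps * np.exp(log_sigma)   # exp recovers sigma from log sigma

rng = np.random.default_rng(0)

# With a near-zero standard deviation the sample collapses onto the mean.
mu_rp = np.array([[0.5, -1.0], [2.0, 0.0]])
Z_rp = reparameterize(mu_rp, np.full_like(mu_rp, -20.0), rng)

# With sigma = 2 the empirical standard deviation of many samples is close to 2.
Z_many = reparameterize(np.zeros(100_000), np.full(100_000, np.log(2.0)), rng)
```

Writing the sample as a deterministic function of $\mu$, $\sigma$, and external noise is what allows gradients to reach the encoder parameters through the sampling step.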

2.2.3. Decoder

The decoder part of the VGAE uses the latent embeddings $Z^{rp}$ and $Z^{wap}$ to reconstruct the original graph properties: the mask matrix $M$ (predicting the likelihood of an RSSI observation between an RP-WAP pair) and the set of edge features $\{e_{ik}\}$ (predicting the RSSI values themselves).

For reconstructing the mask matrix, indicating the probability of an edge (RSSI observation), a bilinear decoder is employed. This choice is common for link prediction tasks as it efficiently models pairwise interactions. The reconstructed mask matrix $\hat{M}$ is as follows:

$$\hat{M} = \mathrm{sigmoid}(Z^{rp} W^{m} (Z^{wap})^{T}) \tag{8}$$

In Equation (8), $W^{m}$ is a trainable weight matrix. Each entry $\hat{M}_{ik}$ represents the predicted probability that an RSSI value is observed between the $i$-th RP and the $k$-th WAP.

For edge feature reconstruction (predicting RSSI values), a Multi-Layer Perceptron (MLP) decoder is used, offering flexibility to model complex relationships. The reconstructed edge feature (RSSI value) $\hat{e}_{ik}$ for a potential or existing edge $(u_{i}, v_{k})$ is computed as follows:

$$\hat{e}_{ik} = |Z_{i}^{rp} - Z_{k}^{wap}| W^{ef} + b^{ef} \tag{9}$$

In Equation (9), $W^{ef}$ and $b^{ef}$ are the trainable weight and bias parameters for this edge feature reconstruction network. The output $\hat{e}_{ik}$ is the predicted normalized RSSI vector.
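Both decoders can be evaluated for all RP-WAP pairs at once with broadcasting. The following sketch implements Equations (8) and (9) as printed; the function names, the latent dimension `d`, and the random parameters are assumptions, and Equation (9) is treated as a single linear layer with a scalar output per pair.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_mask(Z_rp, Z_wap, W_m):
    """Eq. (8): bilinear link decoder, returns (N_rp, N_ap) observation probabilities."""
    return sigmoid(Z_rp @ W_m @ Z_wap.T)

def decode_edge(Z_rp, Z_wap, W_ef, b_ef):
    """Eq. (9): linear map of |Z_i^rp - Z_k^wap| for every RP-WAP pair."""
    diff = np.abs(Z_rp[:, None, :] - Z_wap[None, :, :])   # (N_rp, N_ap, d)
    return diff @ W_ef + b_ef                             # (N_rp, N_ap)

rng = np.random.default_rng(0)
N_rp, N_ap, d = 3, 4, 5                                   # assumed toy sizes
Z_rp, Z_wap = rng.normal(size=(N_rp, d)), rng.normal(size=(N_ap, d))
W_m, W_ef, b_ef = rng.normal(size=(d, d)), rng.normal(size=d), 0.1

M_hat = decode_mask(Z_rp, Z_wap, W_m)
E_hat = decode_edge(Z_rp, Z_wap, W_ef, b_ef)
```

Since the edge decoder is evaluated for every pair, it also yields candidate RSSI values for pairs with $M_{ik} = 0$, which is what makes imputation possible later on.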

2.3. Augmentation Module

A key innovation of FALoc is its augmentation module, which leverages the VGAE’s outputs ($\hat{M}$ and $\hat{e}_{ik}$) to generate augmented versions of the fingerprint data. This module aims to create more complete and robust training samples by stochastically imputing likely missing RSSI values or removing potentially spurious ones. The decoder’s output $\hat{M}_{ik}$ serves as the predicted probability of an RSSI observation between the $i$-th RP and $k$-th WAP.

The augmentation logic is based on comparing the original observation mask $M_{ik}$ with the VGAE’s predicted probability $\hat{M}_{ik}$:

If $M_{ik} = 1$ (observed) and $\hat{M}_{ik}$ is high (e.g., $\hat{M}_{ik} \geq \alpha$, where $\alpha$ is a predefined threshold), the original RSSI is likely valid and should be preserved with high probability.
If $M_{ik} = 0$ (missing) and $\hat{M}_{ik}$ is low (e.g., $\hat{M}_{ik} < \alpha$), the absence of RSSI is likely genuine, and it should remain missing with high probability.
If $M_{ik} = 1$ (observed) but $\hat{M}_{ik}$ is low (e.g., $\hat{M}_{ik} < \alpha$), the observed RSSI might be an outlier or unreliable. This suggests potential feature deletion.
If $M_{ik} = 0$ (missing) but $\hat{M}_{ik}$ is high (e.g., $\hat{M}_{ik} \geq \alpha$), the RSSI was likely lost during collection rather than genuinely absent. This suggests potential feature imputation using $\hat{e}_{ik}$.

To implement this stochastic imputation and deletion, we first define a probability $P_{ik}$ for the existence of an augmented link $\tilde{M}_{ik}$ as follows:

$$P_{ik} = \begin{cases} \dfrac{M_{ik} + \hat{M}_{ik}}{2}, & \text{if } \hat{M}_{ik} \geq \alpha \\ 0, & \text{otherwise} \end{cases} \tag{10}$$
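Equation (10) can be checked against the four cases listed above with a short NumPy sketch. The threshold value `alpha=0.5` and the toy probabilities are assumptions chosen only to exercise each case; the function name `link_probability` is hypothetical.

```python
import numpy as np

def link_probability(M, M_hat, alpha):
    """Eq. (10): average of observed and predicted mask where M_hat >= alpha, else 0."""
    M = np.asarray(M, dtype=float)
    M_hat = np.asarray(M_hat, dtype=float)
    return np.where(M_hat >= alpha, (M + M_hat) / 2.0, 0.0)

# One entry per case: preserve, remain absent, delete, impute.
M     = np.array([1,   0,   1,   0])
M_hat = np.array([0.9, 0.1, 0.2, 0.8])
P = link_probability(M, M_hat, alpha=0.5)   # alpha = 0.5 is an assumed value
```

Under this definition an observed entry with $\hat{M}_{ik} < \alpha$ receives $P_{ik} = 0$, and a missing entry with $\hat{M}_{ik} \geq \alpha$ receives $P_{ik} = \hat{M}_{ik} / 2$, so its existence probability never exceeds 0.5.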

The augmented mask entry $\tilde{M}_{ik}$ is then sampled from a Bernoulli distribution with probability $P_{ik}$. To enable end-to-end training through this discrete sampling process, we use the Gumbel–softmax reparameterization trick [32] as follows:

$$\tilde{M}_{ik} = \frac{1}{1 + \exp(-(\log(P_{ik}) + G) / \tau)} \tag{11}$$

In Equation (11), $G$ is a sample drawn from the $\mathrm{Gumbel}(0, 1)$ distribution, and $\tau$ is the temperature parameter of the Gumbel–softmax, controlling the smoothness of the approximation. A value $\tilde{M}_{ik} \approx 1$ indicates the presence of an RSSI value in the augmented fingerprint, and $\tilde{M}_{ik} \approx 0$ indicates its absence.
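A direct NumPy transcription of Equation (11) is sketched below. It follows the formula as printed; the clipping constant `eps` (needed because $\log 0$ is undefined when $P_{ik} = 0$), the temperature `tau=0.5`, and the function name `relaxed_mask` are assumptions. The more common binary relaxation of a Bernoulli variable uses the logit $\log(P / (1 - P))$ with logistic noise, which is not what is implemented here.

```python
import numpy as np

def relaxed_mask(P, tau, rng, eps=1e-10):
    """Relaxed mask sample following Eq. (11) as printed.

    P is clipped to [eps, 1] so that log(P) stays finite when P = 0.
    """
    G = rng.gumbel(loc=0.0, scale=1.0, size=P.shape)     # Gumbel(0, 1) noise
    logits = (np.log(np.clip(P, eps, 1.0)) + G) / tau
    # Numerically stable form of 1 / (1 + exp(-logits)).
    return 0.5 * (1.0 + np.tanh(0.5 * logits))

rng = np.random.default_rng(0)
P = np.array([0.95, 0.0, 0.0, 0.4])
M_tilde = relaxed_mask(P, tau=0.5, rng=rng)

# With identical noise, a larger P yields a larger relaxed mask value.
lo = relaxed_mask(np.array([0.2]), tau=0.5, rng=np.random.default_rng(1))
hi = relaxed_mask(np.array([0.9]), tau=0.5, rng=np.random.default_rng(1))
```

Smaller values of `tau` push the outputs closer to 0 or 1, while larger values give smoother outputs with better-behaved gradients.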

Finally, the augmented feature matrix $\tilde{X}$ is constructed based on the original features $X$, the VGAE-reconstructed features $\hat{e}$, and the sampled augmented mask $\tilde{M}$ as follows:

$$\tilde{X}_{ik} = \begin{cases} \hat{e}_{ik}, & \text{if } \tilde{M}_{ik} = 1 \text{ and } M_{ik} = 0 \text{ (Imputation)} \\ 0, & \text{if } \tilde{M}_{ik} = 0 \text{ and } M_{ik} = 1 \text{ (Deletion)} \\ X_{ik}, & \text{if } \tilde{M}_{ik} = 1 \text{ and } M_{ik} = 1 \text{ (Preservation)} \\ 0, & \text{if } \tilde{M}_{ik} = 0 \text{ and } M_{ik} = 0 \text{ (Remains absent)} \end{cases} \tag{12}$$

This process generates augmented fingerprints where missing values are probabilistically imputed and unreliable existing values are potentially removed, leading to a richer and more robust training set.
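The case analysis in Equation (12) can be sketched with boolean indexing. For readability, this illustration draws a hard Bernoulli sample of the augmented mask instead of the relaxed sample used for end-to-end training; that substitution, the function name `augment`, and the toy values are assumptions.

```python
import numpy as np

def augment(X, M, E_hat, P, rng):
    """Build the augmented features of Eq. (12) from a hard Bernoulli(P) mask sample."""
    M = np.asarray(M, dtype=bool)
    M_tilde = rng.random(P.shape) < P          # True with probability P
    X_tilde = np.zeros_like(X, dtype=float)    # deletion and "remains absent" stay 0
    keep = M_tilde & M                         # preservation
    impute = M_tilde & ~M                      # imputation
    X_tilde[keep] = X[keep]
    X_tilde[impute] = E_hat[impute]
    return X_tilde, M_tilde

rng = np.random.default_rng(0)
X     = np.array([0.7, 0.0, 0.3, 0.0])   # original normalized RSSI
M     = np.array([1,   0,   1,   0])     # original mask
E_hat = np.array([0.6, 0.4, 0.2, 0.1])   # reconstructed RSSI
P     = np.array([1.0, 1.0, 0.0, 0.0])   # degenerate probabilities for a deterministic demo
X_tilde, M_tilde = augment(X, M, E_hat, P, rng)
```

With the degenerate probabilities above, the four entries exercise preservation, imputation, deletion, and remaining absent, in that order.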

2.4. Localization Module

The localization module is responsible for predicting the user’s location based on the (augmented) Wi-Fi fingerprints. We employ a standard neural network architecture for this task, formulated as follows:

$$\hat{Y}_{i}^{l} = F_{\theta}(\tilde{X}_{i}) \tag{13}$$

In Equation (13), $\tilde{X}_{i}$ is the $i$-th row of the augmented feature matrix (the fingerprint for the $i$-th RP), $F_{\theta}$ denotes the localization neural network with trainable parameters $\theta$, and $\hat{Y}_{i}^{l}$ represents the estimated location labels (e.g., coordinates) for the $i$-th RP. The network $F_{\theta}$ is typically an MLP, chosen for its ability to model complex non-linear mappings. The model is trained to minimize an estimation loss (e.g., Mean Absolute Error or Mean Squared Error) between the true labels $Y_{i}^{l}$ and the predicted labels $\hat{Y}_{i}^{l}$. By training this module with the augmented data $\tilde{X}$, we aim to improve its generalization capability, especially its robustness to missing RSSI values encountered during online localization.
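A forward pass of an MLP-based $F_{\theta}$ with an MAE loss could look as follows. The layer sizes, the ReLU hidden activation, the initialization, and the helper names are assumptions for illustration; the section does not specify the architecture in this detail.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Random (W, b) pairs for consecutive layer sizes, e.g. [N_ap, hidden, 2]."""
    return [(rng.normal(scale=np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, X):
    """F_theta: ReLU hidden layers followed by a linear output layer (Eq. (13))."""
    h = X
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)
    W, b = params[-1]
    return h @ W + b

def mae(Y, Y_hat):
    """Mean absolute error between true and predicted (normalized) coordinates."""
    return np.mean(np.abs(Y - Y_hat))

rng = np.random.default_rng(0)
params = init_mlp([6, 16, 2], rng)        # 6 WAPs -> 16 hidden units -> 2 coordinates
X_tilde = rng.uniform(size=(4, 6))        # four augmented fingerprints
Y = rng.uniform(size=(4, 2))              # normalized ground-truth coordinates
Y_hat = mlp_forward(params, X_tilde)
loss = mae(Y, Y_hat)
```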

2.5. Model Training and Inference

The training of the FALoc framework is performed in two main stages to ensure stability and effective learning. The procedure for training and inference is shown in Algorithm 1.

Algorithm 1 FALoc Training and Inference Procedure

Require: Training data $D_{train} = \{(X, M, Y)\}$, learning rates $\eta_{1}$, $\eta_{2}$, balance weight $\lambda$
Ensure: Trained localization model $F_{\theta}$
Pretraining (VGAE only):
    for each epoch do
        Apply edge dropout to the edge set $E$
        Reconstruct $\hat{M}, \hat{X}$ via VGAE
        Compute loss $L_{pretrain} = L_{M} + L_{X}$
        Update VGAE with learning rate $\eta_{1}$
    end for
Main Training (End-to-End):
    for each epoch do
        Generate augmented data $\tilde{X}$ using Gumbel–softmax
        Predict $\hat{Y} = F_{\theta}(\tilde{X})$
        Compute loss $L_{main} = L_{M} + L_{X} + \lambda L_{Y}$
        Update all modules with learning rate $\eta_{2}$
        Apply early stopping if validation performance degrades
    end for
Inference:
    Given query $x_{query}$, return $\hat{y}_{query} = F_{\theta}(x_{query})$

Pretraining Phase: Initially, only the VGAE component is pretrained. The objective during this phase is to enable the VGAE to accurately learn to reconstruct the graph structure (mask matrix $M$) and the edge features (RSSI values $X$). The loss function for pretraining is as follows:

$$L_{pretrain} = L_{M}(M, \hat{M}) + L_{X}(X, \hat{X}) \tag{14}$$

In Equation (14), $L_{M}(M, \hat{M})$ is the binary cross-entropy reconstruction loss for the mask matrix, encouraging $\hat{M}$ to predict true observation likelihoods. $L_{X}(X, \hat{X})$ is the reconstruction loss for the feature matrix, typically the Mean Absolute Error (MAE) calculated only over observed entries (where $M_{ik} = 1$), penalizing deviations between original and reconstructed RSSI values. To enhance generalization during VGAE training and prevent the model from simply memorizing the input graph, edge dropout is applied: a certain percentage of edges from $E$ are randomly removed before being fed into the VGAE at each training epoch.
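The two reconstruction losses of Equation (14) and the edge dropout step can be sketched as follows. The function names, the clipping constant `eps`, and the toy values are assumptions, and the binary cross-entropy is averaged over all RP-WAP pairs, which is one of several possible normalizations.

```python
import numpy as np

def bce_mask_loss(M, M_hat, eps=1e-7):
    """L_M: binary cross-entropy between the observed mask and predicted probabilities."""
    M_hat = np.clip(M_hat, eps, 1.0 - eps)   # avoid log(0)
    return -np.mean(M * np.log(M_hat) + (1 - M) * np.log(1 - M_hat))

def masked_mae(X, X_hat, M):
    """L_X: mean absolute error restricted to observed entries (M_ik = 1)."""
    M = np.asarray(M, dtype=bool)
    return np.abs(X[M] - X_hat[M]).mean()

def edge_dropout(M, rate, rng):
    """Randomly remove a fraction `rate` of observed edges from the input graph."""
    keep = rng.random(M.shape) >= rate
    return M * keep

M     = np.array([[1, 0], [0, 1]])
M_hat = np.array([[0.9, 0.1], [0.1, 0.9]])
X     = np.array([[0.5, 0.0], [0.0, 1.0]])
X_hat = np.array([[0.4, 0.9], [0.9, 0.8]])

l_m = bce_mask_loss(M, M_hat)
l_x = masked_mae(X, X_hat, M)     # the 0.9 predictions at unobserved entries are ignored
l_pretrain = l_m + l_x

rng = np.random.default_rng(0)
M_in = edge_dropout(M, rate=0.2, rng=rng)   # mask fed to the encoder for this epoch
```

The dropped mask is used only as encoder input; the losses are still computed against the original $M$ and $X$, so the model is asked to recover edges it did not see.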

Main Training Phase (End-to-End): After pretraining the VGAE, the entire FALoc framework, including the VGAE, the augmentation module, and the localization module, is trained end-to-end. The overall loss function in this phase incorporates the VGAE reconstruction losses and the localization task loss as follows:

$$L_{main} = L_{M}(M, \hat{M}) + L_{X}(X, \hat{X}) + λ L_{Y}(Y^{l}, \hat{Y}^{l}) \qquad (15)$$

In Equation (15), $L_{Y}(Y^{l}, \hat{Y}^{l})$ is the loss for the location label prediction (e.g., MAE for coordinates), and $λ$ is a hyperparameter balancing the reconstruction tasks with the primary localization task. During this phase, the gradients from the localization loss $L_{Y}$ flow back through the augmentation module (enabled by the Gumbel–softmax trick) and into the VGAE, allowing the VGAE to learn representations that are not only good for reconstruction but also beneficial for the downstream localization task. Early stopping is employed based on the localization performance on a validation set to prevent overfitting.
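
To make the role of the Gumbel–softmax trick more concrete, the following is a generic sketch of its two-class form (the binary Concrete relaxation [32]), together with the weighted sum of Equation (15). It is not the authors' augmentation module: how FALoc maps the relaxed samples to impute or remove decisions is defined in Section 2 and is not reproduced here, and the names `relaxed_bernoulli` and `main_loss` are hypothetical.

```python
import math
import random


def relaxed_bernoulli(p, tau, rng, eps=1e-7):
    """Draw a relaxed (continuous) Bernoulli sample in [0, 1].

    p   : probability of the "1" outcome
    tau : temperature; smaller values push samples toward 0 or 1
    """
    p = min(max(p, eps), 1.0 - eps)
    u = min(max(rng.random(), eps), 1.0 - eps)
    logits = math.log(p) - math.log(1.0 - p)
    # The difference of two Gumbel(0, 1) samples follows a logistic distribution.
    noise = math.log(u) - math.log(1.0 - u)
    z = (logits + noise) / tau
    # Numerically stable sigmoid.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def main_loss(l_m, l_x, l_y, lam):
    """Weighted sum of the three loss terms, as in Equation (15)."""
    return l_m + l_x + lam * l_y


rng = random.Random(0)
samples = [relaxed_bernoulli(0.99, tau=0.05, rng=rng) for _ in range(5)]
```

In an automatic-differentiation framework the same expression is differentiable with respect to `p`, which is what would let gradients of $L_{Y}$ reach the VGAE outputs; the pure-Python version above only illustrates the sampling step.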

Inference Phase: During the online location inference phase, the system processes a new Wi-Fi fingerprint observed by the user’s device to estimate their current location. Unlike the training phase where data augmentation is crucial, the inference phase in FALoc is straightforward. The raw fingerprint vector $x_{query}$, as collected from the environment (which may contain missing RSSI values), is directly fed as input to the trained localization module $F_{θ}$. The localization network then outputs the predicted location labels $\hat{Y}_{query}^{l}$ as follows:

$$\hat{Y}_{query}^{l} = F_{θ}(x_{query}) \qquad (16)$$

The rationale behind this direct approach is that the localization module $F_{θ}$ has already been trained on a diverse set of augmented fingerprints generated by the VGAE and augmentation module. This comprehensive training is intended to equip $F_{θ}$ with the robustness to effectively handle incomplete or varied fingerprint data encountered during real-world inference, without requiring explicit data augmentation or completion steps at inference time.

3. Performance Evaluation

To assess the effectiveness of the proposed FALoc framework, we conducted comprehensive experiments using two well-established public benchmark datasets. This section details the datasets, evaluation metrics, experimental setup, and a comparative analysis of the results.

3.1. Dataset Description

We utilized the UJIIndoorLoc dataset [27] and the UTSIndoorLoc dataset [16], which are large-scale benchmarks for Wi-Fi fingerprinting-based indoor localization. The UJIIndoorLoc dataset was collected from three multi-floor buildings at Jaume I University, covering approximately 108,703 m². It includes RSSI measurements from 520 distinct WAPs and provides additional metadata such as FLOOR, BUILDINGID, SPACEID, RELATIVEPOSITION, USERID, PHONEID, TIMESTAMP, LONGITUDE, and LATITUDE. The UTSIndoorLoc dataset was collected in a multi-floor building at the University of Technology Sydney. Each sample consists of a 589-dimensional RSSI vector along with metadata including Pos_x, Pos_y, Floor_ID, Building_ID, User_ID, Phone_type, and Time.

The observed RSSI values span from −104 dBm to 0 dBm in the UJIIndoorLoc dataset and from −96 dBm to −37 dBm in the UTSIndoorLoc dataset. In both datasets, RSSI values that were not observed at a given location are marked with +100 dBm. We used these +100 dBm entries to identify unobserved signals, setting the corresponding elements in our mask matrix $M$ to 0 (i.e., $M_{ik} = 0$). Subsequently, for the feature matrix $X$ used as model input, these unobserved signals were represented by a normalized value of 0, following Equation (2) in Section 2.1. This process results in fingerprint vectors that are often incomplete. Both datasets exhibit considerable variation in observed WAPs and RSSI values, making them suitable testbeds for evaluating FALoc, which is designed to handle such incomplete data.
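
The mask construction described above can be sketched as follows. The +100 dBm marker and the zero value for unobserved entries come from the text; the scaling function `placeholder_normalize` is only a hypothetical stand-in for Equation (2), which is not reproduced in this section, and its bounds are illustrative.

```python
NOT_OBSERVED = 100  # +100 dBm marks an unobserved WAP in both datasets


def placeholder_normalize(rssi_dbm, lo=-105.0, hi=0.0):
    """Hypothetical min-max scaling to (0, 1]; NOT the paper's Equation (2)."""
    return (rssi_dbm - lo) / (hi - lo)


def build_mask_and_features(rssi_rows, normalize=placeholder_normalize):
    """Return (M, X): M[i][k] is 1 if WAP k was observed at sample i, else 0;
    X[i][k] is the normalized RSSI, or 0.0 where the signal was not observed."""
    M, X = [], []
    for row in rssi_rows:
        m_row = [0 if v == NOT_OBSERVED else 1 for v in row]
        x_row = [normalize(v) if m else 0.0 for v, m in zip(row, m_row)]
        M.append(m_row)
        X.append(x_row)
    return M, X


# One fingerprint with three WAPs: observed, unobserved, weakly observed.
M, X = build_mask_and_features([[-60, 100, -104]])
```

Keeping $M$ separate from $X$ matters because a zero in $X$ alone cannot distinguish "not observed" from a very weak observed signal.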

3.2. Evaluation Metrics and Baselines

The primary performance metric used for evaluating localization accuracy is the mean Euclidean distance error, calculated between the predicted 2D coordinates (longitude and latitude) and the ground truth coordinates of the test samples. In addition to the mean error, we report several other statistical metrics to provide a comprehensive understanding of the error distribution: MAE, Mean Squared Error (MSE), Root Mean Square Error (RMSE), and the coefficient of determination ($R^{2}$). Lower values for mean error, MAE, MSE, and RMSE indicate better performance, while a higher $R^{2}$ value signifies a better fit of the model to the data.
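
A minimal sketch of these metrics is given below. The mean error is the average Euclidean distance per sample. As an assumption (the exact averaging is not spelled out in the text), MAE, MSE, and RMSE are averaged over individual coordinates, and $R^{2}$ is the unweighted mean of the per-coordinate scores; the function names are hypothetical.

```python
import math


def r2_score(y_true, y_pred):
    """Unweighted mean of per-coordinate R^2 (assumes each coordinate varies)."""
    dims = len(y_true[0])
    scores = []
    for d in range(dims):
        t = [y[d] for y in y_true]
        p = [y[d] for y in y_pred]
        mean_t = sum(t) / len(t)
        ss_res = sum((a - b) ** 2 for a, b in zip(t, p))
        ss_tot = sum((a - mean_t) ** 2 for a in t)
        scores.append(1.0 - ss_res / ss_tot)
    return sum(scores) / dims


def localization_metrics(y_true, y_pred):
    """Compute mean Euclidean error, MAE, MSE, RMSE, and R^2 for 2D coordinates."""
    dists = [math.hypot(px - tx, py - ty)
             for (tx, ty), (px, py) in zip(y_true, y_pred)]
    abs_errs = [abs(p - t)
                for yt, yp in zip(y_true, y_pred)
                for t, p in zip(yt, yp)]
    mse = sum(e * e for e in abs_errs) / len(abs_errs)
    return {
        "mean_error": sum(dists) / len(dists),
        "mae": sum(abs_errs) / len(abs_errs),
        "mse": mse,
        "rmse": math.sqrt(mse),
        "r2": r2_score(y_true, y_pred),
    }


# Toy example: only the first of four predictions is off, by (3, 4).
y_true = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
y_pred = [(3.0, 4.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
metrics = localization_metrics(y_true, y_pred)
```

Under this per-coordinate convention the RMSE can be smaller than the mean Euclidean error, which matches the pattern of the values reported in Table 2.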

Our proposed method, MLP with FALoc, involves training an MLP-based localization module using the augmented fingerprints generated by FALoc as described in Section 2. As a baseline for comparison, we implemented a standard MLP without FALoc. For this baseline, the MLP was trained directly on the original fingerprint data, where missing RSSI values (identified from the +100 dBm markers and subsequently masked in $M$) were represented as 0 after the same normalization process applied to FALoc’s input. This zero-filling approach is a common practice for handling missing data when no specialized mechanism is employed. This ensures a fair comparison, isolating the impact of the FALoc feature augmentation and imputation strategy. In addition to the zero-filled baseline, we also implemented a variant of FALoc using a traditional variational autoencoder (VAE) instead of the proposed VGAE, in order to assess the benefits of graph-based modeling. This VAE-based model was trained under the same augmentation and MLP pipeline, but without exploiting the graph structure of the fingerprints. While our framework also processes building and floor labels as part of the RP node features in the VGAE, the quantitative evaluation presented in this section focuses on the accuracy of 2D coordinate prediction, which is the most common primary metric for these datasets.

3.3. Experimental Setup

The UJIIndoorLoc and UTSIndoorLoc datasets were each divided into a training set and a testing set. The training samples were further partitioned, with 80% used for actual model training and 20% reserved as a validation set. This validation set was used for hyperparameter tuning and to implement an early stopping mechanism to prevent overfitting. Early stopping was triggered if the localization error on the validation set did not improve for a given number of epochs (the patience period).
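
The split and the stopping rule can be sketched as follows. The 80/20 ratio comes from the text; the random shuffling, the seed, and the patience value are assumptions made for illustration, and the names `split_train_val` and `EarlyStopping` are hypothetical.

```python
import random


def split_train_val(samples, val_fraction=0.2, seed=0):
    """Shuffle indices and hold out a fraction of the samples for validation."""
    rng = random.Random(seed)
    idx = list(range(len(samples)))
    rng.shuffle(idx)
    n_val = int(round(len(samples) * val_fraction))
    val = [samples[i] for i in idx[:n_val]]
    train = [samples[i] for i in idx[n_val:]]
    return train, val


class EarlyStopping:
    """Signal a stop once the validation error has not improved for `patience` epochs."""

    def __init__(self, patience):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_error):
        """Record one epoch's validation error; return True if training should stop."""
        if val_error < self.best:
            self.best = val_error
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


train, val = split_train_val(list(range(10)))

# Illustrative validation errors (in metres); patience of 2 is an assumed value.
stopper = EarlyStopping(patience=2)
stop_epoch = None
for epoch, err in enumerate([9.0, 8.5, 8.6, 8.7, 8.4]):
    if stopper.step(err):
        stop_epoch = epoch
        break
```

In practice the model weights from the best validation epoch would typically be restored after stopping; that bookkeeping is omitted here.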

An MLP served as the core architecture for the localization module ($F_{θ}$) in both the FALoc-enhanced setup and the baseline. We implemented FALoc and the baseline MLP model using the PyTorch 2.6.0 and PyTorch Geometric 2.4.0 libraries. To ensure optimal performance for both configurations, an extensive grid search was conducted to find the best hyperparameters for the MLP models and for the FALoc-specific components (e.g., VGAE architecture, $α$, $τ$). The optimized hyperparameters used for the final evaluation are detailed in Table 1.

In Table 1, D represents the dimensionality of the initial RP and WAP node features. The VGAE Encoder Architecture indicates the number of units in successive GAT layers leading to the parameters of the latent distributions; for instance, “First, Layer (256 → 128), Mean & Log Std. (128 → 64)” means the initial 256-dimensional features are processed by a GAT layer outputting 128 features, followed by another GAT layer outputting 64 features for both the mean and log standard deviation vectors, resulting in a latent dimension of 64 per node type. The VGAE Decoder for the mask is bilinear, and for RSSI values, it is an MLP. The Localization MLP Architecture “520 → 500 → 500 → 500 → 500 → 2” describes an MLP with an input layer (size corresponding to the number of WAPs, 520), four hidden layers each with 500 neurons, and an output layer predicting 2D coordinates.
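
As a rough indication of model size, the layer widths listed in Table 1 can be turned into a parameter count. The sketch assumes plain fully connected layers with bias terms and no additional parameters (for example, no normalization layers), which the text does not state explicitly; `count_mlp_parameters` is a hypothetical helper.

```python
def count_mlp_parameters(layer_sizes):
    """Weights plus biases of a fully connected network with the given layer widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


uji_params = count_mlp_parameters([520, 500, 500, 500, 500, 2])
uts_params = count_mlp_parameters([589, 64, 256, 128, 64, 2])
print(uji_params, uts_params)  # 1013002 95682
```

Under these assumptions the UJIIndoorLoc localization MLP has roughly one million parameters, about ten times more than the UTSIndoorLoc one.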

3.4. Results

Table 2 summarizes the positioning error statistics for both the baseline MLP and the MLP enhanced with our FALoc framework. The corresponding cumulative distribution function (CDF) of positioning errors is illustrated in Figure 3, providing a visual representation of the error distributions across all test samples.

The results clearly demonstrate that integrating FALoc significantly improves the localization performance of the MLP model across both datasets. On the UJIIndoorLoc dataset, the mean positioning error decreased from 8.191 m for the baseline MLP to 7.135 m with FALoc (VGAE), representing a relative improvement of approximately 12.9%. Similarly, the MAE dropped from 5.214 to 4.561 m, while the MSE was reduced from 66.294 to 49.603 m². The RMSE also improved from 8.142 to 7.043 m, a reduction of about 13.5%. These improvements in squared error metrics suggest that FALoc effectively mitigates large outliers in positioning error, likely by generating more reliable fingerprint representations in place of missing or noisy signals. The $R^{2}$ value also increased slightly from 0.991 to 0.993, indicating that the model with FALoc better explains the variance in true locations.

On the UTSIndoorLoc dataset, which features a different environment and Wi-Fi topology, a similar trend is observed. The mean positioning error improved from 7.808 m (baseline) to 7.138 m (FALoc-VGAE), a relative improvement of approximately 8.6%, and the MAE dropped from 4.819 to 4.232 m. The MSE decreased from 43.435 to 39.121 m², and the RMSE was reduced from 6.591 to 6.255 m. Although most absolute errors on UTSIndoorLoc are lower than those on UJIIndoorLoc, FALoc still improves every metric, indicating that it generalizes across different deployment settings. The increase in $R^{2}$ from 0.605 to 0.722 further supports that FALoc enhances the model’s ability to predict accurate positions even in more challenging or diverse environments.
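
The relative improvements quoted for the mean error and RMSE follow directly from the values in Table 2. A short check, with `relative_improvement` as a hypothetical helper name:

```python
def relative_improvement(baseline, improved):
    """Percentage reduction of an error metric relative to the baseline."""
    return 100.0 * (baseline - improved) / baseline


uji_mean = relative_improvement(8.191, 7.135)  # UJIIndoorLoc mean error
uji_rmse = relative_improvement(8.142, 7.043)  # UJIIndoorLoc RMSE
uts_mean = relative_improvement(7.808, 7.138)  # UTSIndoorLoc mean error

print(round(uji_mean, 1), round(uji_rmse, 1), round(uts_mean, 1))  # 12.9 13.5 8.6
```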

In addition to outperforming the baseline MLP, the proposed FALoc framework based on VGAE also outperforms the VAE-based variant on nearly all metrics across both datasets. On the UJIIndoorLoc dataset, the VGAE model achieved a lower mean error (7.135 m vs. 7.808 m) and MAE (4.561 m vs. 4.934 m) compared to the VAE model. Furthermore, VGAE showed substantial improvements in MSE (49.603 vs. 61.878 m²) and RMSE (7.043 m vs. 7.866 m), indicating its effectiveness in reducing both average and large deviations in localization. The $R^{2}$ scores reported in Table 2 are high for both models (0.9978 for VAE and 0.993 for VGAE). On the UTSIndoorLoc dataset, which presents a more diverse environment, similar patterns are observed. VGAE outperformed VAE across all metrics: mean error (7.138 m vs. 7.382 m), MAE (4.232 m vs. 4.413 m), MSE (39.121 vs. 40.074 m²), RMSE (6.255 m vs. 6.330 m), and $R^{2}$ (0.722 vs. 0.694). These results confirm that VGAE-based FALoc not only advances beyond the traditional zero-filled baseline, but also improves upon the VAE-based augmentation, offering more robust and generalizable localization performance across heterogeneous indoor environments.

Figure 3 further corroborates these findings. The CDF curve for the FALoc-enhanced MLP consistently lies above that of the baseline MLP across nearly all error ranges. This is especially pronounced in the lower error region (e.g., 0–10 m), where a notably larger fraction of test samples achieve lower positioning errors when FALoc is employed. For instance, with FALoc, over 75% of the predictions fall within a 10 meter error margin, compared to approximately 65% for the baseline. This upward shift in the CDF curve highlights not only a reduction in average error but also an improvement in the overall reliability and robustness of the localization model.
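
Statements such as "over 75% of the predictions fall within 10 m" are read off the empirical CDF of the per-sample errors. A minimal sketch of that computation follows; the error values are illustrative and are not taken from the paper's test sets, and the function names are hypothetical.

```python
def empirical_cdf(errors):
    """Return (error, cumulative fraction) pairs sorted by error."""
    ordered = sorted(errors)
    n = len(ordered)
    return [(e, (i + 1) / n) for i, e in enumerate(ordered)]


def fraction_within(errors, threshold):
    """Fraction of samples whose positioning error is at most `threshold`."""
    return sum(1 for e in errors if e <= threshold) / len(errors)


errors_m = [2.0, 4.0, 9.5, 12.0]  # illustrative per-sample errors in metres
cdf = empirical_cdf(errors_m)
within_10m = fraction_within(errors_m, 10.0)
```

A CDF curve that lies above another at a given error value means a larger share of samples falls within that error.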

In summary, the empirical evidence supports the efficacy of FALoc. By explicitly modeling RSSI observation probabilities and performing feature-level data augmentation to handle missing values, FALoc enables the localization model to learn more effectively from incomplete real-world data. This results in consistent improvements over the zero-filled baseline across the reported performance metrics, leading to more accurate and consistent indoor localization outcomes. The enhanced performance, particularly the reduction in larger errors, can be attributed to FALoc’s ability to provide the localization model with intelligently imputed and more complete feature vectors, mitigating the adverse effects of naively handled (e.g., zero-filled) missing data that the baseline model contends with.

4. Discussion and Future Work

4.1. Feasibility of Real-World Deployment

One important consideration for real-world deployment is the sensitivity of the threshold parameter $α$, which determines the confidence margin used in the neighbor filtering step. As $α$ directly influences the trade-off between recall and precision, its optimal value is not universal but must be adjusted according to the characteristics of the deployment environment, such as signal fluctuation, density of reference points, and physical obstructions. Practitioners should conduct environment-specific calibration to fully leverage the benefits of our framework. Future work could explore adaptive thresholding strategies that dynamically adjust $α$ based on the signal statistics of each dataset.

Another important consideration is the computational complexity. The proposed method utilizes deep learning only during the offline training phase, which is typically conducted on server-side infrastructure, where resource constraints are less critical. In contrast, the online inference phase, which runs on the user device, requires a computational load comparable to that of conventional fingerprinting-based inference methods. Therefore, the proposed framework is sufficiently feasible even on resource-constrained devices and does not impose significant computational overhead during real-time inference. In addition, while the use of graph-based modeling may raise concerns regarding scalability, particularly in large-scale graphs, it is important to note that such operations are confined to the offline training phase. Moreover, in our current datasets, the number of WAPs is around 500, which reflects realistic deployment scenarios and remains well within the capacity of standard training infrastructure. Thus, scalability does not pose a practical limitation in the context of our framework.

To evaluate the feasibility of real-time deployment, we measured the computation time during the online inference phase. Table 3 presents the training and inference times of the proposed approach on both datasets. As shown in Table 3, the inference time of our method is significantly lower than that of conventional WKNN-based techniques [13], which typically operate in the tens of milliseconds, thereby demonstrating strong potential for real-time application even under strict latency constraints.
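
Inference latency of this kind can be measured with a simple wall-clock loop. The sketch below is a generic pattern, not the authors' measurement protocol: whether Table 3 reports per-sample or per-batch time is not stated, and `predict` is a hypothetical stand-in for the trained localization module.

```python
import time


def mean_inference_time_ms(predict, queries, repeats=5):
    """Average wall-clock time per query in milliseconds."""
    predict(queries[0])  # warm-up call, excluded from the timing
    start = time.perf_counter()
    for _ in range(repeats):
        for q in queries:
            predict(q)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / (repeats * len(queries))


def dummy_predict(x):
    """Placeholder model: returns a fixed 2D coordinate."""
    return (sum(x) * 0.0, 0.0)


queries = [[0.0] * 520 for _ in range(10)]  # ten 520-dimensional fingerprints
latency_ms = mean_inference_time_ms(dummy_predict, queries)
```

Averaging over several repeats and discarding a warm-up call reduces the influence of one-off start-up costs on the reported figure.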

4.2. Future Work

While FALoc demonstrates promising results, we acknowledge certain areas for future exploration. For instance, the current VGAE architecture and augmentation policy, though effective, might benefit from further optimization for even greater efficiency or adaptability to different types of missingness.

Building upon this work, our future research will focus on several key directions. We plan to investigate more advanced graph neural network architectures. For instance, recent work has applied self-attention mechanisms to graph learning to improve representational capacity [33]. Following such examples, we intend to explore architectures incorporating temporal dynamics or attention mechanisms tailored for heterogeneous graph structures, to further refine the modeling of RP-WAP interactions.

Additionally, we aim to develop adaptive augmentation policies that can dynamically adjust the imputation and deletion strategy based on data characteristics or even feedback from the localization model during training. In particular, since Wi-Fi signals are highly sensitive to environmental changes, recent studies have explored models that adapt to variations across different data collection domains. For example, some works have focused on improving feature extractors to alleviate domain imbalance [34], while others have enhanced the mapping between signal variations and target locations to achieve robust localization [35]. Exploring the framework’s scalability and performance across a wider range of diverse indoor environments, including those with extremely high levels of data sparsity, also remains a priority.

In the current study, all experiments were conducted using two benchmark datasets. While this allowed us to verify the effectiveness of the proposed method under controlled conditions, we acknowledge that further evaluation on additional datasets is necessary to validate its generalizability across different domains. However, due to differences in label definitions, feature spaces, and data collection protocols, applying the same experimental setting across a larger number of datasets was considered out of the current study’s scope. As part of our future work, we plan to expand our evaluation to include more datasets collected under varying conditions, which will allow a more comprehensive assessment of domain robustness and practical applicability.

Another important direction for future research is to conduct a detailed ablation study to isolate and quantify the contribution of each component in the proposed framework. The current design was developed as an integrated pipeline, where all components are intended to operate synergistically rather than independently. Therefore, we focused on validating the effectiveness of the full system rather than optimizing individual parts in isolation. While we acknowledge that such synergy may not be fully captured through ablation analysis, we recognize the importance of understanding the internal contribution of each module. A comprehensive ablation study will be considered in future work to further clarify the role of each component in enhancing overall localization performance.

While our proposed method demonstrates promising performance compared to the MLP baseline, it has not yet been benchmarked against other state-of-the-art models such as GAN-based, CNN-based, or RNN-based approaches. This limitation stems from practical constraints in the current study, such as time and scope. Nonetheless, we acknowledge that such comparisons are essential to more rigorously assess the generalizability and competitiveness of the proposed framework. As part of our future work, we plan to conduct extensive comparisons with a broader range of advanced models, including GANs, CNNs, and RNNs, which have shown strong performance in data augmentation and localization tasks. Such benchmarking will provide deeper insights into the strengths and limitations of our method in comparison to alternative architectures and strategies.

5. Conclusions

This paper introduced FALoc, a novel framework for Wi-Fi fingerprinting-based indoor localization, designed to effectively address the critical challenge of missing RSSI values, a common issue that degrades localization accuracy in real-world environments. Our core contribution lies in a feature-level data augmentation technique rooted in a graph-based representation of Wi-Fi fingerprints. By employing a bipartite graph of RPs and WAPs and utilizing a VGAE, FALoc learns to model not only the RSSI values themselves but also their observation probabilities. Our empirical evaluation on the UJIIndoorLoc and UTSIndoorLoc datasets, benchmarks known for their data inconsistencies, validated the effectiveness of FALoc. The framework achieved mean localization errors of 7.137 m (12.9% gain) on UJIIndoorLoc and 7.138 m (8.6% gain) on UTSIndoorLoc, consistently outperforming the MLP baselines. This improvement underscores the practical benefits of directly tackling data incompleteness, potentially leading to more reliable and robust indoor localization systems with reduced susceptibility to dynamic signal environments.

Author Contributions

Conceptualization, D.K. and J.-H.P.; methodology, D.K.; software, D.K.; validation, D.K., J.-H.P. and Y.-J.S.; formal analysis, D.K.; investigation, J.-H.P.; resources, D.K.; data curation, D.K.; writing—original draft preparation, D.K.; writing—review and editing, D.K. and J.-H.P.; visualization, J.-H.P.; supervision, Y.-J.S.; project administration, Y.-J.S.; funding acquisition, Y.-J.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partly supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (RS-2022-NR070870), and Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. RS-2019-II191906, Artificial Intelligence Graduate School Program (POSTECH)).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data supporting the results presented in this paper are openly available in UCI Machine Learning Repository at https://doi.org/10.24432/C5MS59.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

FALoc	Feature-level Augmentation for Localization
RSSI	Received Signal Strength Indicator
MLP	Multi-Layer Perceptron
MAE	Mean Absolute Error
MSE	Mean Squared Error
RMSE	Root Mean Square Error
CDF	Cumulative distribution function
WLAN	Wireless Local Area Network
RP	Reference Point
GAN	Generative Adversarial Network
WAP	Wireless Access Point
VGAE	Variational Graph Auto-Encoder
VAE	Variational Autoencoder
LBS	Location-based Service
AP	Access Point
ML	Machine Learning
DL	Deep Learning
RF	Random Forest
WKNN	Weighted K-nearest Neighbor
CNN	Convolutional Neural Network
RNN	Recurrent Neural Network
SVM	Support Vector Machine
MOGP	Multi-Output Gaussian Process
GAT	Graph Attention Network
GNN	Graph Neural Network

References

1. He, S.; Chan, S.H.G. Wi-Fi Fingerprint-Based Indoor Positioning: Recent Advances and Comparisons. IEEE Commun. Surv. Tutor. 2016, 18, 466–490.
2. Wu, Z.; Fu, K.; Jedari, E.; Shuvra, S.R.; Rashidzadeh, R.; Saif, M. A fast and resource efficient method for indoor positioning using received signal strength. IEEE Trans. Veh. Technol. 2016, 65, 9747–9758.
3. Rezgui, Y.; Pei, L.; Chen, X.; Wen, F.; Han, C. An efficient normalized rank based SVM for room level indoor WiFi localization with diverse devices. Mob. Inf. Syst. 2017, 2017, 6268797.
4. Zhang, S.; Guo, J.; Wang, W.; Hu, J. Indoor 2.5 D positioning of WiFi based on SVM. In Proceedings of the 2018 Ubiquitous Positioning, Indoor Navigation and Location-Based Services (UPINLBS), Wuhan, China, 22–23 March 2018; pp. 1–7.
5. Bi, J.; Zhao, M.; Yao, G.; Cao, H.; Feng, Y.; Jiang, H.; Chai, D. PSOSVRPos: WiFi indoor positioning using SVR optimized by PSO. Expert Syst. Appl. 2023, 222, 119778.
6. Guo, X.; Ansari, N.; Li, L.; Li, H. Indoor localization by fusing a group of fingerprints based on random forests. IEEE Internet Things J. 2018, 5, 4686–4698.
7. Gomes, R.; Ahsan, M.; Denton, A. Random forest classifier in SDN framework for user-based indoor localization. In Proceedings of the 2018 IEEE International Conference on Electro/Information Technology (EIT), Rochester, MI, USA, 3–5 May 2018; pp. 537–542.
8. Maung, N.A.M.; Lwi, B.Y.; Thida, S. An enhanced RSS fingerprinting-based wireless indoor positioning using random forest classifier. In Proceedings of the 2020 International Conference on Advanced Information Technologies (ICAIT), Yangon, Myanmar, 4–5 November 2020; pp. 59–63.
9. Narasimman, S.C.; Alphones, A. DumbLoc: Dumb Indoor Localization Framework Using Wi-Fi Fingerprinting. IEEE Sens. J. 2024, 24, 14623–14630.
10. Bahl, P.; Padmanabhan, V. RADAR: An in-building RF-based user location and tracking system. In Proceedings of the IEEE INFOCOM, Tel Aviv, Israel, 26–30 March 2000; pp. 775–784.
11. Wang, B.; Gan, X.; Liu, X.; Yu, B.; Jia, R.; Huang, L.; Jia, H. A novel weighted KNN algorithm based on RSS similarity and position distance for Wi-Fi fingerprint positioning. IEEE Access 2020, 8, 30591–30602.
12. Hu, J.; Hu, C. A WiFi Indoor Location Tracking Algorithm Based on Improved Weighted K Nearest Neighbors and Kalman Filter. IEEE Access 2023, 11, 32907–32918.
13. Park, J.H.; Kim, D.; Suh, Y.J. WKNN-Based Wi-Fi Fingerprinting with Deep Distance Metric Learning via Siamese Triplet Network for Indoor Positioning. Electronics 2024, 13, 4448.
14. Kim, K.S.; Lee, S.; Huang, K. A scalable deep neural network architecture for multi-building and multi-floor indoor localization based on Wi-Fi fingerprinting. Big Data Anal. 2018, 3, 4.
15. Wang, R.; Li, Z.; Luo, H.; Zhao, F.; Shao, W.; Wang, Q. A robust Wi-Fi fingerprint positioning algorithm using stacked denoising autoencoder and multi-Layer perceptron. Remote Sens. 2019, 11, 1293.
16. Song, X.; Fan, X.; Xiang, C.; Ye, Q.; Liu, L.; Wang, Z.; He, X.; Yang, N.; Fang, G. A novel convolutional neural network based indoor localization framework with WiFi fingerprinting. IEEE Access 2019, 7, 110698–110709.
17. Kargar-Barzi, A.; Farahmand, E.; Taheri Chatrudi, N.; Mahani, A.; Shafique, M. An Edge-Based WiFi Fingerprinting Indoor Localization Using Convolutional Neural Network and Convolutional Auto-Encoder. IEEE Access 2024, 12, 85050–85060.
18. Ayinla, S.L.; Aziz, A.A.; Drieberg, M. SALLoc: An Accurate Target Localization in WiFi-Enabled Indoor Environments via SAE-ALSTM. IEEE Access 2024, 12, 19694–19710.
19. Sorour, S.; Lostanlen, Y.; Valaee, S.; Majeed, K. Joint Indoor Localization and Radio Map Construction with Limited Deployment Load. IEEE Trans. Mob. Comput. 2015, 14, 1031–1043.
20. Jung, S.h.; Moon, B.c.; Han, D. Unsupervised Learning for Crowdsourced Indoor Localization in Wireless Networks. IEEE Trans. Mob. Comput. 2016, 15, 2892–2906.
21. Talvitie, J.; Renfors, M.; Lohan, E.S. Distance-Based Interpolation and Extrapolation Methods for RSS-Based Localization with Indoor Wireless Signals. IEEE Trans. Veh. Technol. 2015, 64, 1340–1353.
22. Bi, J.; Wang, Y.; Li, Z.; Xu, S.; Zhou, J.; Sun, M.; Si, M. Fast Radio Map Construction by using Adaptive Path Loss Model Interpolation in Large-Scale Building. Sensors 2019, 19, 712.
23. Njima, W.; Chafii, M.; Chorti, A.; Shubair, R.M.; Poor, H.V. Indoor Localization Using Data Augmentation via Selective Generative Adversarial Networks. IEEE Access 2021, 9, 98337–98347.
24. Sugasaki, M.; Shimosaka, M. Robustifying Wi-Fi Localization by Between-Location Data Augmentation. IEEE Sens. J. 2022, 22, 5407–5416.
25. Tang, Z.; Li, S.; Kim, K.S.; Smith, J.S. Multi-Dimensional Wi-Fi Received Signal Strength Indicator Data Augmentation Based on Multi-Output Gaussian Process for Large-Scale Indoor Localization. Sensors 2024, 24, 1026.
26. Deng, Y.; Han, C.; Guo, J.; Sun, L. Temporal and Spatial Nearest Neighbor Values Based Missing Data Imputation in Wireless Sensor Networks. Sensors 2021, 21, 1782.
27. Torres-Sospedra, J.; Montoliu, R.; Martínez-Usó, A.; Avariento, J.P.; Arnau, T.J.; Benedito-Bordonau, M.; Huerta, J. UJIIndoorLoc: A new multi-building and multi-floor database for WLAN fingerprint-based indoor localization problems. In Proceedings of the 2014 International Conference on Indoor Positioning and Indoor Navigation (IPIN), Busan, Republic of Korea, 27–30 October 2014; pp. 261–270.
28. Kipf, T.N.; Welling, M. Variational graph auto-encoders. arXiv 2016, arXiv:1611.07308.
29. Yang, L.; Wu, N.; Li, B.; Yuan, W.; Hanzo, L. Indoor Localization Based on Factor Graphs: A Unified Framework. IEEE Internet Things J. 2023, 10, 4353–4366.
30. Chen, X.; Li, H.; Zhou, C.; Liu, X.; Wu, D.; Dudek, G. Fidora: Robust WiFi-Based Indoor Localization via Unsupervised Domain Adaptation. IEEE Internet Things J. 2022, 9, 9872–9888.
31. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lió, P.; Bengio, Y. Graph attention networks. In Proceedings of the International Conference on Learning Representations (ICLR), Vancouver, BC, Canada, 30 April–3 May 2018.
32. Maddison, C.J.; Mnih, A.; Teh, Y.W. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables. In Proceedings of the International Conference on Learning Representations (ICLR), Toulon, France, 24–26 April 2017.
33. Wu, Z.; Wu, X.; Long, Y. Multi-Level Federated Graph Learning and Self-Attention Based Personalized Wi-Fi Indoor Fingerprint Localization. IEEE Commun. Lett. 2022, 26, 1794–1798.
34. Zhou, Z.; Wang, F.; Gong, W. i-Sample: Augment Domain Adversarial Adaptation Models for WiFi-based HAR. ACM Trans. Sens. Netw. 2024, 20, 1–20.
35. Zhang, J.; Xue, J.; Li, Y.; Cotton, S.L. Leveraging Online Learning for Domain-Adaptation in Wi-Fi-based Device-Free Localization. IEEE Trans. Mob. Comput. 2025, 24, 7773–7787.

Figure 1. Histogram showing the distribution of $m_{i}$ for all RPs in the UJIIndoorLoc dataset, highlighting the prevalence of missing data.

Figure 2. Architecture of the FALoc framework. It comprises several key stages: (1) bipartite graph construction from fingerprint data (feature matrix $X$ and mask matrix $M$); (2) a VGAE that learns node embeddings and reconstructs the graph, predicting RSSI observation likelihoods ($\hat{M}$) and values ($\hat{e}_{ik}$); (3) an augmentation module that stochastically imputes or removes RSSI features based on VGAE outputs to generate augmented fingerprints ($\tilde{X}$); and (4) a localization module that uses these augmented fingerprints to predict location labels ($\hat{Y}^{l}$).

Figure 3. CDF of positioning errors for MLP with and without FALoc.

Table 1. Hyperparameters used for evaluation.

Parameter	Value
D (Initial node feature dim.)	256
VGAE Encoder Arch. (GAT layers)	First, Layer (256 → 128), Mean & Log Std. (128 → 64)
VGAE Latent Dim. ( $D_{z}$ )	64
VGAE Decoder Arch. (Mask)	Bilinear ( $64 \times 64$ )
VGAE Decoder Arch. (RSSI)	Linear ( $64 \to 1$ )
VAE Input Dim.	1040 (RSSI: 520 + Mask: 520)
VAE Encoder Arch.	1040 → 512 → 256 → 128
VAE Decoder Arch.	128 → 256 → 512 → 1040
$α$ (Augmentation threshold)	0.96
$τ$ (Gumbel-softmax temp.)	0.05
Localization MLP Arch. ( $F_{θ}$ , UJIIndoorLoc)	520 → 500 → 500 → 500 → 500 → 2
Localization MLP Arch. ( $F_{θ}$ , UTSIndoorLoc)	589 → 64 → 256 → 128 → 64 → 2
MLP dropout ( $F_{θ}$ , UJIIndoorLoc)	0.5
MLP dropout ( $F_{θ}$ , UTSIndoorLoc)	0.0
Edge dropout (VGAE training)	0.3
Activation function	ReLU
Optimizer	Adam (lr = 0.001)
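To give a sense of scale for the two localization MLPs listed in Table 1, the sketch below counts trainable parameters from the layer widths. It assumes plain fully connected layers with bias terms, which the table does not state explicitly; the helper name `dense_param_count` is illustrative.

```python
def dense_param_count(layer_dims, bias=True):
    """Count weights (and optionally biases) of a fully connected stack."""
    total = 0
    for fan_in, fan_out in zip(layer_dims, layer_dims[1:]):
        total += fan_in * fan_out + (fan_out if bias else 0)
    return total


# Layer widths copied from Table 1.
UJI_DIMS = [520, 500, 500, 500, 500, 2]
UTS_DIMS = [589, 64, 256, 128, 64, 2]

if __name__ == "__main__":
    print("UJIIndoorLoc MLP parameters:", dense_param_count(UJI_DIMS))
    print("UTSIndoorLoc MLP parameters:", dense_param_count(UTS_DIMS))
```

Under these assumptions the UJIIndoorLoc model has roughly 1.01 million parameters and the UTSIndoorLoc model roughly 96 thousand, which is consistent with the heavier dropout (0.5) used for the larger network.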

Table 2. Positioning error statistics and metrics on UJIIndoorLoc and UTSIndoorLoc datasets.

UJIIndoorLoc Dataset
Method	Mean Error (m)	MAE (m)	MSE (m²)	RMSE (m)	$R^{2}$
MLP with FALoc (VGAE)	7.137	4.561	49.603	7.043	0.993
MLP with FALoc (VAE)	7.808	4.934	61.878	7.866	0.9978
MLP without FALoc (Baseline)	8.191	5.214	66.294	8.142	0.991
UTSIndoorLoc Dataset
Method	Mean Error (m)	MAE (m)	MSE (m²)	RMSE (m)	$R^{2}$
MLP with FALoc (VGAE)	7.138	4.232	39.121	6.255	0.722
MLP with FALoc (VAE)	7.382	4.413	40.074	6.330	0.694
MLP without FALoc (Baseline)	7.808	4.819	43.435	6.591	0.605
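The relative improvements quoted in the abstract (approximately 12.9% and 8.6%) follow directly from the mean errors. The short sketch below reproduces that arithmetic from the mean errors stated in the abstract and also checks that the RMSE column equals the square root of the MSE column for the VGAE rows. The function name `relative_improvement` is illustrative.

```python
import math


def relative_improvement(baseline, improved):
    """Percentage reduction of `improved` relative to `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Mean localization errors in metres (baseline vs. FALoc with VGAE),
# as stated in the abstract.
uji_gain = relative_improvement(8.191, 7.137)
uts_gain = relative_improvement(7.808, 7.138)

# MSE values (m^2) of the VGAE rows in Table 2.
uji_rmse = math.sqrt(49.603)
uts_rmse = math.sqrt(39.121)

if __name__ == "__main__":
    print(f"UJIIndoorLoc improvement: {uji_gain:.1f}%")
    print(f"UTSIndoorLoc improvement: {uts_gain:.1f}%")
    print(f"UJIIndoorLoc RMSE from MSE: {uji_rmse:.3f} m")
    print(f"UTSIndoorLoc RMSE from MSE: {uts_rmse:.3f} m")
```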

Table 3. Computation time of the proposed method in training and inference phases (in ms).

Dataset	Training Time (ms)	Inference Time (ms)
UJIIndoorLoc	140.0	0.261
UTSIndoorLoc	117.0	0.226

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
