1. Introduction
Indoor localization plays a crucial role in diverse applications such as smart cities, smart healthcare, industry and logistics, sales and retail, travel and tourism, and public security [1]. In this context, the Global Positioning System (GPS) has been widely adopted for outdoor environments, where receivers operate passively by capturing satellite signals. However, owing to non-line-of-sight propagation, weak signal strength, and low positioning accuracy, GPS is not suitable for indoor applications [2]. Consequently, indoor environments require dedicated indoor positioning techniques capable of estimating user locations within buildings. Over the past decade, various technologies, algorithms, and methodologies have been proposed for indoor positioning, aimed at enhancing positioning accuracy and system reliability.
Indoor localization techniques can be divided into geometry-based ranging methods (e.g., time of arrival, time difference of arrival, and angle of arrival) and signal strength-based approaches [3,4]. Although ranging methods can achieve high accuracy, they typically require multiple synchronized reference nodes [5] or specialized antenna arrays [6], rendering their practical deployment challenging and expensive. Among signal strength-based methods, fingerprinting is a widely used approach that circumvents multipath effects by constructing a mapping between spatial locations and observable features such as WiFi, Bluetooth, or other radio-frequency signals [7]. In typical fingerprinting systems, received signal strength indicators (RSSIs) are collected at reference points (RPs) during the offline phase and used to train a pattern recognition algorithm. In the online phase, the trained algorithm estimates the user location based on newly observed RSSIs [8]. Notably, accurate estimation of the building and floor (for multi-building and multi-floor environments) is essential for reducing positioning errors [9]. The present work focuses on floor classification.
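As a concrete illustration of the two phases, the following minimal Python sketch stores labelled fingerprints offline and classifies a new scan online by majority vote among its nearest neighbours. Plain kNN is used here only as a stand-in for the pattern recognition algorithm; the helper names, the toy MAC strings, and the -100 dBm fill value for unheard access points are assumptions made for this example and are not part of the proposed method.

```python
import math
from collections import Counter

MISSING_RSSI = -100.0  # assumed placeholder (dBm) for access points not heard in a scan


def to_vector(scan, ap_order):
    """Turn a {MAC: RSSI} scan into a fixed-length vector over the known access points."""
    return [scan.get(mac, MISSING_RSSI) for mac in ap_order]


def build_radio_map(labelled_scans):
    """Offline phase: keep one (vector, floor) pair per labelled scan."""
    ap_order = sorted({mac for scan, _ in labelled_scans for mac in scan})
    radio_map = [(to_vector(scan, ap_order), floor) for scan, floor in labelled_scans]
    return ap_order, radio_map


def predict_floor(scan, ap_order, radio_map, k=3):
    """Online phase: majority vote over the k closest stored fingerprints."""
    query = to_vector(scan, ap_order)
    ranked = sorted(radio_map, key=lambda item: math.dist(query, item[0]))
    votes = Counter(floor for _, floor in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data, made up for illustration: two access points heard on each floor.
training = [
    ({"aa:01": -40, "aa:02": -55}, 3),
    ({"aa:01": -45, "aa:02": -50}, 3),
    ({"aa:01": -42, "aa:02": -60}, 3),
    ({"bb:01": -38, "bb:02": -52}, 4),
    ({"bb:01": -47, "bb:02": -49}, 4),
    ({"bb:01": -41, "bb:02": -58}, 4),
]
ap_order, radio_map = build_radio_map(training)
predicted = predict_floor({"aa:01": -43, "aa:02": -57}, ap_order, radio_map)
```

In this toy setting the query scan only hears the access points seen on floor 3, so its three nearest stored fingerprints all carry that label.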
RSSI-based fingerprinting methods encounter several challenges. RSSI values are highly unstable owing to multipath propagation, shadowing, and device heterogeneity, resulting in inconsistent measurements even at the same location [10]. Conventional algorithms such as Bayesian inference [11], k-nearest neighbour (kNN) [12], and support vector machines [13] offer low computational complexity but remain sensitive to signal fluctuations, leading to unstable accuracy in practice. Recent advancements in deep learning have been explored to reduce computational complexity and storage, shorten training time, and mitigate RSSI fluctuations, thus addressing the limitations of traditional machine learning algorithms [14]. However, achieving high accuracy often comes at the cost of significant training overhead. Existing studies [9,15,16,17,18] have typically treated floor classification as an independent task, relying on conventional machine learning methods or additional hardware (e.g., barometers or channel state information). The performance of these techniques nevertheless deteriorates when labelled data are limited.
Graph neural networks (GNNs) have emerged as powerful deep learning models for graph-structured and non-Euclidean data. Leveraging message-passing mechanisms, GNNs can effectively capture neighbourhood dependencies and learn expressive node and graph representations, achieving superior performance in node and graph classification tasks [19]. Recently, GNNs have been integrated into the FIS-ONE framework for indoor floor identification [20]. Moreover, IndoorGNN offers a supervised end-to-end classification approach by modelling WiFi RSSI vectors as homogeneous graph nodes with kNN-based dynamic edges [21]. Notably, most existing GNNs focus on homogeneous graphs with single-type nodes and edges, even though heterogeneous graphs contain richer structural and semantic information. Advances in heterogeneous graph [22] representation learning can effectively enhance the performance of complex network analysis [23].
To address the limitations of conventional fingerprinting methods in floor classification, this paper proposes a heterogeneous graph-based learning framework. Instead of relying solely on raw RSSI vectors, the proposed approach transforms WiFi fingerprints into a heterogeneous graph and applies a graph convolutional network for relation-aware message passing. The resulting node feature representations are then mapped to discrete floor predictions. Extensive experiments validate the effectiveness and practicality of the proposed framework. The main contributions of this work can be summarized as follows:
We propose a heterogeneous graph construction framework for floor classification, in which RPs and MAC addresses are explicitly represented as distinct node types connected through relation-specific edges. This formulation captures the multi-level dependencies inherent in WiFi fingerprinting data more effectively than traditional homogeneous graphs.
Building on this structure, we develop a novel heterogeneous graph neural network (HeteroGNN) that introduces edge-type-conditioned message passing on a heterogeneous graph. This design enhances the discriminative capability of nodes.
We perform a comprehensive comparison with conventional convolutional neural networks (CNNs), showing that the proposed approach remains robust on small datasets and under noise. Moreover, the proposed method is computationally efficient, requiring less training time and memory than CNNs, while achieving the smallest peak memory footprint among all evaluated models.
In addition, the proposed method outperforms conventional homogeneous GNNs on the UJIIndoorLoc dataset. It also consistently outperforms traditional machine learning methods such as Random Forest, AdaBoost, XGBoost, LightGBM, and CatBoost.
The remainder of this paper is organized as follows. Section 2 reviews related work in indoor localization and floor classification. Section 3 describes the experimental environment, data collection, and preprocessing. Section 4 outlines the proposed methodology, detailing the heterogeneous graph construction and HeteroGNN model. Section 5 presents the experimental results and provides a comprehensive comparison with baseline methods. Finally, Section 6 presents the concluding remarks and highlights potential directions for future research.
3. System Design and Data Collection
This section describes the proposed floor-classification system, including the experimental environment, setup, hardware and software configurations, data collection methods, database structure, and complete floor-classification workflow.
3.1. Environment and Experimental Setup
This study adopted the Communication Research Laboratory (CRL) database, established at Dongguk University, Seoul Campus. RSSI fingerprint data collection and real-time localization experiments were conducted within two academic buildings. A total of 296 RPs were defined, with 74 RPs per floor across four floors. Each RP block measured approximately 2 m × 2 m, with an inter-RP spacing of 4 m. As shown in Figure 1, the experimental environment included the 7th and 8th floors of the New Engineering Building and the 3rd and 4th floors of Wonheung Hall. The RPs were numbered as follows: New Engineering Building 7th floor (RPs 1–74) and 8th floor (RPs 75–148), and Wonheung Hall 3rd floor (RPs 149–222) and 4th floor (RPs 223–296).
Figure 1 also illustrates the spatial layout of the indoor test environment, which comprises two areas with distinct geometric characteristics. Figure 1a,b correspond to a regular rectangular region measuring 45 m × 34 m, where 74 RPs were uniformly distributed on each floor. In contrast, Figure 1c,d represent an elongated and less regular floor layout with overall dimensions of 106 m × 90 m, also configured with 74 uniformly arranged RPs per floor, but spanning a considerably larger spatial extent.
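The RP numbering scheme above can be captured in a small lookup. The following Python sketch maps a global RP index to its building, floor, and within-floor position; the function name `locate_rp` and the tuple layout are illustrative assumptions, while the ranges themselves are those given in this section.

```python
RPS_PER_FLOOR = 74

# (first RP, building, floor) in ascending RP order, taken from the numbering in Section 3.1.
FLOOR_BLOCKS = [
    (1, "New Engineering Building", 7),
    (75, "New Engineering Building", 8),
    (149, "Wonheung Hall", 3),
    (223, "Wonheung Hall", 4),
]


def locate_rp(rp_id):
    """Return (building, floor, position within floor) for a global RP index (1-296)."""
    total = RPS_PER_FLOOR * len(FLOOR_BLOCKS)
    if not 1 <= rp_id <= total:
        raise ValueError(f"RP index must be between 1 and {total}, got {rp_id}")
    first, building, floor = FLOOR_BLOCKS[(rp_id - 1) // RPS_PER_FLOOR]
    return building, floor, rp_id - first + 1
```

For example, `locate_rp(149)` resolves to the first RP on the 3rd floor of Wonheung Hall, and an index outside 1–296 raises a `ValueError`.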
3.2. Hardware and Software Configurations
Hardware: An MSI GE75 laptop (MSI, New Taipei City, Taiwan) was used as the server for data processing. A custom-built robot was used for data collection and real-time localization. The robot included a Jetson Nano B01 (NVIDIA, Santa Clara, CA, USA), a dual-band wireless network card, a display screen, a cooling fan, a ROS robotic driver board, motors with encoders, a tracked chassis, a battery, and a firmware base module, along with other essential components.
Software: The robot operated on Ubuntu 18.04 with Python 3.11.7. The server ran TensorFlow 2.14.0 and Python 3.11.7 on an Intel Core i7-8700K @ 3.70 GHz CPU and NVIDIA GeForce GTX 1080 Ti GPU. In terms of software functionality, the robot incorporated a custom-designed RSSI data collection program, while the server executed the data preprocessing module, a heterogeneous GNN-based floor classification program, and a real-time floor classification program.
Figure 2 offers a clearer visualization of the localization scenario. Specifically, Figure 2a,d illustrate the New Engineering Building and Wonheung Hall, respectively. Numerical indicators 3, 4, 7, and 8 denote the corresponding floor levels within each building. Figure 2b depicts the 7th floor of the New Engineering Building, while Figure 2c illustrates the experiment conducted on the 7th floor. During data acquisition, the robot collected RSSI signals from multiple access points (APs). These data were transmitted to the server for preprocessing and database construction. Notably, the robot sequentially moved from RP1 to RP74 to complete data collection. For real-time floor classification, the robot collected RSSI signals from different APs at an unknown RP and transmitted them to the server, which performed real-time floor classification. The computed classification results were then displayed on the computer.
3.3. Robot-Assisted Data Collection
The robot was remotely controlled, and a specialized program was run to initiate data collection. Once executed, the program automatically collected RSSI data for a predefined number of samples. Each data file contained the MAC addresses and corresponding RSSI values of all detectable WiFi signals at the given RP. The collected data were then automatically saved in a text (.txt) file within a designated folder. Following data collection, the program entered a verification phase. If a .txt file was found to be empty, it was automatically deleted, and a new file was generated at the same location. If the file contained data, the program was terminated. The data collection process explicitly considered temporal variations in RSSI measurements, defining the morning as 0:00–12:00 and the afternoon as 12:00–24:00. To capture potential time-of-day effects, data were collected independently in both periods. Moreover, the entire data collection campaign was conducted over a continuous period of four days.
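The collect-then-verify loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the robot's actual program: `scan_fn` is a hypothetical stand-in for the WiFi scanner, the one-line-per-reading file format and the `max_attempts` retry limit are invented for the example, and only the empty-file check and the morning/afternoon split come from the text.

```python
import os
import tempfile
from datetime import datetime


def period_of_day(timestamp):
    """Morning is 0:00-12:00 and afternoon is 12:00-24:00, as defined in the text."""
    return "morning" if timestamp.hour < 12 else "afternoon"


def collect_at_rp(scan_fn, out_path, n_samples, max_attempts=5):
    """Write n_samples scans to out_path; delete and regenerate the file if it is empty.

    scan_fn is a hypothetical callable returning a list of (mac, rssi) pairs.
    max_attempts is an assumed safety limit that the source does not specify.
    """
    for _attempt in range(max_attempts):
        with open(out_path, "w", encoding="utf-8") as fh:
            for _sample in range(n_samples):
                for mac, rssi in scan_fn():
                    fh.write(f"{mac} {rssi}\n")  # assumed line format
        if os.path.getsize(out_path) > 0:
            return True  # file contains data, so collection ends
        os.remove(out_path)  # empty file: delete it and try again
    return False


# Demonstration with a fake scanner that returns nothing for its first two calls.
calls = {"n": 0}


def fake_scan():
    calls["n"] += 1
    return [] if calls["n"] <= 2 else [("aa:bb:cc:dd:ee:01", -48)]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "rp001_morning.txt")
    ok = collect_at_rp(fake_scan, path, n_samples=2)
    with open(path, encoding="utf-8") as fh:
        lines = fh.read().splitlines()
```

In the demonstration the first attempt yields an empty file, which is removed, and the second attempt writes two readings.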
After data collection, data preprocessing was performed, followed by the generation of the RSSI database. The database consisted of four main components: Floor Labels, Location Labels (representing RPs), MAC Addresses, and RSSI Values. The first column contained the floor labels, which were categorized into four levels: 3rd, 4th, 7th, and 8th. Similarly, the second column contained location labels, ranging from 1 to 296, with 74 locations per floor. The first row represented the MAC addresses, while the lower-right section of the table contained the RSSI values. The database was stored as an .xlsx file using Microsoft Excel.
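The table layout described above (floor label, location label, then one RSSI column per MAC address) can be sketched in a few lines of Python. The function name `build_table`, the column names `floor` and `rp`, and the -100 dBm fill value for APs missing from a scan are assumptions for illustration; the source does not state how absent readings are encoded.

```python
MISSING_RSSI = -100  # assumed fill value (dBm) for MAC addresses absent from a scan


def build_table(records):
    """Arrange scans as rows of [floor, rp, rssi_1, ..., rssi_n].

    records: list of (floor_label, rp_id, {mac: rssi}) tuples.
    Returns (header, rows), where the header lists the MAC addresses after the two labels.
    """
    macs = sorted({mac for _, _, scan in records for mac in scan})
    header = ["floor", "rp"] + macs
    rows = [
        [floor, rp] + [scan.get(mac, MISSING_RSSI) for mac in macs]
        for floor, rp, scan in records
    ]
    return header, rows


# Toy records, made up for illustration.
records = [
    (7, 1, {"aa:01": -45, "aa:02": -60}),
    (3, 149, {"aa:02": -70, "cc:09": -52}),
]
header, rows = build_table(records)
```

Each row then has the same length as the header, which is what allows the table to be saved as a rectangular spreadsheet.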
3.4. Classification Workflow
Figure 3 illustrates the robot-assisted WiFi-based floor classification system using a heterogeneous GNN. The operational workflow consisted of two main phases: offline training and online real-time localization.
Offline Phase:
RSSI Data Collection: The robot sequentially collected RSSI signals and floor information from all available APs at each RP to construct the RSSI database.
Data Preprocessing and Model Training: Prior to being input into the heterogeneous GNN for training, the collected data underwent preprocessing. The heterogeneous GNN was trained to perform floor classification based on the processed RSSI data.
Online Phase (Real-Time Floor Classification):
Real-Time RSSI Data Acquisition: The robot scanned nearby WiFi signals and transmitted the data to the server. During this process, it recorded the floor information along with the MAC addresses and corresponding RSSI values.
Graph Construction: A heterogeneous graph was constructed using MAC and RP nodes, with edge features encoding their multi-level relationships. This structured representation served as the input to the heterogeneous GNN model.
Heterogeneous-GNN-Based Floor Classification: The constructed heterogeneous graph was fed into the pre-trained heterogeneous GNN to perform floor-level classification. The system output the predicted floor label in real time.
This structured workflow enabled efficient and accurate floor classification through a heterogeneous-GNN-based approach.
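The graph construction step can be illustrated with a minimal Python sketch that builds RP and MAC node sets and two directed, typed edge lists carrying a normalized RSSI value as the edge feature. The relation names (`observes`, `observed_by`), the linear scaling from [-100, 0] dBm to [0, 1], and the dictionary layout are assumptions for this example; the actual construction used by the proposed model may differ.

```python
def build_hetero_graph(fingerprints, rssi_floor=-100.0):
    """Build a two-node-type graph from {rp_id: {mac: rssi_dbm}} fingerprints.

    The relation names and the RSSI scaling are illustrative assumptions.
    """
    rp_nodes = sorted(fingerprints)
    mac_nodes = sorted({mac for scan in fingerprints.values() for mac in scan})
    rp_index = {rp: i for i, rp in enumerate(rp_nodes)}
    mac_index = {mac: i for i, mac in enumerate(mac_nodes)}

    rp_to_mac = ("rp", "observes", "mac")
    mac_to_rp = ("mac", "observed_by", "rp")
    edges = {rp_to_mac: [], mac_to_rp: []}

    for rp in rp_nodes:
        for mac, rssi in sorted(fingerprints[rp].items()):
            # Scale RSSI linearly so that rssi_floor maps to 0 and 0 dBm maps to 1.
            weight = (rssi - rssi_floor) / -rssi_floor
            weight = min(max(weight, 0.0), 1.0)
            edges[rp_to_mac].append((rp_index[rp], mac_index[mac], weight))
            edges[mac_to_rp].append((mac_index[mac], rp_index[rp], weight))

    return {"rp_nodes": rp_nodes, "mac_nodes": mac_nodes, "edges": edges}


# Toy fingerprints, made up for illustration.
graph = build_hetero_graph({
    1: {"aa:01": -50.0, "aa:02": -75.0},
    2: {"aa:02": -100.0},
})
```

Keeping one edge list per relation is what lets a heterogeneous GNN apply different message functions to RP-to-MAC and MAC-to-RP edges.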
Figure 3.
Workflow of the robot-assisted WiFi-based floor classification system using heterogeneous GNN: (a) Offline phase. (b) Online phase.
6. Conclusions
We established a novel heterogeneous GNN (HeteroGNN) for indoor floor classification. This approach explicitly modelled the relationships among RPs and MAC addresses as a heterogeneous graph. Extensive experiments were conducted to evaluate our GNN-based models against a conventional CNN and HomoGNN on two distinct datasets: our custom CRL database and the public UJIIndoorLoc database. HeteroGNN frequently demonstrated superior performance over the CNN and HomoGNN. Specifically, the proposed model achieved a classification accuracy of 93.88% on the UJI dataset. Its effectiveness was further validated in a real-time experiment on an unseen subset of the CRL database, where it achieved a high classification accuracy of 97.3%. Moreover, HeteroGNN demonstrated strong robustness to noise and maintained high accuracy even with small-scale samples, requiring only one-tenth of the data required by the CNN. It also offered clear computational advantages, including faster training and a substantially smaller memory footprint compared with the CNN-based baseline. In addition, the proposed model outperformed five traditional machine learning classifiers—Random Forest, AdaBoost, XGBoost, LightGBM, and CatBoost—on the UJIIndoorLoc dataset, further confirming its strong discriminative capability. Overall, these results underscore the advantages of using heterogeneous graph structures to capture the complex relational features of WiFi RSSI data. The proposed framework, outperforming both CNN-based methods and homogeneous GNNs, offers a practical solution for scenarios where data collection is costly and signals are unstable. Future work will be aimed at exploring alternative graph designs and large-scale deployment to further enhance the capabilities of the proposed framework.