An Infrastructure-Free Magnetic-Based Indoor Positioning System with Deep Learning

Infrastructure-free Indoor Positioning Systems (IPS) are becoming popular due to their scalability and a wide range of applications. Such systems often rely on deployed Wi-Fi networks. However, their usability may be compromised, either due to scanning restrictions from recent Android versions or the proliferation of 5G technology. This raises the need for new infrastructure-free IPS independent of Wi-Fi networks. In this paper, we propose the use of magnetic field data for IPS, through Deep Neural Networks (DNN). Firstly, a dataset of human indoor trajectories was collected with different smartphones. Afterwards, a magnetic fingerprint was constructed and relevant features were extracted to train a DNN that returns a probability map of a user’s location. Finally, two postprocessing methods were applied to obtain the most probable location regions. We asserted the performance of our solution against a test dataset, which produced a Success Rate of around 80%. We believe that these results are competitive for an IPS based on a single sensing source. Moreover, the magnetic field can be used as an additional information layer to increase the robustness and redundancy of current multi-source IPS.


Introduction
The improvements in Indoor Positioning Systems (IPS) technology have greatly impacted localisation-based services, consequently increasing their popularity and number of users [1]. Moreover, IPS have a large number of applications, such as navigation in stations and airports, emergency response services, guides in museums and shopping centres, and localisation-based advertising [1,2].
To overcome the poor performance of the Global Navigation Satellite System (GNSS) in indoor environments, several authors propose different IPS [3,4]. They usually resort to technologies such as the Radio Frequency Identification (RFID), Wi-Fi, Bluetooth, or Ultra-Wide Band (UWB) [5,6]. However, these technologies still present limitations that compromise the scalability of location-based services, as they require additional dedicated infrastructure [5]. For example, RFID requires tags, UWB uses anchors, and Bluetooth searches for beacons at various landmarks to locate the users [7]. Although Wi-Fi is available in most buildings, this source lacks performance as its Radio Frequency (RF) signal suffers from shadowing and multi-path problems [7,8]. Furthermore, RF signals can cause electromagnetic interference with critical systems, forbidding their use in some locations, such as planes or hospitals.
On the other hand, infrastructure-free solutions dismiss the need for additional infrastructure. Fingerprinting is one of such techniques, which often leverages the pervasively available geomagnetic field and the Wi-Fi networks. It uses fingerprints, which are maps of the buildings with the annotation of the expected sensor reading at each location. They can then be compared to new data, to localise a user [9]. In general, fingerprinting-based solutions require an extensive process to collect and update buildings information, hindering their implementation in large-scale environments [10]. Moreover, these IPS can either be based on a single source, as the Wi-Fi signals, or on the combination of different sources of information, applying probabilistic methods such as particle filters. Still, the combination of different sources depends on Wi-Fi availability for the system's initialisation. This Wi-Fi dependence is expected to become a major limitation for existing IPS, following recent scanning restrictions from Android devices (https://developer.android.com/guide/topics/connectivity/wifi-scan). Moreover, as the next generation of mobile networking technology, i.e., 5G, covers both the inside and outside of buildings, Wi-Fi may suffer a drop in usage. Therefore, an alternative solution that works in the absence of Wi-Fi is necessary.
We propose to replace the current dependency on Wi-Fi scans in infrastructure-free solutions by exploring the magnetic field inside buildings. In indoor environments, the magnetic field is highly disturbed by electromagnetic interference caused by ferromagnetic architectural features and electronic equipment. These disturbances create unique patterns that can be utilised for localisation. Some solutions have been developed concerning the use of the magnetic data in this context [5,7]. Magnetic field-based IPS often offer satisfactory levels of accuracy. Nevertheless, their performance may be compromised by technical considerations. Manufacturers install magnetometers with different characteristics in the devices, e.g., level of noise tolerance and sensitivity [7], which affect the collected signals. Moreover, additional issues might arise as the same magnetic field value can be measured at multiple locations [11].
We intend to overcome the limitations of magnetic field-based methods by providing more precise information over time. The proposed system is independent of the user and the device, as it does not solely rely on the absolute values of the magnetic field. We employ fingerprinting-based methods, together with data augmentation, to reduce the extensive data collection for fingerprinting.
This novel methodology also leverages the potential of Deep Neural Networks (DNN), to replace the current dependence on Wi-Fi. Although the proposed system relies on the magnetic field only, our motivation is to extend its use to multi-source systems, either as an alternative method for the systems' initialisation or to improve the positioning accuracy over time. For this study, we collected data with different smartphone models, to avoid device-specific dependency. Moreover, we based our work on the hypothesis that although consecutive magnetic field values may vary between two devices, the differences along a path remain similar, as illustrated in Figure 1, which shows the magnetic field measured by two different smartphone models. It can be noticed that although the absolute values are distinct, the signal shape is very similar.
As such, we propose to extract different features from magnetic sequences: (1) magnetic absolute values, (2) magnetic differences and (3) combinatorial magnetic differences. These features significantly reduce the ambiguity of magnetic data reported in indoor environments [2]. We use a DNN to estimate, given a new magnetic field reading, the corresponding probability of matching the correct position on a magnetic map. Then, two postprocessing techniques, one based on a threshold and the other based on the highest probability pixels, we produce the most probable regions for the user position.
The proposed indoor localisation methodology explores DNN capacities and magnetic features, providing an alternative for typical Wi-Fi-based approaches. Herewith, our major contributions are the following: • An infrastructure-free indoor localisation system that relies on magnetic data and is robust to the sensors, variability between smartphones.
• A novel data augmentation algorithm that infers all possible magnetic sequences, within the building's magnetic fingerprint, with only a set of predefined collected trajectories. • A DNN architecture that is able to train different networks in each epoch, to be then merged into a consolidated output. • Two different postprocessing methods to evaluate the performance of the magnetic field-based localisation system. This paper is organised as follows: In Section 2, the related research on magnetic field-based IPS is discussed. In Section 3, the proposed method is described. In Section 4, the performance of our solution is presented and discussed, and in Section 5 the main conclusions are drawn and the future work envisioned.

Magnetic eld in z-axis
Time (s) Magnetic eld (μT) Figure 1. Magnetic field signals of the same trajectory perceived in the z-axis by two different smartphones: Pixel and Nexus 6P depicted in blue and green, respectively.

Related Work
Magnetic field data has been explored by some authors, as an alternative source for IPS. RF-based localisation methods may be infeasible in large-scale scenarios, since the signal becomes more unstable as the buildings' dimensions increase [11].
Often, particle filters are employed in the context of magnetic field-based IPS to estimate the user's consecutive positions. Kim and Seo [4] proposed a system that contained both a particle filter and an encoder system. The particles' weights were updated using multiple magnetic sensors and also three magnetic field maps, i.e. horizontal and vertical intensity maps, and a direction information map. Furthermore, in the propagation step, the particles were expanded by the velocity of the encoder. Some researchers focused on providing solutions to known issues related to magnetic data. For instance, the work proposed in [7] used magnetic data from several smartphones, creating Magnetic Patterns (MP) to overcome the limitation associated with different characteristics of magnetometers embedded in smartphones. The results showed an increased performance when compared to other MP approaches. Additionally, in the same study, an effort was also directed into analysing the impact in the localisation accuracy using distance metrics, such as the Euclidean distance and the Dynamic Time Warping (DTW) to match the MP. Contrarily, in our framework, we leverage DNN to learn the magnetic field patterns.
DNNs have been used in Wi-Fi-based indoor localisation [12,13]. However, few works report their use on magnetic field data. Lee and Han [2] propose a magnetic approach for indoor positioning using DNN to classify magnetic landmarks. In this study, recurrence plots were used as the input for the DNN, composed of convolutional and fully connected layers. While performing magnetic landmark classification, the authors also discovered that the magnetic data in the z axis was the most accurate when compared to xyz and xy alternatives. In the same context, an IPS called AMID was proposed in [8], which locates by recognising magnetic sequence patterns using a DNN. The authors remarked that the positioning error was related to the magnetic map resolution. Bae and Choe [11] presented a magnetic field-based IPS relying on Recurrent Neural Networks (RNN). The proposed method was compared with RF fingerprinting techniques showing better average positioning errors. However, for the experimental set-up, the authors used the magnetometer of a single smartphone to collect data, which may harm the generalisation to other devices.
The fusion of different sources has also emerged to overcome their individual limitations. In [14], the magnetic data was fused with visual images, to continuously track the user on a per-step basis using a particle filter that updates the location estimation after each user step. Additionally, a convolutional network was also considered as an approach to extract features from the used images. Finally, in [15], the fusion of magnetic and light data, using the smartphone's sensors, was explored. This IPS used DNN, specifically Long-Short-Term-Memory (LSTM) layers, and analysed the impact of different design parameters, such as the test data size, the number of considered layers, and hidden units.
Some works in the literature leverage DNN for indoor location, but only a few use them with the magnetic field. Although large datasets are required to train DNN, data augmentation techniques can mend this requirement. In our research, we explore DNN composed of convolutional and LSTM layers with magnetic field data to locate the users. Our approach can later be integrated into multi-source systems towards a more robust localisation.

Framework for Indoor Positioning Using Magnetic Field
This Section describes the proposed magnetic field-based IPS, which provides the most probable regions for the users' location. The solution is depicted in Figure 2 and includes seven main stages: data acquisition, preprocessing, magnetic fingerprinting, magnetic sequences, feature extraction, neural network, and postprocessing. In the first step, human trajectories were collected from the inertial sensors embedded in smartphones. In the preprocessing, the magnetic data is projected onto the Earth coordinates frame, for later use in the fingerprinting method. The created magnetic fingerprint allows defining several magnetic sequences. Additionally, more trajectories can be considered for building sequences through an implemented data augmentation algorithm. A feature extraction process retrieves the inputs for the DNN, which then outputs probability maps of possible locations. After postprocessing, our solution estimates the most probable regions for the specific users' locations.

Preprocessing
Smartphones are typically equipped with a set of sensors, such as the Inertial Measurement Unit (IMU). IMUs incorporate an accelerometer, a gyroscope and a magnetometer, which collect data over three orthogonal axes.
The accelerometer is sensitive to both linear and gravitational accelerations, measured in m/s 2 . The gyroscope assesses the angular velocity in rad/s. At last, the magnetometer measures the strength of the local magnetic field in µT, which is composed of the geomagnetic and environmental fields.
As the readings from these sensors get severely affected by changes in the devices' orientation and position, we apply the sensor fusion algorithm described in [16]. It translates the collected data to the global Earth frame, which is independent of the orientation and position of the device. A second-order complementary filter is applied, which takes into account the long-term reference of the gravity direction to the North, and the short-term accuracy of the gyroscope in measuring the angular rotation of the device. The collected signals were resampled to 100 Hz to remove the sampling rate differences between smartphones. Moreover, a second-order low pass filter with a cutting frequency of 1 Hz was applied to remove noise from signals.
By applying these preprocessing mechanisms to all collected data, we ensure that the proposed framework will work independently from the position and orientation of the devices.

Magnetic Fingerprinting
Buildings are characterised by multiple features, which relate in space to form a relatively unique, multi-dimensional feature spot map. The building map includes environmental features such as the Earth's magnetic field fluctuations. Although the magnetic field remains naturally stable around the same area, its pattern is highly affected indoors. The presence of metallic construction materials and electrical equipment causes disturbances in the magnetic pattern, which commonly produce unique patterns that can be used to identify a specific location [10].
In this work, magnetic fingerprints were built with the traditional approach, where a user walks throughout the building while manually annotating reference points. The sensor fusion mechanism of Section 3.1 is applied to the data, and the temporal-dependence of values is eliminated by converting them to the spatial domain. This process stores location-dependent characteristics of the magnetic field signal into a multi-dimensional fingerprint, to provide an absolute reference for the localisation. The fingerprint is then characterised by a set of two-dimensional coordinates (x, y), in meters, associated with the magnetic field vector in the Earth frame (v x , v y , v z ), in µT.
One major concern of fingerprinting-based solutions is the time-consuming data collection process, especially in large buildings. Different techniques that may be applied, ranging from point-by-point calibration to walking surveys between reference points with posterior interpolations and extrapolations. Nevertheless, this process does not limit our solution, as instead of searching for an exact position, we search for the most probable region for the user's location. Thus, we dismiss the need to cover the buildings' entire area during data collection.

Magnetic Sequences
The absolute localisation of a user using only the magnetic field is not trivial, as different locations inside a building may have similar magnetic field values. Hence, we cannot rely exclusively on the absolute value of the magnetic field in a specific location to determine the user's position. On the other hand, a set of magnetic field values along a trajectory is more distinctively representative of the user's final location. Moreover, different devices show different absolute values in the same locations, but when considering a sequence, the differences between consecutive values remain similar for the same trajectory.
Thus, each user trajectory is segmented on short sequences of length L based on the temporal magnetic field data conversion to the distance domain.

Data Augmentation
To ease the data collection process, we designed a set of predefined trajectories to cover the entire building locations, but not all possible trajectories neither directions were recorded. Instead, we developed an augmentation algorithm that finds all possible trajectories of length L in the magnetic field fingerprint. The constructed fingerprint creates connections between some positions of the collected trajectories, which allows the extraction of trajectories that were not acquired during the data collection procedure and the extraction of trajectories in opposite directions. For instance, considering the two trajectories represented in Figure 3, the connections between Route 1 and Route 2 create artificial trajectories as represented in point A. Thus, the augmentation algorithm is able to retrieve six different magnetic sequences of length L at point A. Moreover, to avoid overfitting to the fingerprint data, noise was added to the magnetic sequences in the data augmentation process. Two levels of random noise were applied from Normal distribution of mean 0 and variance 1, and then scaled by 0.1 and 0.2 multiplication factors, together with one level of random noise, retrieved from the Normal distribution of mean 0.02 and variance 0.2. The implemented data augmentation process is one of the key advantages of this work, as it greatly simplifies the effort necessary for collecting data. From this approach, it is possible to construct fingerprints with a limited set of trajectories without covering the entire space.

Feature Extraction
A magnetic field sequence is represented by a 4-dimensional vector given by S = {v (1) , . . . , v (L) } where v (i) denotes the i-th element in the set of the magnetic sequence values on the three orthogonal axes, (x, y, z), and magnitude, m , and L represents the length of the magnetic sequence in pixels.
where v m is given by Equation (1): The feature set is composed of the magnetic absolute values, magnetic differences and combinatorial magnetic differences of the three orthogonal axes and magnitude. A description of each feature set is provided below. In Equations (2)-(4), v (i) ax denotes the i-th element in the set of the magnetic sequence values on a specific ax component, where ax ∈ {x, y, z, m }.
Magnetic absolute values: The magnetic absolute values are the magnetic sequence values, with length L, without any transformation, given by the set in Equation (2).
Magnetic differences: The magnetic differences correspond to the pixel-pixel difference along a magnetic sequence resulting in a set with L − 1 length given by the set in Equation (3).
Combinatorial magnetic differences: The combinatorial magnetic differences include all possible differences without repetition in a magnetic sequence resulting in a set with L! 2!(L − 2)! length given by Equation (4).
The feature set includes the magnetic differences and combinatorial magnetic differences, due to the variability of the absolute magnetic field values and different offset levels that are present in different smartphone models. The differences between magnetic field values are typically the same and are not influenced by different smartphones. The absolute values were also considered for the feature set as they can contain some meaningful information, apart from the differences, that can be extracted by deep learning techniques.
The labels used in this experiment are probability maps. For each prediction, we have a probability map with the same shape as the building map, where the user's location is set with the highest probability (=1) and the remaining map is set to zero probability. The probability map sums to 1.

Deep Neural Network
Deep Learning techniques, as DNN, have been successfully applied in different contexts due to their strong ability to detect complex patterns. DNNs are composed of multiple non-linear functions mimicking the human brain. They can automatically learn features from higher levels of abstraction and complex functions that map the input to the output. Due to the complex nature of indoor localisation, DNNs are a competitive strategy for this application [17].
The DNN architecture proposed in this work contains the following layers: • Convolutional Neural Network (CNN) layer: CNNs are commonly applied to image data, as they learn the two-dimensional representation of an image by successfully capturing spatial information through the application of filters. CNNs can be one-dimensional and be applied to time series, where they instead explore the temporal dependence between values. • Long Short-Term Memory: LSTM is a type of RNN capable of learning long-term sequential or temporal dependencies [18]. • Flatten layer: A flatten layer converts a multidimensional input into a single dimension. This is usually used in the transition to a dense layer. • Dense layer: A dense layer is a regular fully connected layer that densely connects all input neurons, from the previous layer, to all output neurons. • Max-Pooling layer: A max-pooling layer is a down-sampling process that transforms the input, by reducing its dimension space, while storing the maximum value of (usually) non-overlapping regions. • Dropout layer: A dropout layer, as the name suggests, drops out units in a neural network by randomly setting inputs units to zero, with a specified frequency, at each step in the training stage.

Neural Network Architecture
The neural network architecture used in this experiment is composed of four neural networks. The three sets of computed features have different lengths, so they were trained independently. In the end, a final network combines their results. For this process, three different types of architectures were used, namely DNN Type 1, Type 2, and a final Neural Network Combination. The illustrations in Figures 4-6 were produced with the support of the NN-SVG Tool [19] .

DNN Type 1
The DNN Type 1 architecture was used twice, to learn both the magnetic differences and the combinatorial magnetic differences feature sets. This neural network is depicted in Figure 4 and contains four one-dimensional CNNs, followed by a one-dimensional max-pooling layer, a flatten layer, and a final dense layer. l1-l2 regularisation of factor 0.03 was applied to the third, fourth, and fifth layers, to prevent overfitting. Table 1 describes the designed layers' details for this architecture.

Conv1D
MaxPooling 1D Dense INPUT OUTPUT Figure 4. Neural Network Architecture for magnetic differences and combinatorial magnetic differences. Table 1. Deep Neural Networks (DNN) Type 1 layers' details, including the type of layer, the number of hidden units, the kernel size and the activation function. The kernel size of three was employed in the magnetic differences feature set, while the kernel size of five was used for the combinatorial magnetic differences set. The DNN Type 2 was used to train the magnetic absolute values. Its architecture is depicted in Figure 5, and contains four LSTM layers, followed by one dense layer. Moreover, two dropout layers were introduced to reduce overfitting. The layers description is available in Table 2.

Neural Network Combination
The final architecture aims to combine the individual results of the three individual trained networks. This network first concatenates the outputs of DNN Types 1 and 2. Then, a dense layer and a final reshape layer are applied to produce the output probability maps, as depicted in Figure 6. The dense layer description is available in Table 3.

Neural Network Combination Loss
The training of neural networks uses loss functions to iteratively improve the prediction, by updating the weights of neurons. This update is done by subtracting to each weight their gradient with respect to the loss, depending on a predefined learning rate. In our approach, the Adam optimisation algorithm conducts this process, together with a categorical cross-entropy loss function, which is suitable for multi-class classification. The categorical cross-entropy loss quantifies the difference between two probability distributions, given by Equation (5): where N is the output size, y is the target value andŷ is the predicted value.

Data Collection
To the best of our knowledge, there are no publicly available datasets that match the requirements to evaluate the proposed IPS. These requirements include human trajectories containing both inertial and magnetic measurements, along with ground-truth positions of the building's map for the fingerprint creation. Hence, we created a dataset for the development and evaluation of the presented framework. The data were collected in a floor plan of an office building with 35 m by 12 m. The dataset contains a diversity of human trajectories acquired by different users and smartphones on distinct days to demonstrate the robustness of our framework. The annotation of the trajectories was done for validation purposes.
The data collection was conducted with a logger app that records data from the accelerometer, magnetometer and gyroscope. Additionally, this app allows for the annotation of events by the registry of their timestamps when a user taps the screen.
To ensure consistency between acquisitions, some guidelines were provided to users for the data collection procedure: • Before starting each trajectory collection, the user was asked to calibrate the smartphone's magnetometer. The tri-axis magnetometer calibration was guaranteed by performing the movement of drawing a big eight in the air with the smartphone; • The initial and final positions, together with all direction changes, were annotated on the smartphone, saving the corresponding timestamps. This procedure allowed to create a set of reference points during the acquisition where the real position and the associated magnetic field values are well known; • The user was asked to maintain a constant velocity during the acquisition so that the distance between annotated positions could be linearly interpolated. We simplified this analysis since this framework will be combined with an existing indoor location solution [16] that already covers the automatic estimation of the distance travelled by the user, with step detection and dead reckoning methods; • Finally, during the data collection process, the smartphone was being held in front of the user to facilitate the annotation procedure.

Dataset
The created dataset is composed of 27 trajectories across the building, which are split into two groups. The data were collected on different days and times of the day to ensure the variability of a real scenario.
The Fingerprint Routes are available in Figure 7c and were designed to construct the magnetic fingerprint of the building. These routes follow the process of the traditional construction of fingerprints, where a user covers the main halls and rooms of the building, from a start to an endpoint. The criterion was to design the strictly necessary trajectories to optimise as possible the usual time-consuming process. Each corridor should have a single passage, except in large corridors, where more than one trajectory may be designed. Some rooms were left out for their small dimensions.
The Test Routes are depicted in Figure 7a,b and cover the area of the constructed fingerprints to assert the robustness of the proposed framework. These trajectories may overlap with Fingerprint Routes, but some conditions differ, as in some cases the direction of movement. Data were collected by four users with five different smartphones, ranging from older to newer Android versions: LG G7, Xiaomi Mi A2 Lite and Mi8, Huawei Nexus 6P and Google Pixel. The magnetic sensors characteristics of the different devices are presented in Appendix A Table A1. 117 acquisitions were collected, which include a total of 73 min of data. The complete dataset attributes are available in Appendix B Table A2.

Train, Validation and Test Sets
All collected data from Fingerprint Routes and Test Routes were preprocessed, as described in Section 3.3. During our experiments, we found out that the z-axis from the magnetic data, together with the computed magnitude, were the components that provided the most valuable information. Therefore, they were the only leveraged components.
The fingerprint construction process, described in Section 3.2, was performed using a single set of Fingerprint Routes, collected by user 1 with the LG G7 smartphone. The resulting fingerprint is displayed in Figure 8 and includes the main building corridors and walking paths inside most rooms. It has a resolution of 0.2 m per pixel, which is possible after the linear interpolation of magnetic field values between the annotated coordinates in the designed trajectories, i.e., the initial and final positions, in addition to all direction changes. This results in a set of coordinates with a maximum difference of 0.2 m on both the x and y axes.

Magnetic eld Z-axis X coordinates (pixels) Y coordinates (pixels)
Magnetic Field (μT) The training set was obtained from the constructed fingerprint, with the data augmentation process of Section 3.4. The length of each sequence L was defined as 4 m, each containing L = length (in meters) resolution = 4 0.2 = 20 magnetic field values. Followed by the noise addition process, a total of 5688 sequences were created for training the neural network architecture.
Regarding the validation and test sets, the magnetic sequences were extracted from all used acquisitions, following the process of Section 3.3. However, to effectively evaluate the performance close to real scenarios, the noise addition process was not applied. As such, the validation set was constructed using data from Fingerprint Routes of user 2, collected with the Google Pixel. Lastly, the test set contained the Fingerprint Routes of user 2 that were collected with the Xiaomi Mi8, together with the Test Routes from all users with all smartphones. After obtaining the magnetic sequences, the inputs for the neural network were the extracted features from Section 3.5.

Evaluation
After training the model, an experimental evaluation using the test set was performed. Our method follows the common usage of fingerprinting-based IPS, where the most probable regions are used either to provide a user's location or to at least initialise the localisation procedure. As such, the output of our DNN is a probability map, from which the most probable regions are selected. This allows reducing the localisation search area and might provide further improvement to Wi-Fi dependent multi-source systems, e.g., Wi-Fi-based IPS contain errors varying from 5-15 m [20].
In order to evaluate our results, we defined a new metric, the Success Rate, given by Equation (6), that was applied to the most probable regions from postprocessing.

Success Rate = Correct Predictions All Predictions
A Correct Prediction was defined as a prediction that contains at least one probable pixel in a distance to the true position shorter than 4 m. The distance was measured by Dijkstra's algorithm [21], which finds the shortest path between nodes in a graph, in this case between filled pixels in the probability map. The 4 m value was empirically chosen since it maximises the relationship of Success Rate versus distance to prediction value and it is a value that we consider acceptable for IPS. Our goal was to maximise the Success Rate while reducing the distance error.
Additionally, a distance error between the true position and the closest probable pixel using Dijkstra's distance for each predicted position was also considered for evaluation purposes.

Postprocessing
In a real case scenario, either for magnetic-based single-source or multi-source localisation systems, it is necessary to determine a set of regions where the user/device may be located. For this purpose, postprocessing was applied to the predicted probability maps to discover the most probable regions.
In this work, we introduce two postprocessing techniques, one based on a threshold and other on the selection of the N-pixels with the highest probability. With the combination of both techniques, the final result is a map containing the most likely regions.

Threshold Approach
In the first approach, we defined a threshold to select the first set of possible regions from the probability map, containing all pixels (i.e., positions) above the select value. This threshold cannot be selected based solely on the improvement of the Success Rate, as in an extreme case, it would be so small that all possible pixels would be included. Therefore, we searched for the threshold the maximises the Success Rate but minimises the number of pixels included. For this purpose, a range of thresholds was tested on the training set (from 0.002 to 0.04). The relationship between the achieved Success Rate and the mean number of included pixels is plotted in Figure 9a. The optimal threshold value corresponds to the knee of the curve, here obtained from the KneeLocator package [22]. The optimal threshold was found to be 0.006, which achieved a Success Rate of 96.4%, with a mean number of pixels included of 14.2 (2.84 m). This means that by only considering the pixels with a probability higher than 0.006, a mean of 14.2 pixels have a probability above the threshold, from which we can achieve a Success Rate of 96.4%.

N-Pixel Approach
In the second approach, from the probability map, the N pixels (i.e., positions) with the highest probability were retrieved. For this purpose, we tested different N pixels values (ranging from 1 pixel (0.2 m) to 30 pixels (6 m)) to assert the value that gives the highest mean Success Rate, from all routes of the training set. The relationship between the number of pixels and the mean Success Rate is represented in Figure 9b. The optimal N, obtained with the KneeLocator package, was 26 pixels (5.2 m), with a Success Rate of 96.9%. In other words, by considering only the 26 pixels with the highest probability we can achieve a Success Rate of 96.9%.

Combination
The threshold approach provided a probability threshold of 0.006, and the N-pixel approach defined an optimal value of 26 best pixels. To evaluate the performance of the DNN, we decided to combine both approaches, considering a maximum of 26 pixels (5.2 m) and a minimum probability of 0.006. With both conditions, we obtained the results of Table 4, which includes the routes from Test Routes (Figure 7a,b) and Fingerprint Routes (Figure 7c) that were not used for training.

Experimental Results
The experimental results for the test set are presented in Table 4. The Success Rate was calculated using Equation (6) and the distance error refers to the mean Dijkstra's distance within each route from the ground truth position to the closest predicted region. The results demonstrate that the proposed method has a Success Rate of around 80% and a distance error of 2.5 m to the correct position.
Moreover, the results show consistency between different users and smartphones. Hence, it was possible to overcome the difficulty of employing different devices.
Further analysis of the results allows concluding that in all acquisitions of the same designed trajectory, the Success Rate remains similar. For instance, TestRoute10 Success Rate ranges from 69 to 76% and TestRoute13 Success Rate ranges from 82 to 95%.
Some trajectories consistently obtain an excellent classification, as in the case of TestRoute11. By comparing its design in Figure 7b with the magnetic fingerprint of Figure 8, it is visible that this route covers regions with high magnetic field disturbances, as a handrail, which can be seen by the red area. On the other hand, FgRoute6 and FgRoute7 achieved a Success Rate of 0%. From Table A2, we can notice that these two routes correspond to small routes that contain a total of 27 and 33 classified sequences, respectively. Furthermore, by analysing Figure 7a and the magnetic fingerprint in Figure 8, we notice that these regions do not have distinct magnetic disturbances. It is reasonable that our method does not classify them correctly since regions with no magnetic disturbances are more frequent around a building and thus, ambiguous for the DNN model.
As we are using a non-standard evaluation criterion (see Equation (6)), three different scenarios using real acquisitions were selected to better demonstrate the possible position outputs from our method. The three scenarios are the following: • True position within the obtained regions ( Figure 10); • True position within a maximum distance of 4 m from one of the obtained regions ( Figure 11); • True position beyond 4 m from one of the obtained regions ( Figure 12).
In Figures 10-12 the obtained regions are depicted in different colours (green, blue, or orange), the correct device position is marked in red and the shorter distance from each region to the true position using Dijkstra's algorithm is displayed close to each region.
In Figure 10 we can perceive that a single region is obtained, in green, and the true position, in red, is overlaid with the green region. This Figure exhibits the best positioning scenario, where our localisation system only estimates one region containing the correct user position. This result was obtained considering the TestRoute11 outcome from user 4 using the Xiaomi Mi A2 Lite smartphone. Regarding Figure 11, the combined postprocessing method was applied to TestRoute10 from user 4 while using Xiaomi Mi A2 Lite. This figure highlights three regions depicted in colours green, blue, and orange. Although none of the regions contains the correct position, the green region is within 4 m from the true position. Our algorithms considers this region as correct, since the difference can be associated with errors in the true position estimation. The true position is estimated based on the assumption that users walked in a constant velocity during data acquisition as described in the data collection guidelines (see Section 4.1). However, it is very likely that the velocity changed during the acquisition, compromising the estimation of the true position. Additionally, inaccuracies inherent to the process of manual annotation may also have a negative impact on the true position estimation. Finally, Figure 12 illustrates the worst scenario where only one region is estimated far from the correct position. In order to understand this result, we can observe the magnetic fingerprint presented in Figure 8 and verify that the magnetic field pattern around the true position and the obtained region are at least similar in the z-axis. Probably, the correct region has also a high probability but it was lower than the probability threshold used. We anticipate that this case shall not occur frequently as we obtained generally good Success Rates (see Table 4). Regarding the number of the returned regions, ideally our approach should always return only one region, as in the example of Figure 10. However, when a magnetic field sequence shows a similar pattern in various regions of the building, it is expected that more than one region is returned (see Figure 11). This limitation may be reduced by adding a longer magnetic field sequence instead of the used 4 m. However, in this case, the system initialisation will need to wait until the user walked more than 4 m to return the first position. As we previously mentioned, our main motivation is to replace the dependency of RF signals to the system initialisation process. This way, instead of reducing the number of regions we decided to reduce the length of magnetic field sequences. The limitation of obtaining different regions could be tackled by the integration of our methodology with a multi-source localisation system as the one of Guimaraes et al. [16]. Therefore, providing more than one region might not be considered a disadvantage but a strategy to reduce the localisation search to certain areas of the map.
Regarding the possibility of overfitting, due to the lack of data variability while training the model (one user and one smartphone), we believe that it is low after analysing the obtained test results. The test results showed that the model provides satisfactory performance results for different users and smartphones.

Conclusions
Indoor location offers undoubtedly great potential for a wide range of applications. Most infrastructure-free methods rely only on Wi-Fi signals, and when a combination of multiple data sources is used, systems' initialisation is frequently dependent on Wi-Fi availability. Nevertheless, there is a need to find alternatives, as Wi-Fi scans are becoming limited in recent Android versions, or not available at all in the case of iOS. Furthermore, the rise of 5G technology may also compromise Wi-Fi usage.
Herewith, we proposed an infrastructure-free IPS, that is user and usage-independent, and only relies on the magnetic field sensing information for the localisation. We designed a dataset for conducting this experiment with different users and smartphones. This dataset was collected in an office building and contains over 117 acquisitions, fulfilling a total of 73 min of data. With data from one user and one smartphone, we created a fingerprint of the magnetic field values. Moreover, to produce a sufficiently large training dataset, we developed a data augmentation algorithm based on the created fingerprint.
To better describe existing magnetic patterns inside buildings, significant features were extracted from the training data. Afterwards, an architecture with four DNNs was fed with the absolute magnetic field and the extracted features. The implemented DNN provides, as output, a probability map of the user's location, which then undergoes two postprocessing methods, resulting in the most probable location regions.
After applying the postprocessing methods, we achieved considerably good results since the acquisitions Success Rate is about 80%, within a 4 m distance, and the average distance error corresponds to 2.5 m. Considering that we used mainly magnetic data, the results support the potential of this source to provide complementary information for current IPS.
The magnetic field as a single source is unreliable for some applications, as it is easily affected by temporary interferences. Still, most multi-source IPS that leverage the magnetic field do not fully explore its potentialities. Thus, with this work, we aimed to study how this source can complement other solutions.
Comparing to existing IPS, our approach leverages the DNN capabilities by presenting an alternative to typical Wi-Fi based methodologies obtaining satisfactory positioning results. Although our solution can be used as an independent indoor location system, we believe that its main potential is in the integration with other multi-source systems, during the initialisation phase as an alternative to RF signals. In future work, the integration of our approach with the work of Guimaraes [16] will be addressed, where our framework will be introduced as a new layer for the fused positioning mechanism and for initialisation purposes.
Due to the magnetic field dependency of our method, we envisage a challenge regarding the possible changes of buildings environmental characteristics, which can create different magnetic field patterns, requiring recurrent data collections to remap the fingerprints. However, using crowdsourcing techniques for fingerprints updates can mitigate this limitation. In addition, since magnetometer sensors can easily decalibrate, our system would need an automatic magnetometer calibration method to avoid inconsistent location previsions.
In future directions, our methodology could be employed alongside other data sources, thus, providing a more robust system. In situations in which there is more than one high probability region, the combination of different data can provide a straightforward localisation response. Additionally, by providing the most likely regions, the magnetic field data can also help with the initialisation of other systems. Finally, to test the effectiveness and robustness of our methodology, we will employ it in different locations. Funding: This research was co-funded by Portugal 2020, framed under the COMPETE 2020 (Operational Programme Competitiveness and Internationalization) and European Regional Development Fund (ERDF) from European Union (EU) through the project Geolocation non-Assisted by GPS for Mobile Networks in Indoor and Outdoor Environment (GARMIO), with operation code POCI-01-0247-FEDER-033479.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:  Table A1 contains the main technical characteristics of each magnetic smartphone sensor used in this study. Table A1. Technical characteristics of each magnetic smartphone sensor including magnetometer type, sensing range, sensitivity and resolution.

Magnetometer Sensing Range (mT) Sensitivity (µT/LSB) Resolution (bites)
LGE LM-G710 AKM AK09918C 1 ±4.9 0. 15 Table A2 contains the details of the dataset collected for this work. The dataset was collected by different users and smartphones, and it is composed by trajectories acquired for the fingerprint creation, FgRoute, and for testing purposes, TestRoute. Table A2. Dataset and their characteristics including the number of acquisitions and instances, users that participated in each acquisition and smartphones used. The number of instances corresponds to the total number of labels in each trajectory. Smartphones A, B, C, D and E are respectively LGE LM-G710, Mi A2 Lite, Mi8, Nexus 6P, Pixel.