5.1. Results Obtained Considering Different Scenarios for Generating and Evaluating the Models
In this section, we consider, on the one hand, the classification and mitigation algorithms obtained through the training, validating, and testing carried out using the data recorded in the ML model-generation scenario (see
Figure 2a). On the other hand, the data extracted from the measurements in the evaluation scenario shown in
Figure 2b were employed to assess the performance of the aforementioned algorithms. More specifically, a tag was moved between different points of the grid in
Figure 2b to simulate the movement of a real target. Notice that this evaluation process ensures that the generated models are employed in a scenario different from the one used for their generation, leading to the most general and realistic situation possible.
Figure 3 shows RSS versus ranging values measured in the ML model-generation scenario and corresponding to LOS (blue) or NLOS (yellow) propagation conditions.
Figure 3 also plots the values measured in the evaluation scenario after being classified as LOS (green) or NLOS (magenta) by the NN classifier. The superposition of the values from the two scenarios demonstrates that (1) the NN is able to distinguish correctly between LOS and NLOS, even though both scenarios are different, and (2) the relationship between RSS and ranging values is significantly different in both scenarios, especially in the case of NLOS.
In order to get a clear idea of the effect of each element on the final result, positioning error results were obtained using different combinations of the following algorithms:
The k-NN algorithm as the classifier and the GP regression model as the mitigator.
The NN as both classifier and mitigator.
The IEKF as the positioning algorithm.
The NLS with Gauss as the positioning algorithm.
In addition, for each of the considered combinations, different configurations of the positioning algorithms were established:
1. No ignore. This is the configuration that serves as the reference for the location error. In this case, the positioning algorithms employ all raw measurement data recorded in the scenario, without any previous classification or mitigation. That is to say, the measurement data are the direct input to the positioning algorithms.
2. Ignore NLOS and no mitigation. Either k-NN or NN is used to classify each measured value as LOS or NLOS. Subsequently, the positioning algorithms are fed only with the measurement data in the LOS category, without any mitigation, and the data classified as NLOS are ignored.
3. Ignore NLOS with at least four anchors and no mitigation. NLOS data are classified and ignored as in configuration 2, while ensuring at least four ranging values for the positioning algorithms. This enables algorithms such as the NLS, which otherwise could not provide an estimate in three dimensions. When fewer than four values are classified as LOS in one iteration, the necessary values classified as NLOS are included. More specifically, the ranging values classified as NLOS with the lowest score, i.e., those which are less likely to belong to the NLOS category, are selected to replace the missing LOS values. Again, no mitigation mechanism is applied to the ranging data. This configuration was not tested with the IEKF, as this algorithm does not need at least four values to estimate a position.
4. Ignore NLOS with mitigation. NLOS ranging values are classified and removed from the positioning process. The errors in the remaining ranging values (classified as LOS) are mitigated with one of the algorithms before being sent to the positioning algorithms to finally produce position estimates.
5. Ignore NLOS with at least four anchors and mitigation. This is a combination of configurations 3 and 4: NLOS ranging values are ignored, but those most likely to be LOS are retained when needed to ensure four ranging values in each iteration. Before running the positioning algorithm, each measurement value is passed through an LOS mitigator. Again, this configuration was not tested with the IEKF, as this algorithm does not need at least four values to estimate a position.
6. No ignore with mitigation. The same as configuration 1, but each measurement is passed through a mitigator, modeled according to the category determined by the classifier (i.e., LOS or NLOS), before being processed by the positioning algorithms.
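The selection rule behind the "ignore NLOS" configurations, with and without the four-anchor guarantee, can be illustrated with a minimal sketch. The function name, its arguments, and the convention that a higher score means "more likely NLOS" are assumptions made for illustration; they are not taken from the implementation used in this work.

```python
from typing import List, Sequence


def select_ranging_indices(is_nlos: Sequence[bool],
                           nlos_score: Sequence[float],
                           min_values: int = 0) -> List[int]:
    """Return the indices of the ranging values passed to the positioning algorithm.

    is_nlos[i]    -- classifier decision for anchor i (True means NLOS)
    nlos_score[i] -- classifier score; higher means more likely NLOS (assumption)
    min_values    -- 0 gives plain "ignore NLOS"; 4 gives the
                     "at least four anchors" variant
    """
    los = [i for i, flag in enumerate(is_nlos) if not flag]
    if len(los) >= min_values:
        return los
    # Not enough LOS values: fill in with the NLOS values that have the
    # lowest NLOS score, i.e., those least likely to be NLOS.
    nlos = sorted((i for i, flag in enumerate(is_nlos) if flag),
                  key=lambda i: nlos_score[i])
    missing = min_values - len(los)
    return sorted(los + nlos[:missing])
```

With `min_values=0` the function only drops the NLOS values; with `min_values=4` it tops up the LOS set, so an algorithm that needs four ranges can always produce an estimate, provided at least four anchors report a value.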
Figure 4a shows the empirical cumulative distribution function (ECDF) for the location error obtained in the evaluation scenario with the IEKF algorithm, using the k-NN classifier and two GP-based models (LOS and NLOS) for mitigation. The results in
Figure 4a show that, when using all of the raw measured values (configuration 1) to estimate the positions (no classification or mitigation),
of the errors are below
m. In this case, with a reduced number of anchors and many ranging values obtained under NLOS propagation conditions, the error is very large compared to what is typically obtained with UWB technology when there is a good LOS. This situation is due to the presence of several anchors simultaneously under NLOS propagation conditions. Although it could be improved considerably by placing these anchors in more convenient positions, we opted for a more complex and realistic situation, since it is not always possible to place them in the best location. The margin of improvement achieved by the considered algorithms and configurations is assessed under this disadvantageous evaluation scenario.
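For readers less familiar with ECDF plots, the following sketch shows how a statement of the form "a given fraction of the errors is below a given value" is read from a set of location errors. The function names are hypothetical and NumPy is assumed to be available; no values from the figures are reproduced here.

```python
import numpy as np


def ecdf(errors):
    """Return sorted errors x and the fraction p of samples <= each x."""
    x = np.sort(np.asarray(errors, dtype=float))
    p = np.arange(1, x.size + 1) / x.size
    return x, p


def error_at_fraction(errors, fraction):
    """Smallest error e such that at least `fraction` of the samples are <= e."""
    x, p = ecdf(errors)
    idx = np.searchsorted(p, fraction, side="left")
    return x[min(idx, x.size - 1)]
```

Plotting `p` against `x` gives the ECDF curve; a curve that rises further to the left corresponds to a configuration with smaller location errors.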
Figure 4a shows that mitigating the measurements once they were classified (magenta line labeled as 4) does not reduce the error and even worsens it. This is because the relationship between the ranging and RSS values in the ML model-generation scenario is significantly different from that in the evaluation scenario, especially for the NLOS case, since it is very dependent on the geometry of the environment (see
Figure 3). Therefore, the model under consideration attempts to correct the data in an erroneous manner, since it is trained to conform to the values obtained in the ML model-generation scenario and exhibits difficulties when operating on the different values from the evaluation scenario. Additionally, wrong classifications lead to the application of NLOS mitigation models to LOS data and vice versa, which also contributes to the increase in the error.
The remaining curves in
Figure 4a show the effect of discarding ranging values classified as NLOS. The best result is obtained when no mitigation is applied (line labeled as 2) and all the measurements classified as NLOS are ignored by the location algorithm. On the other hand, when the mitigator (line labeled as 3) is applied, even if it is only on ranging values classified as LOS, there is a small performance degradation.
According to the results in
Figure 4a, ignoring NLOS without mitigation (configuration 2) leads to a significant improvement, in which
of the errors are below
m compared to
m when NLOS ranging values are not excluded (black line). This indicates that the classifier trained, validated, and tested with measurements from the ML model-generation scenario was generic enough to be used in a different scenario and still produce satisfactory results.
Figure 4b shows the ECDF using the IEKF, but this time employing the NN for both classification and mitigation. Again, the best result is achieved when the ranging values are classified and the NLOS ones are discarded (configuration 2). The result in this configuration is almost the same as with the k-NN (see
Figure 4a), reaching
m of error for
of the estimates, compared to
m obtained when the k-NN is used (see
Figure 4a). When mitigation is added (orange curve, labeled as 3), there is again some performance degradation, and the results are worse than those produced with the k-NN classifier and GP mitigators. This is because the NN, despite obtaining better mitigation results during training, is more sensitive to changes in the environment during evaluation, owing to a certain degree of overfitting.
To analyze the effect of a different positioning algorithm,
Figure 4c,d show the ECDF using the NLS-based algorithm to estimate the positions while considering k-NN/GP and NN, respectively. The main differences between these figures and the previous ones using IEKF are in configurations 2 (solid green) and 4 (solid orange), which correspond to the cases in which NLOS ranging values are ignored (and mitigated in the case of configuration 4).
Figure 4c,d show that with these configurations, the error increases with respect to the reference configuration (without ignoring NLOS or applying mitigation). This behavior is due to the way the NLS-based algorithm works. Like other algorithms that calculate a position directly from the trilateration equations, it cannot generate a new estimate without a minimum number of ranging values, which is four in the case of 3D positioning. Thus, when NLOS ranging values are ignored, the algorithm may lack that minimum number in several iterations, in which case no new position is generated and the previous one is maintained.
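The hold-previous behavior described above can be sketched as follows. This is a minimal illustration that assumes a plain Gauss-Newton solver for the range residuals and NumPy; the function names, the iteration count, and the stopping tolerance are assumptions, not the implementation evaluated in this work.

```python
import numpy as np


def nls_position(anchors, ranges, x0, iterations=20):
    """Gauss-Newton solution of min_x sum_i (||x - a_i|| - r_i)^2."""
    x = np.asarray(x0, dtype=float).copy()
    anchors = np.asarray(anchors, dtype=float)
    ranges = np.asarray(ranges, dtype=float)
    for _ in range(iterations):
        diff = x - anchors                                     # shape (n, 3)
        dist = np.maximum(np.linalg.norm(diff, axis=1), 1e-9)  # predicted ranges
        jacobian = diff / dist[:, None]                        # d(dist_i)/dx
        residual = ranges - dist
        step, *_ = np.linalg.lstsq(jacobian, residual, rcond=None)
        x += step
        if np.linalg.norm(step) < 1e-9:
            break
    return x


def update_position(anchors, ranges, previous, min_values=4):
    """Return a new 3D estimate, or hold the previous one if too few ranges."""
    if len(ranges) < min_values:
        return np.asarray(previous, dtype=float)  # no new position is generated
    return nls_position(anchors, ranges, previous)
```

When the target keeps moving while `update_position` keeps returning the held estimate, the location error accumulates, which is consistent with the degradation observed when NLOS values are discarded without guaranteeing four ranges.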
However, when at least four ranging values are available (dashed curves in
Figure 4c,d), although some of them have been classified as NLOS, the results again improve on the error obtained in the reference case (black curve, configuration 1). As with the IEKF, this improvement is more pronounced when mitigation is not applied and only the NLOS measurements are ignored (green dashed curve, configuration 3). With this configuration, the results are almost the same whether the NN or the k-NN is used as the classifier, which suggests that the two sets (LOS and NLOS) can be clearly separated. In the case of configuration 5 (mitigation after ignoring NLOS), the solution based on k-NN seems to give the best results for most of the measurements, but there are outliers that introduce a large error.
Figure 5a,b show a comparison of the different configurations from the point of view of the localization mean absolute error (MAE). The figure shows the values together with their
confidence interval.
Figure 5a shows the results when k-NN is used for classification and GP for mitigation. The first configurations (1 and 2) correspond to the cases in which the location is carried out using all available measurements, without applying any type of filter or mitigation. It can be seen that the values corresponding to IEKF (label 1, gray) and NLS (label 2, crossed out, gray) are practically the same, around
m. These configurations are the basis of the comparison, as they mark the expected error before applying the techniques proposed in this work.
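As an illustration of how an MAE and its confidence interval can be computed from a set of location errors, consider the sketch below. It assumes a normal approximation for the interval, which may differ from the procedure actually used for the figure, and the confidence level is left as a parameter. The function name is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def mae_with_ci(errors, confidence):
    """Return (MAE, lower bound, upper bound) under a normal approximation.

    errors     -- location errors in metres (signed or unsigned)
    confidence -- confidence level in (0, 1), supplied by the caller
    """
    abs_err = [abs(e) for e in errors]
    mae = mean(abs_err)
    # Two-sided critical value of the standard normal distribution
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * stdev(abs_err) / math.sqrt(len(abs_err))
    return mae, mae - half_width, mae + half_width
```

For small sample sizes, a Student-t critical value or a bootstrap interval would be a more conservative choice than the normal approximation used here.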
Configurations 3, 4, and 5—shown in
Figure 5a—are those in which a classification process is applied to identify the NLOS samples and remove them from the set used by the localization algorithms. When the IEKF is used (label 3, dark green), the error value decreases significantly, falling below 10 cm (
m). However, when the NLS is employed, the error increases above 75 cm. When NLOS values are eliminated, the minimum number of samples required by this algorithm in each iteration is not guaranteed; whenever it is not met, the algorithm cannot generate a new position and the previous value is maintained. If this situation persists over several consecutive iterations, the error grows quickly. In configuration 5 (crossed out, light green), a rule is applied to always maintain at least four range measurements in each iteration, filling in with NLOS measurements when there are not enough LOS values to reach that number. With this strategy, the MAE falls below the original error value. Obviously, the final value (
) is higher than that obtained with the IEKF (
) because the NLS algorithm is forced to work with a small percentage of NLOS measurements.
The configurations 6 to 8—shown in
Figure 5a—correspond to the cases in which mitigation is applied. Thus, the values labeled as 6 and 7 correspond to the cases where, after ignoring the measurements classified as NLOS, a mitigation process is applied over the remaining ones. For the IEKF case (label 6, orange), the error value is
, which is better than the original one of
, but it does not improve on the
of configuration 3, in which no mitigation was applied. As mentioned above, this is due to the differences in the relationship between ranging and RSS values found in the ML model-generation and evaluation scenarios (see
Figure 3). The configuration labeled as 7 (crossed out, orange) for the NLS increases the error for the same reasons as those explained for configuration 4. The approach in which a minimum number of ranging values is maintained (at the cost of introducing some NLOS measurements) corresponds to configuration 8 (crossed out, red). In this case, the MAE improves on the original value, but by less than in the IEKF case (
instead of
).
Finally, configurations 9 and 10 (see
Figure 5a) correspond to the cases in which NLOS values are included and mitigation is applied independently to LOS and NLOS measured values. The results are identical for both localization algorithms and are even worse than the error obtained in configurations 1 and 2, when no mitigation was applied (about 5 cm of worsening in the MAE). Once again, this result indicates that the mitigation strategy is strongly dependent on the ML model-generation scenario; hence, in a significantly different evaluation scenario, it is not able to improve the MAE.
Figure 5b shows the results corresponding to the previous configurations, but when the NN is used as classifier and mitigator. The values obtained are very similar to those provided by the k-NN and GP, especially when only the classification is applied to ignore NLOS measurements, which yields almost the same results. The main difference appears in the configurations where mitigation is applied. In this case, the version with the IEKF performs slightly worse, whereas the version with the NLS and at least four values performs slightly better. This indicates that LOS mitigation is better with GP (because it is fitted less tightly to the training data), whereas classification is better with the NN. As a consequence, the NLS version with at least four values includes NLOS values in smaller proportions, and those added values are sometimes close to being classified as LOS.
Given these results, we can conclude that, in a scenario in which the propagation condition (LOS or NLOS) is unknown, the detection of NLOS ranging values is feasible using ML models that have been trained, validated, and tested with measurements from a different scenario in which we know beforehand, for a given measured value, whether the propagation is LOS or NLOS. This allows for reuse of the proposed ML classifiers in scenarios different from those considered for the training. However, the ML mitigation model considered in this work is strongly dependent on the training scenario; hence, reusing it in a significantly different scenario is counterproductive.
5.2. Results Obtained for Model Generation and Evaluation Based on the Same Scenario
Section 5.1 compared the location results obtained in a given evaluation scenario by applying a series of ML models to filter and mitigate measurements affected by NLOS. These models were trained with data from a measurement campaign carried out in a different scenario, the ML model-generation scenario. Because of the differences between both scenarios, the models were not able to improve the baseline error values in the case of mitigation. To confirm this assumption, a new study is presented below that replicates the experiments described in
Section 5, but in a new scenario. This scenario differs from the previously used evaluation scenario (see
Figure 2b) in that the measurements obtained in it are similar to those obtained in the ML model-generation scenario (see
Figure 2a).
To do this, since it would be impossible to replicate the beacon configuration of the evaluation scenario in exactly the same way in the ML model-generation scenario, we adopted a simulation-based approach. This simulation is based on the UWB simulator described in [
8] and developed by the authors. The simulator is built on the Gazebo simulation platform [
33], a multi-platform software consisting of several components and focused on the virtual simulation of real physical environments. Five UWB anchors were placed in the same positions as in the evaluation scenario described in
Section 2.2, shown in
Figure 2b. The final model of the scenario is drawn in
Figure 6.
The reason why this simulator can be used as an approximation to the problem is that measurements from the ML model-generation scenario shown in
Figure 2a and considered in
Section 5.1 were employed to recreate the evaluation scenario shown in
Figure 2b. More specifically, the construction of the simulator was based on the following steps:
A measurement campaign was carried out in the same real-world ML model-generation scenario considered in this work (see
Figure 2a).
These measurements were used to train, validate, and test a series of ML models capable of providing estimates of ranging, ranging variance, RSS, and RSS variance from two input values: the distance between the tag and the simulated anchor, and the propagation conditions between them (LOS, NLOS-Soft, or NLOS-Hard).
Finally, a 3D ray-tracing model was developed to estimate the propagation type between the tag and each anchor. More specifically, the simulator assigns the LOS type when the signal propagates from the tag to the anchor without touching any obstacles. It assigns NLOS-Soft when the signal encounters an obstacle between them but is able to cross it (according to some configuration parameters of the simulator). The simulator assigns the NLOS-Hard type when it finds a large obstacle between the tag and the anchor but can still trace a connection between them after a bounce off a wall or the floor of the simulated scenario. Notice that, for the sake of simplicity, rebounds on the ceiling are not considered in the current version of the simulator.
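The decision rule used to label each tag-anchor link can be summarized with a short sketch. It assumes that the ray tracer has already produced the list of obstacles crossed by the direct path and a flag indicating whether a single-bounce path off a wall or the floor exists. The thickness threshold stands in for the unspecified "configuration parameters" of the simulator, and all names are hypothetical.

```python
from enum import Enum
from typing import Optional, Sequence


class Propagation(Enum):
    LOS = "LOS"
    NLOS_SOFT = "NLOS-Soft"
    NLOS_HARD = "NLOS-Hard"


def propagation_type(obstacle_thicknesses: Sequence[float],
                     max_crossable_thickness: float,
                     reflected_path_exists: bool) -> Optional[Propagation]:
    """Label a tag-anchor link from precomputed ray-tracing results.

    obstacle_thicknesses    -- thickness (m) of each obstacle on the direct path
    max_crossable_thickness -- hypothetical parameter: thicker obstacles block the signal
    reflected_path_exists   -- True if a bounce off a wall or the floor connects both ends
    """
    if not obstacle_thicknesses:
        return Propagation.LOS            # direct path touches no obstacle
    if all(t <= max_crossable_thickness for t in obstacle_thicknesses):
        return Propagation.NLOS_SOFT      # obstacles present, but the signal crosses them
    if reflected_path_exists:
        return Propagation.NLOS_HARD      # blocked directly, reachable after a bounce
    return None                           # no usable link (assumption, not stated in the text)
```

The returned label, together with the tag-anchor distance, would then be the input to the models that generate the simulated ranging and RSS values.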
Once the simulated scenario was built, ranging and RSS values were simulated for the same points in which measurements were taken in the real-world evaluation scenario. Thus, the simulated UWB tag was placed on the points of a
grid, and values were simulated during a period of 60
at each point, the same capture time as in the real case. After obtaining the simulated values, the same process as in the case with measurements was followed: NLOS classifiers and mitigators (based on k-NN, GP, and NN) were used to filter the simulated data; and finally, the results fed the localization algorithms (the one based on NLS and the IEKF). The same configurations described in
Section 5.1 were evaluated.
Figure 7 shows the error values obtained for each of the analyzed configurations (the same that can be seen in
Figure 4 for the measurements) when using the IEKF and the NLS. In the same way,
Figure 8 shows the MAE values for such configurations.
Figure 7 and
Figure 8 reveal that, in general, the values obtained by the simulator lead to error values lower than those obtained from the real measurements and shown in
Figure 4 and
Figure 5. A reference case is the
no ignore configuration, since it analyzes the positioning error without considering any previous ML technique. As detailed in
Table 4 and
Table 5, the MAE difference between the measured and simulated environment is practically the same for both location algorithms (IEKF and NLS), i.e., about 15 cm.
It is also important to check the data in the other combinations, as they give us a hint about how similar the NLOS set of values obtained in both cases is. Thus, for the IEKF case, it can be seen how the differences in the MAE with respect to the data obtained from the measurements are quite small, always below 13 cm and falling below 2 cm for the configurations in which the measurements classified as NLOS are ignored (both with k-NN and with NN). Logically, in the case of the simulated environment, the results are a little better, as they consider similar evaluation and ML model-generation environments. This corroborates the results shown in
Section 5.1, where the good performance of the LOS/NLOS classification together with its benefit in the positioning algorithm were shown.
An important difference is that, for the simulated case, the configuration that applied mitigation after ignoring the NLOS values obtains a better result than the configuration in which only the NLOS values are eliminated and no mitigation is carried out. As already explained in the analysis of the measurement results in
Section 5.1, this behavior was expected when the training set covered the entire sampling space of the evaluation scenario. In the real measurements, there were certain values in the evaluation scenario that were not represented in the ML model-generation set (see
Figure 3) since they were obtained in a different scenario; hence, the mitigation did not improve the results. With the simulation, this is different, since the simulated values came from a model based on the ranging and RSS data obtained in the ML model-generation scenario. Therefore, in the simulation, both the classifiers and the mitigators were trained with similar data; hence, their performance was expected to be much better, as reflected by the MAE results.
With respect to the MAE results obtained by using the NLS algorithm (see
Table 5), the main differences are found in the configurations where the NLOS measurements were eliminated, with or without mitigation, but without reserving a minimum of four anchors in each iteration of the algorithm. In the real measurements case, these configurations produced a very high error level (an MAE above 75 cm), mainly because too many samples were eliminated and the algorithm could not generate new positions. In the simulation, the same relation is maintained with respect to the version that reserves a minimum of four anchors (the latter reduces the MAE), but the MAE is much lower (around 16 cm). This is explained in part by the better performance of the classifier in the simulation, which leads to fewer samples being eliminated erroneously; hence, the algorithm is left without the minimum number of values required to perform the calculations on fewer occasions.