Anomaly Detection for Urban Vehicle GNSS Observation with a Hybrid Machine Learning System

: In urban areas, the accuracy and reliability of global navigation satellite systems (GNSS) vehicle positioning decline due to substantial non-line-of-sight (NLOS) signals and multipath e ﬀ ects. Recently, positioning enhancement approaches with supervised GNSS signal type classiﬁcation based on 3D building model-aided labelling have received widespread attention. Despite the reduced computing costs and improved real-time performance, the strict requirements of 3D building models on accuracy and timeliness limit the popularization of the technology to some extent. Meanwhile, the diversity of anomalous observation sources is beyond the reach of NLOS / multipath detection methods. This paper attempts to construct an alternative framework for quality identiﬁcation of GNSS observations combining clustering-based anomaly detection and supervised classiﬁcation, in which the hierarchical density-based spatial clustering of applications with noise (HDBSCAN) algorithm is used to label the o ﬄ ine dataset as normal and anomalous observations without the aid of 3D building models, and the supervised classiﬁer in the online system learns the classiﬁcation rule for real-time anomaly detection. The experimental results based on the measured vehicle GPS / BeiDou data show that after excluding anomalous observations the single point positioning accuracy of the o ﬄ ine dataset is improved by 87.0%, 45.9%, and 69.6% in the east, north, and up directions, respectively, and the positioning accuracy of two online datasets is improved by 48.4% / 45.7%, 39.6% / 63.3%, and 49.6% / 49.1% in the three directions. Through a large number of comparative experiments and discussion on key issues, it is certiﬁed that the proposed method is highly feasible and has great potential in the practical application of urban GNSS vehicle positioning.


Introduction
Global navigation satellite systems (GNSS) robust positioning in urban areas has always been a hot and problematic issue in the navigation field. This is mainly because a large number of multipath and non-line-of-sight (NLOS) effects degrade the positioning performance of a GNSS receiver. Due to the complexity of urban environments, NLOS signal is often more harmful than multipath and its detection and processing are more difficult [1]. With the rise of unmanned technology, intelligent transportation 3D building model. On this basis, the semi-supervised model or supervised classifier can be used to detect anomalous GNSS observations in real time. The implication of our study is that GNSS observations follow a certain distribution pattern on a multidimensional feature space, and high-quality and available observations generally cluster together, away from anomalous observations caused by diverse interference sources. This paper focuses on the method of vehicle GNSS observation anomaly detection in urban environments, which is a basic and front-end work of GNSS robust positioning, and the purpose of single point positioning (SPP) after excluding anomalous observations is to verify the effect of anomaly detection. Subsequent enhancement methods such as auxiliary measurement information fusion and advanced filtering algorithms are not covered. Our research also provides a new perspective on vehicle GNSS receiver autonomous integrity monitoring (RAIM) in urban environments.
GNSS robust positioning based on 3D mapping is undoubtedly one of the most representative and outstanding techniques for improving urban GNSS positioning accuracy and reliability in the past decade. 3D building model-aided labelling has also been successful in supervised NLOS/multipath detection. The novelty and contribution of this paper is to propose an urban vehicle GNSS anomalous observation detection method mining the data itself when accurate and real-time 3D building models become inaccessible. The structure of the article is as follows: (1) an anomalous observation detection framework based on hybrid machine learning is given first, and the HDBSCAN algorithm is emphasized; (2) the measured vehicle GPS/BeiDou data is used to analyze and verify the innovative method proposed in this paper, and a large number of comparative experiments are conducted; (3) several key issues about the practicality of this method are discussed; (4) the research conclusions and future work are summarized.

Feature Extraction
Proper feature values are critical to the performance of machine learning algorithms. In this paper, we refer to the characteristic parameters of NLOS/multipath detection. Previous research has shown that data-level feature values are adequate for NLOS/multipath detection. The features in this article are extracted only from RINEX format files, which contain plenty of useful information about the quality of GNSS observations. By making full use of these feature values combined with reasonable algorithms, it is possible to effectively determine anomalous observations. There has been some literature about the effects of different features on NLOS or multipath detection [11,15,19]. Here, after carefully evaluating most of the above features, eight features are chosen for observation anomaly detection.
Satellite elevation angle: Assigning the weight of each observation based on the satellite elevation angle is the simplest and most common method to reduce the impact of multipath and NLOS signal reception on positioning results. In general, the satellite signal from the higher elevation angle are less likely to be blocked and reflected by a building. However, this is not a universal truth. Affected by the height and distribution of the buildings, high elevation signals may become NLOS, and low elevation signals may become LOS. Nevertheless, the elevation angle is still an important characteristic index to distinguish the NLOS signals.
Carrier-power-to-noise-density ratio (C/N 0 ) measurement: According to the signal propagation theory, additional propagation and reflection will increase the path loss of the GNSS signal. Similar to elevation angle, signal strength or C/N 0 has a general correspondence with the type of signal [22]. C/N 0 measurement is also a commonly used parameter to mitigate multipath effects. In a complex environment, the contribution of C/N 0 -based weighting to the positioning accuracy is greater than elevation-based weighting [33]. Multi-frequency C/N 0 -based multipath detection is implemented by comparing the difference of C/N 0 measurements between different frequencies with the value expected for the signal at that elevation angle. However, this indicator is not very suitable for applications where the user moves too fast, such as GNSS vehicle positioning, because the path delay changes with the movement of the receiver antenna, often causing multipath interference to oscillate between constructive and destructive faster than the bandwidth of the C/N 0 measurement algorithm [19].
Pseudorange residual: When there are more observation equations than unknown parameters and the position estimates are sufficiently accurate, the magnitude of pseudorange residual can reflect the inconsistency between the pseudorange measurement and the geometric distance. Multi-system fusion positioning increases the number of available observed satellites and observation redundancy. Consequently, pseudorange residual can be used as an indicator to detect GNSS signal quality.
PDOP, HDOP, and VDOP: GNSS positioning accuracy usually depends on dilution of precision (DOP) and measurement error. In the case where the user equivalent ranging error (UERE) is constant, the larger DOP value is, the larger positioning error is. In a dense urban environment, a large DOP value often means a large probability of multipath effect and NLOS reception.
The number of satellites involved in the position solution: The number of available satellites to some extent indicates the quality of the observation environment at the current location, which has a direct impact on the satellite signal quality.
Pseudorange rate consistency: This feature parameter is the difference between delta pseudorange and pseudorange rate, and its mathematical expression is [15] (1) where delta pseudorange ∆ρ and delta time ∆t indicate the change of pseudorange and the time interval between two epochs, respectively. The pseudorange rate . ρ is calculated by Doppler shift based on the principle of Doppler effect as .
where superscript (s) and subscript i denote the index of satellite and frequency; λ i is the carrier wavelength; f (s) D i is the Doppler shift in unit of Hz.

Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN)
HDBSCAN is an improvement based on hierarchical clustering for DBSCAN [34,35]. The principle of the DBSCAN algorithm is that for each cluster, the sample points within a given neighborhood radius must exceed a certain threshold. It is not sensitive to noise and can find clusters of arbitrary shapes [36,37]. However, DBSCAN has two major disadvantages. First of all, it is necessary to manually set the neighborhood radius Eps and the minimum number of samples around the core point MinPts. It is difficult to find appropriate parameters when the spatial density of the samples is uneven, so the parameters may not be universal on different datasets. Secondly, DBSCAN cannot be used for clustering of large-scale data because of the huge computational overhead. HDBSCAN optimizes these problems to reduce the running cost and sensitivity of the algorithm to parameters. What is more, HDBSCAN can handle clustering with different densities. According to the tutorial of the open-source code, the specific implementation steps of HDBSCAN are as follows [38]: (1) Transform the space according to the density/sparsity. The mutual reachability distance is used to represent the distance between two sample points, so that the distance between the sample points in the sparse area and other points is enlarged, which reduces the dependence of clustering on Eps. The expression of mutual reachability distance is where d(a, b) is the original metric distance between a and b; core k (a) denotes the distance of k th nearest neighbor from a. Therefore, dense points (with low core distance) remain the same distance from each other, while sparser points are pushed away to be at least core k away from any other point.
(2) Build the minimum spanning tree of the distance weighted graph. The sample data is treated as a weighted graph, with mutual reachability distance as the weight of the connection edge. A minimal set of edges is found so that removing any edges from the set will split the graph. This minimum set of edges is the minimum spanning tree of the graph, which can be achieved quickly and efficiently with Prim's algorithm [39].
(3) Construct a cluster hierarchy of connected components. The above minimum spanning tree is converted into the hierarchy of connected components at this stage. The implementation method is to sort the edges of the tree by distance in increasing order, and then traverse to create a new merged cluster for each edge. To obtain a set of flat clusters, we need to know the conditions for terminating the clustering. Therefore, the key to HDBSCAN is how to cut the tree at different places to select the clusters for variable density samples.
(4) Condense the cluster hierarchy based on minimum cluster size. As the most important parameter of HDBSCAN, once MinClusterSize is determined, the minimum spanning tree can be traversed from top to bottom. When each node is split, if the number of sample points of the sub-cluster is less than MinClusterSize, then the samples of this sub-cluster are marked as -1 for "outlier" and deleted. After traversing the entire cluster tree, a new tree with a small number of nodes is finally obtained.
(5) Extract the stable clusters from the condensed tree. Unlike the one-size-fits-all cluster selection method of DBSCAN, HDBSCAN introduces the stability indicator. Here the parameter λ is defined as the reciprocal of distance. Specifically, there are two measures for a node in the tree and a measure for a point in the node: • λ birth : the lambda value when the cluster is formed • λ death : the lambda value when the cluster is split into two sub-clusters • λ p : the lambda value when that point is separated from the cluster where λ birth < λ p < λ death . For each cluster compute the stability as The cluster selection follows this principle: if the sum of the stabilities of the sub-clusters is greater than the stability of the cluster, then the cluster stability is set to be the sum of the sub stabilities; otherwise, the cluster is declared as the selected cluster and all its descendants are deleted. When traversing to the root node, the current set of selected clusters is the flat clustering, namely, the final clustering result.
The HDBSCAN class has a large number of parameters that can be set during initialization, but in practice, few parameters have a significant practical impact on clustering. There are mainly two parameters that affect the results of anomaly detection, where MinClusterSize is the minimum size of clusters and MinSamples stands for the number of samples in a neighborhood for a point to be considered a core point. The number of clusters can be reduced by increasing MinClusterSize. The larger MinSamples is, the more points are considered outliers, and clusters will be restricted to more dense areas.

Hybrid Machine Learning Framework for GNSS Observation Anomaly Detection
Most individual anomaly detection algorithms, including clustering, are based on post-processing (autonomously learning rules from a large amount of unlabeled sample data), so it is difficult to apply them directly in real time. However, GNSS positioning is more applied in real-time scenarios, especially for vehicle dynamic positioning in the urban environment. To cope with this problem, we designed a real-time detection framework for anomalous GNSS observations, which consists of two major parts, namely an offline learning system and an online learning system. The former provides the latter with prior knowledge of learning. The specific algorithm flow is shown in Figure 1. In the offline system, considerable urban vehicle GNSS observations are collected, which naturally contains a certain number of anomalous observations. Considering that unsupervised anomaly detection algorithms always work better for the dataset with a small proportion of outliers, we performed a preliminary screening of the original observations based on the Chi-square test. The verification formulas of positioning results are as follows where is the standard deviation of observation ; is weighted residual; denotes the weighted sum of the squared errors (WSSE) based on residuals; is the number of estimated parameters and is the number of measurements. − is Chi-square distribution of the degree of freedom − and is false alarm rate. Herein, the value of is set to 0.1% [40]. When meeting the condition, the epoch is marked in the raw training set, otherwise it is assigned to the validation set, which can be verified using HDBSCAN prediction and the online classifier described later. Since the Chi-square test cannot completely exclude anomalous observations, the remaining anomalies help the implementation of the anomaly detection algorithm. After feature extraction and preprocessing, the training set is used as the input for HDBSCAN clustering. There are two ways to determine the ideal anomaly detection results, and the detailed description is given in Section 3.2.1. At this point, the offline labelled database is created. When ground truth values are employed to seek the best parameters here, we can call this process quasi unsupervised learning. Based on the previous research results, when the labels of the training set are accurate enough, the supervised classifiers can always show a good NLOS/multipath recognition performance. This confirms that the accuracy of the prior knowledge and the quality of the training set data are more important than subsequent classification algorithms [41]. Therefore, the establishment of a robust HDBSCAN clustering model for anomalies is the focus of this paper.
In the online system, the semi-supervised model or supervised binary classifier generates classification rules by learning offline labelled data. When the new GNSS observations are input into In the offline system, considerable urban vehicle GNSS observations are collected, which naturally contains a certain number of anomalous observations. Considering that unsupervised anomaly detection algorithms always work better for the dataset with a small proportion of outliers, we performed a preliminary screening of the original observations based on the Chi-square test. The verification formulas of positioning results are as follows v s = P s r − ρ s r + c·dt r − c·dT s + I s where σ s is the standard deviation of observation s; v s is weighted residual; v T v denotes the weighted sum of the squared errors (WSSE) based on residuals; n is the number of estimated parameters and m is the number of measurements. χ 2 α (m − n) is Chi-square distribution of the degree of freedom m − n and α is false alarm rate. Herein, the value of α is set to 0.1% [40]. When meeting the condition, the epoch is marked in the raw training set, otherwise it is assigned to the validation set, which can be verified using HDBSCAN prediction and the online classifier described later. Since the Chi-square test cannot completely exclude anomalous observations, the remaining anomalies help the implementation of the anomaly detection algorithm. After feature extraction and preprocessing, the training set is used as the input for HDBSCAN clustering. There are two ways to determine the ideal anomaly detection results, and the detailed description is given in Section 3.2.1. At this point, the offline labelled database is created. When ground truth values are employed to seek the best parameters here, we can call this process quasi unsupervised learning. Based on the previous research results, when the labels of the training set are accurate enough, the supervised classifiers can always show a good NLOS/multipath recognition performance. This confirms that the accuracy of the prior knowledge and the quality of the training set data are more important than subsequent classification algorithms [41]. Therefore, the establishment of a robust HDBSCAN clustering model for anomalies is the focus of this paper.
In the online system, the semi-supervised model or supervised binary classifier generates classification rules by learning offline labelled data. When the new GNSS observations are input into the classifier, the anomaly detection results can be obtained in real time. Several popular supervised classifiers without parameter fine-tuning are used to verify the system in this article. Finally, like the offline system, we evaluate the effectiveness of anomaly detection with the accuracy and availability of SPP after excluding anomalous observations. Generally, to obtain better real-time positioning accuracy, the Chi-square test can be continued on the final positioning result according to Equation (7). However, the positioning results obtained by the hybrid learning method usually have very few position solutions that fail to pass the verification. When the computing capacity of the device can meet the real-time requirements, for the results that do not pass the Chi-square test, we can update the feature values of their observations and continue to use the online system for further anomaly detection and exclusion. It is generally recommended to iterate only once.
It is worth mentioning that the offline system can learn new HDBSCAN rules and update the labelled database by constantly adding new observations. Training data from more scenarios tend to benefit the performance improvement of the clustering algorithm. Since the exclusion of anomalous observations may cause the number of satellites in the epoch to be too small to calculate position, to facilitate the performance verification of the proposed algorithm, we chose GPS/BeiDou dual-system observations for experiments.

Data Acquisition and Preprocessing
In this paper, we employed vehicle GNSS observations in typical urban environments for experimental verification. The experimental platform and test environment are shown in Figure 2. The external antenna on the roof of the vehicle is connected to different GNSS receivers through a power splitter. Meanwhile, the platform is equipped with a NovAtel's high-performance tactical grade inertial measurement unit (IMU) ISA-100C for tightly combining GNSS RTK/INS measurements to obtain the calibration values of 3D position, velocity and attitude, which are used to subsequently verify the improvement of positioning accuracy by excluding anomalous observations. The calibration solutions are implemented by post-processing through a high-precision tight combination algorithm from NovAtel Inertial Explorer software. The NovAtel receiver of the rover station and the Trimble receivers of the three base stations contribute the carrier phase observations in tightly combined positioning. It should be noted that due to the heterogeneity between different types of receivers (different hardware configurations and signal processing algorithms), they differ to some extent in terms of satellite signal acquisition and tracking, signal reception strength, observation quality, etc. Therefore, to better demonstrate the universality performance of the proposed algorithm, we leveraged ComNav K508 GNSS OEM board to collect the RINEX data of the training set, while the data collected by NovAtel ProPak6 receiver was used as the test set. Besides, the time interval between the acquisition of the training set and the test set is more than four months. In this case, the spatial-temporal correlation between the datasets will be weakened, which is conducive to a more objective evaluation of algorithm performance.
Remote Sens. 2020, 12, x FOR PEER REVIEW 8 of 26 the classifier, the anomaly detection results can be obtained in real time. Several popular supervised classifiers without parameter fine-tuning are used to verify the system in this article. Finally, like the offline system, we evaluate the effectiveness of anomaly detection with the accuracy and availability of SPP after excluding anomalous observations. Generally, to obtain better real-time positioning accuracy, the Chi-square test can be continued on the final positioning result according to Equation (7). However, the positioning results obtained by the hybrid learning method usually have very few position solutions that fail to pass the verification. When the computing capacity of the device can meet the real-time requirements, for the results that do not pass the Chi-square test, we can update the feature values of their observations and continue to use the online system for further anomaly detection and exclusion. It is generally recommended to iterate only once. It is worth mentioning that the offline system can learn new HDBSCAN rules and update the labelled database by constantly adding new observations. Training data from more scenarios tend to benefit the performance improvement of the clustering algorithm. Since the exclusion of anomalous observations may cause the number of satellites in the epoch to be too small to calculate position, to facilitate the performance verification of the proposed algorithm, we chose GPS/BeiDou dual-system observations for experiments.

Data Acquisition and Preprocessing
In this paper, we employed vehicle GNSS observations in typical urban environments for experimental verification. The experimental platform and test environment are shown in Figure 2. The external antenna on the roof of the vehicle is connected to different GNSS receivers through a power splitter. Meanwhile, the platform is equipped with a NovAtel's high-performance tactical grade inertial measurement unit (IMU) ISA-100C for tightly combining GNSS RTK/INS measurements to obtain the calibration values of 3D position, velocity and attitude, which are used to subsequently verify the improvement of positioning accuracy by excluding anomalous observations. The calibration solutions are implemented by post-processing through a high-precision tight combination algorithm from NovAtel Inertial Explorer software. The NovAtel receiver of the rover station and the Trimble receivers of the three base stations contribute the carrier phase observations in tightly combined positioning. It should be noted that due to the heterogeneity between different types of receivers (different hardware configurations and signal processing algorithms), they differ to some extent in terms of satellite signal acquisition and tracking, signal reception strength, observation quality, etc. Therefore, to better demonstrate the universality performance of the proposed algorithm, we leveraged ComNav K508 GNSS OEM board to collect the RINEX data of the training set, while the data collected by NovAtel ProPak6 receiver was used as the test set. Besides, the time interval between the acquisition of the training set and the test set is more than four months. In this case, the spatial-temporal correlation between the datasets will be weakened, which is conducive to a more objective evaluation of algorithm performance.  The average speed of the vehicle was about 30 km/h, and the maximum speed was 50 km/h. In addition to the dynamic observations, the platform also recorded a small amount of static data when stopping at the intersections. The sampling rate of data is 1 Hz. We only process single-frequency observations from GPS L1 and BeiDou B1 bands. Figure 3 shows the vehicle routes corresponding to three segments of GNSS observations, and the three sets of data are labelled D1, D2, and D3 in chronological order, where D1 is composed of the training set and the validation test, while D2 and D3 are the test sets. Details will be described later. The location where the vehicle travelled is the urban area of Nanjing, and the typical scenarios include urban canyon, semi-urban, tunnel, etc. Many roads in Nanjing are covered by tall London plane trees and other leafy trees. As shown in the figure, D2 and D1 were collected at different places, and the trajectory of D3 had an overlap with D1. Both D2 and D3 are over four months apart with D1.
Remote Sens. 2020, 12, x FOR PEER REVIEW 9 of 26 The average speed of the vehicle was about 30 km/h, and the maximum speed was 50 km/h. In addition to the dynamic observations, the platform also recorded a small amount of static data when stopping at the intersections. The sampling rate of data is 1 Hz. We only process single-frequency observations from GPS L1 and BeiDou B1 bands. Figure 3 shows the vehicle routes corresponding to three segments of GNSS observations, and the three sets of data are labelled D1, D2, and D3 in chronological order, where D1 is composed of the training set and the validation test, while D2 and D3 are the test sets. Details will be described later. The location where the vehicle travelled is the urban area of Nanjing, and the typical scenarios include urban canyon, semi-urban, tunnel, etc. Many roads in Nanjing are covered by tall London plane trees and other leafy trees. As shown in the figure, D2 and D1 were collected at different places, and the trajectory of D3 had an overlap with D1. Both D2 and D3 are over four months apart with D1. The skyplots of three datasets are also given in Figure 4. Although the path of D3 is included in D1, the satellite distributions of D1 and D3 are still quite different because of unknown changes in the surroundings of overlapping areas and the existence of satellite orbit period. Consequently, there will not be a large number of repeated or extremely approximate feature values between them to affect the reliability verification of the algorithm. Considering the environmental similarity between D3 and D1, we employed D3 as another test set to compare the detection and exclusion effect with D2. The three sets of data are processed to extract feature values, respectively. In particular, for the integrity of the sample data, we do not set the satellite elevation mask and C/N0 mask during the positioning process, which helps expand the range and diversity of feature values. Details of the The skyplots of three datasets are also given in Figure 4. Although the path of D3 is included in D1, the satellite distributions of D1 and D3 are still quite different because of unknown changes in the surroundings of overlapping areas and the existence of satellite orbit period. Consequently, there will not be a large number of repeated or extremely approximate feature values between them to affect the reliability verification of the algorithm. Considering the environmental similarity between D3 and D1, we employed D3 as another test set to compare the detection and exclusion effect with D2.
Remote Sens. 2020, 12, x FOR PEER REVIEW 9 of 26 The average speed of the vehicle was about 30 km/h, and the maximum speed was 50 km/h. In addition to the dynamic observations, the platform also recorded a small amount of static data when stopping at the intersections. The sampling rate of data is 1 Hz. We only process single-frequency observations from GPS L1 and BeiDou B1 bands. Figure 3 shows the vehicle routes corresponding to three segments of GNSS observations, and the three sets of data are labelled D1, D2, and D3 in chronological order, where D1 is composed of the training set and the validation test, while D2 and D3 are the test sets. Details will be described later. The location where the vehicle travelled is the urban area of Nanjing, and the typical scenarios include urban canyon, semi-urban, tunnel, etc. Many roads in Nanjing are covered by tall London plane trees and other leafy trees. As shown in the figure, D2 and D1 were collected at different places, and the trajectory of D3 had an overlap with D1. Both D2 and D3 are over four months apart with D1. The skyplots of three datasets are also given in Figure 4. Although the path of D3 is included in D1, the satellite distributions of D1 and D3 are still quite different because of unknown changes in the surroundings of overlapping areas and the existence of satellite orbit period. Consequently, there will not be a large number of repeated or extremely approximate feature values between them to affect the reliability verification of the algorithm. Considering the environmental similarity between D3 and D1, we employed D3 as another test set to compare the detection and exclusion effect with D2. The three sets of data are processed to extract feature values, respectively. In particular, for the integrity of the sample data, we do not set the satellite elevation mask and C/N0 mask during the positioning process, which helps expand the range and diversity of feature values. Details of the The three sets of data are processed to extract feature values, respectively. In particular, for the integrity of the sample data, we do not set the satellite elevation mask and C/N 0 mask during the positioning process, which helps expand the range and diversity of feature values. Details of the datasets are listed in Table 1. The title valid epoch refers to the number of epochs involved in the location resolution. Due to the deterioration of the observation environment, the satellite signal loss of lock will occur, and when the number of visible satellites is less than 5, the dual-system pseudorange single point positioning cannot be performed. These epochs are classified as invalid epochs. Nevertheless, in the urban vehicular environment, there are a considerable number of anomalous GNSS observations at valid epochs. How to effectively identify and eliminate them is exactly what aimed to study in this paper. We further divide D1 into two parts. The first part contains the epochs in which WSSEs pass the Chi-square test, and the second part is the opposite. As mentioned above, we need to preliminarily screen the training set for more accurate anomaly detection results. The segmented data are shown in Table 2. The subscripts a and b indicate the data of the first part and the second part, respectively. D1a is the training set, and the rest are considered the validation set. In the three datasets, epochs that do not meet the Chi-square test account for 4.4%, 10.0%, and 7.2%, respectively. The observation environment of D2 is worse than the other two. In the traditional RAIM fault detection and exclusion (FDE) algorithm, consistency checking based on the redundancy of range measurements is applied to recovering epochs conflicting the Chi-square test to improve the reliability and availability of positioning results. However, the classical algorithm often fails in urban areas [42]. Therefore, it is also crucial to restore the second part of observations reasonably and effectively, especially for the continuity and integrity of dynamic positioning. To keep the features of different value ranges at the same numerical magnitude and reduce the influence of the features with large variance on the model, the feature values were standardized so that the mean value is 0 and the variance is 1. On this basis, principal component analysis (PCA) [43] is used to extract key features and improve learning speed. What is more, PCA ensures that these variables are independent of each other to avoid the instability of the solution space. The explained variance ratio represents the contribution proportion of each principal component axis to the variance of the entire dataset. In D1a, for example, the first six principal components cover more than 98% of the training set information, as shown in Figure 5. Therefore, we used these six principal components instead of the original feature values for learning. It should be noted that when testing the algorithm performance with the test sets, the feature values of the test sets must be standardized using the mean and variance parameters calculated by the training test. Similarly, we should use the dimension reduction matrix obtained from the training set to reduce the dimension of the test sets. Remote Sens. 2020, 12, x FOR PEER REVIEW 11 of 26

Anomaly Detection Based on HDBSCAN
In this section, we will show in detail the results of detecting anomalous GPS/BeiDou observations on the dataset D1 using HDBSCAN algorithms. This contains a critical post-processing process, which is also the focus of this article, laying the foundation for subsequent supervised learning and real-time applications. At the same time, changes in positioning accuracy and availability after excluding anomalous observations are also analyzed.

Results of D1a
The data subset D1a is first processed using the HDBSCAN algorithm. By parameter tuning, the dataset is grouped into one category as far as possible. The criterion for parameter determination is that as outliers evolve from less to more, the maximum positioning errors in the east and north directions are both less than 10 m for the first time after excluding anomalous observations. This is a relatively conservative approach to reduce false positive rate. Another effective method is to extract the upper quantiles according to the probability distribution of outlier scores in HDBSCAN that describes the possibility of the point becoming an outlier to determine the outlier boundary. In this paper, the maximum positioning error was used to determine the parameters of the clustering model because of the available ground truth values. To be specific, we first fixed MinSamples to the default value and adjusted MinClusterSize. As described in the last paragraph of Section 2.2, the number of clusters can be reduced by increasing MinClusterSize. Therefore, we set the parameter in ascending order. When MinClusterSize is 60, the number of clusters starts to converge to 2 (the rest are outliers), and there are very few sample points in one of the clusters. At this point, we started to adjust MinSamples. The larger MinSamples is, the more points are considered anomalies. To reduce false positive rate, after repeated SPP experiments with anomaly exclusion, we finally set MinSamples to 8, and the results exactly meet the above-mentioned maximum plane error conditions. Figure 6 shows the 3D clustering results when the MinClusterSize is 60 and the MinSamples is 8. For better visualization, the first three principal components (PC) were set as the coordinate axes. It can be seen that the dataset is clustered into two categories, Cluster 1 (blue points) and Cluster 2 (green points), labelled 1 and 0, respectively. Here, the outlier samples labelled -1 are considered "anomalous observations" (red points). However, this is just an intuitive preliminary labelling result, because outliers or anomalies do not necessarily mean anomalous GNSS observations. In some extreme circumstances, anomalous observations in the dataset may behave more like non-outliers than normal observations. Without prior knowledge, there may be a risk that the anomalous observations

Anomaly Detection Based on HDBSCAN
In this section, we will show in detail the results of detecting anomalous GPS/BeiDou observations on the dataset D1 using HDBSCAN algorithms. This contains a critical post-processing process, which is also the focus of this article, laying the foundation for subsequent supervised learning and real-time applications. At the same time, changes in positioning accuracy and availability after excluding anomalous observations are also analyzed.

Results of D1a
The data subset D1a is first processed using the HDBSCAN algorithm. By parameter tuning, the dataset is grouped into one category as far as possible. The criterion for parameter determination is that as outliers evolve from less to more, the maximum positioning errors in the east and north directions are both less than 10 m for the first time after excluding anomalous observations. This is a relatively conservative approach to reduce false positive rate. Another effective method is to extract the upper quantiles according to the probability distribution of outlier scores in HDBSCAN that describes the possibility of the point becoming an outlier to determine the outlier boundary. In this paper, the maximum positioning error was used to determine the parameters of the clustering model because of the available ground truth values. To be specific, we first fixed MinSamples to the default value and adjusted MinClusterSize. As described in the last paragraph of Section 2.2, the number of clusters can be reduced by increasing MinClusterSize. Therefore, we set the parameter in ascending order. When MinClusterSize is 60, the number of clusters starts to converge to 2 (the rest are outliers), and there are very few sample points in one of the clusters. At this point, we started to adjust MinSamples. The larger MinSamples is, the more points are considered anomalies. To reduce false positive rate, after repeated SPP experiments with anomaly exclusion, we finally set MinSamples to 8, and the results exactly meet the above-mentioned maximum plane error conditions. Figure 6 shows the 3D clustering results when the MinClusterSize is 60 and the MinSamples is 8. For better visualization, the first three principal components (PC) were set as the coordinate axes. It can be seen that the dataset is clustered into two categories, Cluster 1 (blue points) and Cluster 2 (green points), labelled 1 and 0, respectively. Here, the outlier samples labelled -1 are considered "anomalous observations" (red points). However, this is just an intuitive preliminary labelling result, because outliers or anomalies do not necessarily mean anomalous GNSS observations. In some extreme circumstances, anomalous observations in the dataset may behave more like non-outliers than normal observations. Without prior knowledge, there may be a risk that the anomalous observations are wrongly regarded as "normal", while the normal observations become "anomalous". Therefore, for the sake of insurance, it is necessary to further confirm the clustering results according to the probability distributions of the feature values.
Remote Sens. 2020, 12, x FOR PEER REVIEW 12 of 26 are wrongly regarded as "normal", while the normal observations become "anomalous". Therefore, for the sake of insurance, it is necessary to further confirm the clustering results according to the probability distributions of the feature values.  The four variables chosen for illustration are satellite elevation angle, pseudorange residual, C/N0 measurement, and pseudorange rate consistency. According to the probability distributions of the last three variables, it can be determined that the recognition results of anomalous GNSS observations above are generally consistent with prior statistical knowledge. Specifically, the anomalous observations are characterized in probability by the largest absolute pseudorange residual, the lowest C/N0 measurement and the largest magnitude of pseudorange rate difference, which is highly similar to the properties of NLOS signals in the urban environments. Meanwhile, the satellite elevation angle has a weak ability to discern the types of GNSS observations. However, it can still be seen that points with medium-low elevation angles (less than 50°) account for the majority of the anomalies. As for Cluster 1 and Cluster 2, the samples of Cluster 2 are mainly concentrated in the region with higher elevation angle, smaller residual and larger C/N0 measurement. These samples can be defined as high-quality observations, which are most likely derived from completely contamination-free LOS signals under ideal observation conditions. Cluster 1 is a collection of those between high-quality observations and anomalous observations. Due to the interference of low multipath effect or other potential factors, its quality as a whole is not as good as that of Cluster 2, but it can still be used for position solution. In D1, the number of anomalies and non-anomalies is 677 and 18,892, respectively.  The four variables chosen for illustration are satellite elevation angle, pseudorange residual, C/N 0 measurement, and pseudorange rate consistency. According to the probability distributions of the last three variables, it can be determined that the recognition results of anomalous GNSS observations above are generally consistent with prior statistical knowledge. Specifically, the anomalous observations are characterized in probability by the largest absolute pseudorange residual, the lowest C/N 0 measurement and the largest magnitude of pseudorange rate difference, which is highly similar to the properties of NLOS signals in the urban environments. Meanwhile, the satellite elevation angle has a weak ability to discern the types of GNSS observations. However, it can still be seen that points with medium-low elevation angles (less than 50 • ) account for the majority of the anomalies. As for Cluster 1 and Cluster 2, the samples of Cluster 2 are mainly concentrated in the region with higher elevation angle, smaller residual and larger C/N 0 measurement. These samples can be defined as high-quality observations, which are most likely derived from completely contamination-free LOS signals under ideal observation conditions. Cluster 1 is a collection of those between high-quality observations and anomalous observations. Due to the interference of low multipath effect or other potential factors, its quality as a whole is not as good as that of Cluster 2, but it can still be used for position solution. In D1, the number of anomalies and non-anomalies is 677 and 18,892, respectively. Remote Sens. 2020, 12, x FOR PEER REVIEW 13 of 26 Being short of labels, there is no unified evaluation index for clustering results. However, the goal of clustering in this paper is clear, namely, to improve the accuracy and reliability of GNSS positioning in urban environments, so the root mean square error (RMSE) of positioning results after excluding anomalous observations was used as an index to evaluate the performance of clustering. The high-precision RTK/INS tightly combined solution was considered as ground truth values. The single point positioning solutions in this paper were obtained using the weighted least squares (WLS) method based on satellite elevation angle. Figure 8 and Table 3 show the improvement in positioning results after HDBSCAN was used to eliminate anomalous observations. The original positioning RMSE without anomaly detection is 3.05, 2.41, and 9.88 m in the east, north, and up directions, respectively. In contrast, the positioning RMSE in the three directions after excluding anomalous observations is 1.09, 2.10, and 6.17 m, with accuracy improvements of 64.3%, 12.9%, and 37.6%. As the vehicle track generally goes from south to north, the positioning error in the east direction is relatively large compared with the north direction [44]. The experimental results show that the removal of anomalies has a better effect on the improvement of positioning accuracy in the east direction, which indicates that HDBSCAN is not only effective to identify GNSS anomalous observations but also consistent with the real situation. The continuity of dynamic positioning results is also of crucial importance while ensuring the positioning accuracy. When there are too few GNSS constellations involved in position solution, it is not advisable to lose a large number of original valid epochs by blindly pursuing the positioning accuracy. In the dataset D1a, the number of valid epochs is 1705, and after excluding anomalous observations the number becomes 1655, making the observed data available up to 97.1%. From Figure 9, besides, as the number of satellites participating in position calculation is reduced due to the exclusion of anomalous observations, the geometric dilution of precision (GDOP) has unsurprisingly risen overall. Even so, the positioning accuracy has been improved. Therefore, this method can effectively identify Being short of labels, there is no unified evaluation index for clustering results. However, the goal of clustering in this paper is clear, namely, to improve the accuracy and reliability of GNSS positioning in urban environments, so the root mean square error (RMSE) of positioning results after excluding anomalous observations was used as an index to evaluate the performance of clustering. The high-precision RTK/INS tightly combined solution was considered as ground truth values. The single point positioning solutions in this paper were obtained using the weighted least squares (WLS) method based on satellite elevation angle. Figure 8 and Table 3 show the improvement in positioning results after HDBSCAN was used to eliminate anomalous observations. The original positioning RMSE without anomaly detection is 3.05, 2.41, and 9.88 m in the east, north, and up directions, respectively. In contrast, the positioning RMSE in the three directions after excluding anomalous observations is 1.09, 2.10, and 6.17 m, with accuracy improvements of 64.3%, 12.9%, and 37.6%. As the vehicle track generally goes from south to north, the positioning error in the east direction is relatively large compared with the north direction [44]. The experimental results show that the removal of anomalies has a better effect on the improvement of positioning accuracy in the east direction, which indicates that HDBSCAN is not only effective to identify GNSS anomalous observations but also consistent with the real situation. The continuity of dynamic positioning results is also of crucial importance while ensuring the positioning accuracy. When there are too few GNSS constellations involved in position solution, it is not advisable to lose a large number of original valid epochs by blindly pursuing the positioning accuracy. In the dataset D1a, the number of valid epochs is 1705, and after excluding anomalous observations the number becomes 1655, making the observed data available up to 97.1%. From Figure 9, besides, as the number of satellites participating in position calculation is reduced due to the exclusion of anomalous observations, the geometric dilution of precision (GDOP) has unsurprisingly risen overall. Even so, the positioning accuracy has been improved. Therefore, this method can effectively identify anomalies in GNSS observations. Moreover, the epochs after using HDBSCAN to remove anomalous observations all meet the Chi-square test.
Remote Sens. 2020, 12, x FOR PEER REVIEW 14 of 26 anomalies in GNSS observations. Moreover, the epochs after using HDBSCAN to remove anomalous observations all meet the Chi-square test.   The proposed method can improve the positioning accuracy in two ways. As can be seen from Figure 10, in harsh environments, fewer satellites can be observed and a large number of anomalous observations are mixed into the observation epochs, resulting in huge positioning errors. After anomaly exclusion, there are not enough GPS or BeiDou satellites available for the dual-system positioning process, so these epochs no longer output their positioning results. On the other hand, as  anomalies in GNSS observations. Moreover, the epochs after using HDBSCAN to remove anomalous observations all meet the Chi-square test.   The proposed method can improve the positioning accuracy in two ways. As can be seen from Figure 10, in harsh environments, fewer satellites can be observed and a large number of anomalous observations are mixed into the observation epochs, resulting in huge positioning errors. After anomaly exclusion, there are not enough GPS or BeiDou satellites available for the dual-system positioning process, so these epochs no longer output their positioning results. On the other hand, as The proposed method can improve the positioning accuracy in two ways. As can be seen from Figure 10, in harsh environments, fewer satellites can be observed and a large number of anomalous observations are mixed into the observation epochs, resulting in huge positioning errors. After anomaly exclusion, there are not enough GPS or BeiDou satellites available for the dual-system positioning process, so these epochs no longer output their positioning results. On the other hand, as shown in Figure 11, when there are sufficient visible satellites and considerable anomalous observations received, the remaining satellites can still participate in the position solution even if the removal of anomalies weakened the satellite geometry, and the positioning accuracy is substantially improved, especially in the east direction. In the former case, some valid epochs are lost due to insufficient normal observations caused by the poor observation environment. However, incorrect coordinate solutions are avoided. Since normal observations are preserved, these epochs can be complemented using their normal observations through advanced filtering algorithms and enhanced information from other sensors or measurements.
Remote Sens. 2020, 12, x FOR PEER REVIEW 15 of 26 shown in Figure 11, when there are sufficient visible satellites and considerable anomalous observations received, the remaining satellites can still participate in the position solution even if the removal of anomalies weakened the satellite geometry, and the positioning accuracy is substantially improved, especially in the east direction. In the former case, some valid epochs are lost due to insufficient normal observations caused by the poor observation environment. However, incorrect coordinate solutions are avoided. Since normal observations are preserved, these epochs can be complemented using their normal observations through advanced filtering algorithms and enhanced information from other sensors or measurements.
Ground truth Original HDBSCAN Figure 10. Epochs with insufficient normal observations are discarded to avoid wrong position solutions in harsh environments.

Ground truth
Original HDBSCAN Figure 11. The positioning accuracy is greatly improved at the epochs with enough normal observations after anomaly exclusion.
In the traditional GNSS single point positioning process, the satellite elevation mask and C/N0 mask are usually set to exclude the observations with poor quality. Therefore, this simple and direct method is also used for comparative experiments. The experimental results are listed in Table 4. When the C/N0 mask is set to 50 dB-Hz, the position solutions of only eight epochs can be obtained, and the availability of high-quality observations is greatly reduced, so this result is ignored. By comparison, it can be found that when the elevation angle mask and C/N0 mask are set to 25° and 0 dB-Hz, respectively, the data subset D1a has the minimum two-dimensional plane positioning error. Nevertheless, the positioning accuracy has not been significantly improved compared with the original result. Some conclusions can be drawn here. Firstly, in urban vehicle-mounted scenarios, the C/N0 measurement has a higher resolution in the positioning results compared with the satellite elevation angle. Specifically, the change of the elevation mask has little influence on the positioning result under the condition of a fixed C/N0 mask until the elevation mask reaches 40°. This also indicates that in the urban environment, the observations are mainly concentrated below the C/N0 measurement of 50dB-Hz and above the elevation of 35°. Secondly, anomalous GNSS observations cannot be effectively eliminated by setting elevation mask and C/N0 mask in a complex urban shown in Figure 11, when there are sufficient visible satellites and considerable anomalous observations received, the remaining satellites can still participate in the position solution even if the removal of anomalies weakened the satellite geometry, and the positioning accuracy is substantially improved, especially in the east direction. In the former case, some valid epochs are lost due to insufficient normal observations caused by the poor observation environment. However, incorrect coordinate solutions are avoided. Since normal observations are preserved, these epochs can be complemented using their normal observations through advanced filtering algorithms and enhanced information from other sensors or measurements.
Ground truth Original HDBSCAN Figure 10. Epochs with insufficient normal observations are discarded to avoid wrong position solutions in harsh environments.

Ground truth
Original HDBSCAN Figure 11. The positioning accuracy is greatly improved at the epochs with enough normal observations after anomaly exclusion.
In the traditional GNSS single point positioning process, the satellite elevation mask and C/N0 mask are usually set to exclude the observations with poor quality. Therefore, this simple and direct method is also used for comparative experiments. The experimental results are listed in Table 4. When the C/N0 mask is set to 50 dB-Hz, the position solutions of only eight epochs can be obtained, and the availability of high-quality observations is greatly reduced, so this result is ignored. By comparison, it can be found that when the elevation angle mask and C/N0 mask are set to 25° and 0 dB-Hz, respectively, the data subset D1a has the minimum two-dimensional plane positioning error. Nevertheless, the positioning accuracy has not been significantly improved compared with the original result. Some conclusions can be drawn here. Firstly, in urban vehicle-mounted scenarios, the C/N0 measurement has a higher resolution in the positioning results compared with the satellite elevation angle. Specifically, the change of the elevation mask has little influence on the positioning result under the condition of a fixed C/N0 mask until the elevation mask reaches 40°. This also indicates that in the urban environment, the observations are mainly concentrated below the C/N0 measurement of 50dB-Hz and above the elevation of 35°. Secondly, anomalous GNSS observations cannot be effectively eliminated by setting elevation mask and C/N0 mask in a complex urban In the traditional GNSS single point positioning process, the satellite elevation mask and C/N 0 mask are usually set to exclude the observations with poor quality. Therefore, this simple and direct method is also used for comparative experiments. The experimental results are listed in Table 4. When the C/N 0 mask is set to 50 dB-Hz, the position solutions of only eight epochs can be obtained, and the availability of high-quality observations is greatly reduced, so this result is ignored. By comparison, it can be found that when the elevation angle mask and C/N 0 mask are set to 25 • and 0 dB-Hz, respectively, the data subset D1a has the minimum two-dimensional plane positioning error. Nevertheless, the positioning accuracy has not been significantly improved compared with the original result. Some conclusions can be drawn here. Firstly, in urban vehicle-mounted scenarios, the C/N 0 measurement has a higher resolution in the positioning results compared with the satellite elevation angle. Specifically, the change of the elevation mask has little influence on the positioning result under the condition of a fixed C/N 0 mask until the elevation mask reaches 40 • . This also indicates that in the urban environment, the observations are mainly concentrated below the C/N 0 measurement of 50dB-Hz and above the elevation of 35 • . Secondly, anomalous GNSS observations cannot be effectively eliminated by setting elevation mask and C/N 0 mask in a complex urban environment, because the factors affecting the quality of GNSS observations are various. In addition, appropriate cut-off values are not easy to find. If the mask is set too large, the satellite geometric distribution will deteriorate and the number of visible satellites will decrease, thus reducing the positioning accuracy and availability. For this part of data, the traditional RAIM FDE algorithm recalculates the position by excluding, one by one, the visible satellites at each epoch. Once meeting the Chi-square test, the position solutions of these epochs are output. However, in epochs with a large number of anomalous observations, the algorithm usually fails. Since HDBSCAN itself can also be used to build predictive models, we first try to predict anomalies of D1b using HDBSCAN which are trained on D1a in this section. The prediction results are shown in Figure 12. To facilitate visualization, the dataset was projected onto the PC1-PC2 plane. There are 599 anomalous observations, obviously more than the normal observations. The remaining 134 normal observations are used for single point positioning to verify the effect of anomaly detection. models, we first try to predict anomalies of D1b using HDBSCAN which are trained on D1a in this section. The prediction results are shown in Figure 12. To facilitate visualization, the dataset was projected onto the PC1-PC2 plane. There are 599 anomalous observations, obviously more than the normal observations. The remaining 134 normal observations are used for single point positioning to verify the effect of anomaly detection.  Table 5 shows how the above two methods change the positioning results. It can be seen that although RAIM FDE retains most of the epochs, the positioning accuracy does not increase but decreases because the anomalous observations are not effectively eliminated. HDBSCAN improves the plane positioning accuracy to within 1 m, and the two new epochs after excluding anomalous observation meet the Chi-square test. Since the old epochs of D1b do not conform to the Chi-square test, it contains considerable anomalous observations. After identification and elimination of them, the observations that can be used for position calculation will be greatly reduced, so only a small number of epochs are retained. In D1b, although 79 valid epochs contain 134 normal observations, only two epochs with more than five satellites are involved in the solution. This problem will be alleviated once more constellations are introduced. In addition to the HDBSCAN prediction, we also used some typical supervised classifiers, such as radial basis function (RBF) kernel SVM, decision tree, random forest, adaptive boosting (AdaBoost) and multi-layer perceptron (MLP), to detect anomalous observations of D1b for verification of clustering results. Before the training, we classified Cluster 1 and Cluster 2 in D1a into one category, and anomalous observations into another, forming a binary classifier. The positioning results after classification are listed in Table 6. It can be seen that the positioning results of RBF SVM, decision tree, and MLP are close to that of HDBSCAN. As the availability increases, the positioning accuracy decreases, which indicates that more anomalous observations may be retained in the observations.  Table 5 shows how the above two methods change the positioning results. It can be seen that although RAIM FDE retains most of the epochs, the positioning accuracy does not increase but decreases because the anomalous observations are not effectively eliminated. HDBSCAN improves the plane positioning accuracy to within 1 m, and the two new epochs after excluding anomalous observation meet the Chi-square test. Since the old epochs of D1b do not conform to the Chi-square test, it contains considerable anomalous observations. After identification and elimination of them, the observations that can be used for position calculation will be greatly reduced, so only a small number of epochs are retained. In D1b, although 79 valid epochs contain 134 normal observations, only two epochs with more than five satellites are involved in the solution. This problem will be alleviated once more constellations are introduced. In addition to the HDBSCAN prediction, we also used some typical supervised classifiers, such as radial basis function (RBF) kernel SVM, decision tree, random forest, adaptive boosting (AdaBoost) and multi-layer perceptron (MLP), to detect anomalous observations of D1b for verification of clustering results. Before the training, we classified Cluster 1 and Cluster 2 in D1a into one category, and anomalous observations into another, forming a binary classifier. The positioning results after classification are listed in Table 6. It can be seen that the positioning results of RBF SVM, decision tree, and MLP are close to that of HDBSCAN. As the availability increases, the positioning accuracy decreases, which indicates that more anomalous observations may be retained in the observations. Therefore, for D1b, both the direct prediction based on HDBSCAN and the classifier based on supervised learning can effectively detect anomalous observations. In addition, to deal with the imbalance of positive and negative samples, an over-sampling method called synthetic minority over-sampling technique (SMOTE) [45] was used to increase the number of anomaly samples. However, the desired results were not achieved. Similarly, we also set the elevation mask and C/N 0 mask for this part of data to compare with the proposed method. The positioning results are shown in Table 7. Obviously, due to the poor quality of the data, the positioning results improve only when the C/N 0 mask is set to a large value. One epoch with higher accuracy is retained, which is similar to the results of machine learning methods. However, it is still difficult to determine the optimal elevation and C/N 0 mask, which must be consistent with D1a. In practical applications, the cut-off values of the two parts of data are set uniformly. Therefore, it is generally advisable to abandon the recovery of these epochs, so as not to encumber the positioning result of D1a. Combined with the two parts of observations, it is difficult to identify the anomalies by setting the elevation and C/N 0 mask in the complex urban vehicle-mounted environment, which has little effect on the improvement of the positioning results. Finally, the overall effect of different anomaly detection and exclusion methods on D1 is listed in Table 8. The best localization performance is achieved by using HDBSCAN-based anomaly detection and exclusion method. Compared with the original positioning results, HDBSCAN improves the accuracy by 87.0%, 45.9%, and 69.6% in the east, north, and up directions, respectively. This is in line with the driving path of D1, and it is clear that more anomalous observations are coming from the cross-street (east) direction. Therefore, after anomaly exclusion, the positioning accuracy in the east direction improves the most. Besides, the availability remains at a high level of 92.9% (of course, it depends on the severity of the observation environment), and all the 1657 epochs meet further Chi-square tests. In complex urban vehicular environments, RAIM and cut-off values cannot effectively improve the positioning accuracy but may cause the accuracy to deteriorate.

Predicted Results Based on Supervised Classification
The detection results of anomalies on D1a provide a priori knowledge for the supervised classifier in this section. After training the classifier, the predicted results of the new observations can be used for real-time positioning. There are many classification methods based on supervised learning, including complicated deep neural networks with excellent performance. We did not intend to study the network model in depth, because that is not the point of our article. Several typical lightweight classifiers were used to verify the effectiveness and feasibility of the hybrid learning method in GNSS observation anomaly detection. The parameters of each model are only a preliminarily set.
Tables 9 and 10 list the positioning results of D2 and D3 using different anomaly detection and exclusion methods, respectively. The figures in parentheses represent the number of epochs that do not conform to the Chi-square test. Overall, the RBF SVM classifier has the best effect on the improvement of location results. In D2, the positioning accuracy is improved by 48.4%, 39.6%, and 49.6% in the three directions. The availability is 75.6% because of the harsher environment than D1. While in D3, the positioning accuracy is improved by 45.7%, 63.3%, and 49.1%, and the availability remains at 87.8%. The positioning accuracy improvement of D3 is slightly greater than that of D2, which may be caused by the overlap between the observed trajectories of D3 and D1. Moreover, the positioning accuracy improvement of MLP for D3 is comparable to SVM, which indicates that SVM is not necessarily the only suitable classifier. Better results are expected through more refined parameter tuning. It can be seen that after the anomaly detection and exclusion with the proposed method, the epochs composed of the remaining observations basically conform to the Chi-square test, which also shows the reliability of the algorithm from another perspective.   Figure 13 shows the predicted anomaly results of D2 and D3 on the PC1-PC2 plane using RBF SVM. The subgraph of D2 is zoomed in because some anomalies deviate too far from the normal. After processing by the classifier, normal observations are clustered together, while anomalies are scattered everywhere, which conforms to the assumption from the first category of clustering-based techniques in [25]. Intuitively, the results of the classification are also credible. Nevertheless, the incomplete feature values from the training set and the less elaborate classification model will result in a certain amount of false and missed detections. In addition, the number of satellites and DOP values will drop after excluding anomalous observations, so there may be a small number of relatively large errors in the positioning results, which does not affect the improvement of the overall positioning performance. Remote Sens. 2020, 12 In general, the classifier performs better on the training set than on the test set. Although satisfactory anomaly detection results of the above test sets are obtained by using the hybrid learning rule, the generalization ability of the classification model can still be improved through the following points. Firstly, longer observations from more scenarios should be collected to increase the number of samples in the training set and avoid overfitting. Secondly, it is necessary to extract better and more descriptive feature values, which are not limited to RINEX-level measurements. Finally, the best performance is achieved by selecting a more appropriate classifier and fully tuning the parameters of the model.

The Necessity of Chi-Square Test Separation
As mentioned above, HDBSCAN-based anomaly detection makes the implicit assumption that anomaly points in the sample set account for a small proportion. Therefore, before the clustering algorithm started to run, the Chi-square test is used to separate the offline dataset, where observations in epochs that meet the Chi-square test are considered as the raw training set. To verify the significance of this step, we also used HDBSCAN directly on the whole dataset without Chisquare test separation for the comparative experiment. The parameter determination method is the same as described in Section 3.2.1. Unfortunately, with more anomalies in the dataset, parameter tuning becomes a challenge. When the availability of the dataset without separation is less than 1657 epochs, its positioning error still exceeds the dataset with Chi-square test separation. Additionally, it has five epochs that do not meet the second Chi-square test, as shown in Table 11. During the parameter tuning process, as more "outliers" are detected, the positioning accuracy becomes higher, but the availability will drop sharply, which is not desirable. Through the comparative experiment, it can be found that excessive outliers affect the performance of HDBSCAN anomaly detection and may cause a high false alarm rate, that is, some normal observations are considered as anomalies. Of course, there are many methods of preliminary screening for the dataset, among which the Chisquare test is only a representative one. Besides, the different values of also affect the screening results. However, when clustering-based anomaly detection is performed, reducing the ratio of anomalous observations is a critical step. In general, the classifier performs better on the training set than on the test set. Although satisfactory anomaly detection results of the above test sets are obtained by using the hybrid learning rule, the generalization ability of the classification model can still be improved through the following points. Firstly, longer observations from more scenarios should be collected to increase the number of samples in the training set and avoid overfitting. Secondly, it is necessary to extract better and more descriptive feature values, which are not limited to RINEX-level measurements. Finally, the best performance is achieved by selecting a more appropriate classifier and fully tuning the parameters of the model.

The Necessity of Chi-Square Test Separation
As mentioned above, HDBSCAN-based anomaly detection makes the implicit assumption that anomaly points in the sample set account for a small proportion. Therefore, before the clustering algorithm started to run, the Chi-square test is used to separate the offline dataset, where observations in epochs that meet the Chi-square test are considered as the raw training set. To verify the significance of this step, we also used HDBSCAN directly on the whole dataset without Chi-square test separation for the comparative experiment. The parameter determination method is the same as described in Section 3.2.1. Unfortunately, with more anomalies in the dataset, parameter tuning becomes a challenge. When the availability of the dataset without separation is less than 1657 epochs, its positioning error still exceeds the dataset with Chi-square test separation. Additionally, it has five epochs that do not meet the second Chi-square test, as shown in Table 11. During the parameter tuning process, as more "outliers" are detected, the positioning accuracy becomes higher, but the availability will drop sharply, which is not desirable. Through the comparative experiment, it can be found that excessive outliers affect the performance of HDBSCAN anomaly detection and may cause a high false alarm rate, that is, some normal observations are considered as anomalies. Of course, there are many methods of preliminary screening for the dataset, among which the Chi-square test is only a representative one. Besides, the different values of α also affect the screening results. However, when clustering-based anomaly detection is performed, reducing the ratio of anomalous observations is a critical step.

The Balance between Accuracy and Availability
Due to the lack of true and accurate anomaly labels, we can only verify the effectiveness of the anomaly detection algorithm and supervised classifier by the accuracy of the positioning results after excluding anomalous observations, while also taking into account the availability. However, directly deleting anomalous observations reduces the number of available satellites and weakens the geometric distribution, which also affects the positioning performance to a certain extent. For dual-system GNSS positioning, when the number of satellites is less than five, the positioning process cannot be performed. While correct anomaly detection and exclusion can greatly improve positioning results, it must be acknowledged that this improvement is limited by the above disadvantages in areas where only a few satellites are available. Some epochs cannot output the position due to the too few satellites available, and some epochs can only obtain the suboptimal accuracy due to the poor geometric distribution. Suggestions about determining whether to include, exclude, or downweight multipath or NLOS observations within the navigation solution are given in [19]. In addition, the measurement information from external sensors can also be used to enhance the availability and positioning accuracy of GNSS observations. Although no forthcoming well-developed methods or techniques are available for autonomous integrity monitoring based on standalone GNSS receivers, the integrity monitoring approaches without any additional sensors are considered more promising and attractive because they can reduce the complexity and cost of the on-board equipment [42]. It is certain that when there are sufficient GNSS constellations available, the room for optimizing signal selection will be larger. For D1, a total of 127 epochs have to be abandoned due to fewer than five satellites.

Performance of RAIM with Different False Alarm Rates
In traditional RAIM, Chi-square test based on WSSE is a commonly used method. However, it can only determine whether the measured values are consistent or not, and cannot pick out which observations are anomalous. The epochs that do not meet the Chi-square test are directly discarded, which causes a huge waste of observation information. The positioning error at the epoch conforming to the Chi-square test is still likely to be large because the anomalies cannot be detected completely. In addition, how to properly determine α according to the severity of the environment is also a difficult problem. A large value of α indicates strict inspection conditions, but it also means a large false alarm rate and low credibility. To make our argumentation more complete, a batch of positioning experiments by setting different α values are conducted, and the critical values are obtained by looking up the Chi-square distribution table. The positioning results are listed in Table 12. It can be seen that as the value of α increases, the positioning accuracy becomes higher. However, again, when the availability of RAIM is less than 1657 epochs (the confidence is only 0.3), its plane positioning error is still larger than the proposed method. RAIM improves the overall positioning accuracy by discarding the low-precision positioning results it considers, but for other position solutions, it does not improve the accuracy but maintains their original state. On the other hand, as described in Section 3.1, the overall observation environment of D1 is not very severe, resulting in high availability of RAIM. Although the two schemes are not very comparable (one for epochs and one for observations), we still have a reason to believe that when the available constellations increase or the environment is worse, the superiority of the proposed method will be better demonstrated. As can be seen from the results of the last row in the table, RAIM FDE almost completely fails in urban vehicular environments, regardless of the false alarm rate, while RAIM alone cannot recover abandoned epochs. The authors have thoroughly discussed the limitations of classic RAIM in urban environments in their review article [42], including the difficulty in establishing error statistical models, the reduction of available observations, the existence of a large number of NLOS, and the criteria for integrity risk. Besides, the existing RAIM algorithm in the urban environment needs to be further improved.

Conclusions and Future Work
With the continuous development of ITS and autonomous driving, vehicles require significantly improved accuracy and reliability in terms of communication, time, and position awareness. However, the deterioration of GNSS observation quality caused by the complexity of the urban environment has become one major challenge for reliable positioning, navigation, and timing (PNT) technology. This paper proposed an anomaly detection frame for urban vehicle GNSS observations, consisting of an offline learning system and an online learning system. In the offline system, HDBSCAN clustering is used to detect anomalous observations and construct the labelled offline training set without the need for 3D building models. On this basis, a supervised binary classifier in the online system acquires the classification rule by training the labelled datasets, which are used for anomaly detection of vehicle GNSS observations in real time. The algorithm was validated with measured GPS/BeiDou single frequency data collected by different types of receivers. HDBSCAN-based anomalous observation detection and exclusion improve the original SPP accuracy of D1 by 87.0%, 45.9%, and 69.6% in the east, north, and up directions, respectively. After using the unrefined RBF SVM classifier to detect and exclude anomalies on D2/D3, the positioning accuracy is improved by 48.4%/45.7%, 39.6%/63.3%, and 49.6%/49.1% in the three directions. Besides, the article gave a lot of comparative experiments, including RAIM (FDE), elevation angle and C/N 0 cut-off values, different classification methods, etc. At the same time, some key issues in the practical application of the proposed method were discussed in depth. As the results show, this scheme can greatly improve the positioning accuracy in the urban vehicular environment and has good retention of the availability of observations.
As an exploratory work, the main contribution of this research is to propose a clustering-based anomaly detection method for urban vehicle GNSS observations and demonstrate its feasibility in detail. In the follow-up work, we will further improve it in the following aspects. Firstly, more GNSS observations will be used to establish the offline labelled dataset, so that the training set can cover more scenarios, and increase the generalization ability of the classification model. Secondly, it is necessary to conduct in-depth research on the feature parameters for the types of constellations due to the differences between GNSS constellations. Finally, more suitable anomalous observation detection methods will be sought based on different assumptions for anomaly distribution.
Author Contributions: Conceptualization, Y.X. and W.G.; methodology, Y.X.; formal analysis, Y.X. and Q.Z.; data curation, F.Y.; writing-original draft preparation, Y.X.; writing-review and editing, X.M. and X.Z.; supervision, S.P. and X.M. All authors have read and agreed to the published version of the manuscript.