1. Introduction
Global Navigation Satellite Systems (GNSS) have undergone rapid development in recent decades. With the successive establishment of global systems such as GPS, BDS, GLONASS and Galileo, and regional systems such as QZSS and IRNSS, GNSS-based applications have been widely implemented. Bike-sharing is a typical example. Owing to its convenience for short trips and for last-mile connectivity to public transportation, bike-sharing has expanded to over 300 cities in mainland China and gained widespread popularity, especially in megacities [1]. However, this rapid growth has brought a series of problems. Over-deployment by operators, improper parking by users and poor scheduling by public management have caused widespread disorder of shared bikes, which has become a pressing issue for citizens [2].
A practical solution to this problem is to use electronic fence technology to regulate users' parking behavior: the bike-sharing service cannot be deactivated unless the bike is parked within a virtual "parking zone". Unfortunately, shared bikes often need to be parked in complex environments, where GNSS signals can easily be blocked by street trees or affected by strong reflections from buildings on one or both sides [3]. The degraded signal quality reduces positioning precision, which limits the performance of electronic fences. To address this issue, adjusting the threshold of the electronic fence according to the positioning precision is a feasible approach. However, positioning precision varies significantly under different obstruction conditions [4], which makes it important to recognize the obstruction scene before adjusting. On the other hand, improving the precision of GNSS positioning in complex urban environments is also considered an effective approach, and several methods have been proposed in recent years, including anti-multipath techniques [5,6], array signal processing [7,8], high-sensitivity tracking algorithms [9,10] and others [11]. However, all these scene-adaptive methods require rapid and accurate scene recognition, or they may introduce unexpected errors when scenes are mismatched.
To recognize positioning scenes, researchers have proposed a series of methods, which fall into two main technical approaches: multi-sensor fusion and GNSS signal-based methods. In multi-sensor fusion, cameras, LiDAR [12,13,14] and inertial sensors such as accelerometers and gyroscopes are used to detect the environmental features surrounding GNSS receivers and thereby recognize contexts [15,16]. Although accurate context recognition can be achieved by multi-sensor fusion, the high computational and device costs make it economically unappealing for bike-sharing operators. In GNSS signal-based methods, different observation data and features constructed from them are selected to recognize GNSS positioning contexts. In the field of indoor/outdoor detection, signal strength (C/N0) and the number of visible satellites are widely used as classification features [17,18,19]. Recently, more and more research has focused on finer-grained context segmentation and on the performance of machine learning in this field. Lai et al. [20] used a support vector machine (SVM) to divide the environment into open outdoor, occluded outdoor and indoor, reaching a recognition accuracy of 90.3%. Dai et al. [21] converted GNSS observation data into sky plots and compared the performance of CNN and Conv-LSTM in recognizing open outdoor, semi-outdoor, shallow indoor and deep indoor scenes. The results show that the CNN reaches an accuracy of 98.82% while the Conv-LSTM reaches 99.2%. Zhu et al. [22] statistically analyzed 196 variables related to visible satellites, satellite distribution, signal strength and multipath effects to identify the most important features. They then used the selected features to recognize contexts and compared the performance of eight machine learning models. The results show that LSTM reaches an accuracy of 95.37% in vehicle-mounted dynamic context awareness (open area, urban canyon, boulevard, under viaduct and tunnel).
The above scene recognition methods show that excellent results can be achieved using GNSS observations alone, without multi-sensor support, which demonstrates that GNSS-based methods are suitable for shared-bike applications. However, several issues remain unsolved:
- (a) Many studies only consider large-scale scene categories. For urban static positioning tasks such as shared-bike parking, more fine-grained scenes should be taken into consideration.
- (b) Research on machine learning for GNSS scene recognition still remains at the stage of directly applying existing neural networks or constructing feature inputs to fit the requirements of those networks. Such space-transfer strategies may degrade the original data features. Making full use of observation data in scene recognition remains a challenging problem.
- (c) Multi-system observations and complex deep learning networks are used to improve recognition accuracy. However, multi-system observations and heavy computation may not be available on the low-cost receivers and microprocessors equipped on public devices such as shared bikes. Moreover, there are relatively few studies on transfer learning in GNSS scene recognition. It is crucial to develop methods that can be quickly transferred to different time periods.
Therefore, summarizing the shortcomings of existing work and considering the characteristics of urban static positioning (shared-bike parking) scenes, we propose a deep-learning-based scene recognition method. Our contributions can be summarized as follows:
- (a) A more detailed set of scenes suitable for urban static positioning is proposed, including open area, shade of tree, high urban canyon, low urban canyon and unilateral urban canyon. A dataset of 15,000 epochs (3000 s for each scene) is collected for this research.
- (b) A spatio-temporally correlated method for constructing raw-data features is used to analyze the importance of different observation data in recognition based on machine learning. A multi-channel Long Short-Term Memory (MC-LSTM) network for GNSS scene recognition is proposed. The results show that our method achieves an accuracy of 99.14% using low-cost GNSS receivers observing a single satellite navigation system.
- (c) To manage the performance degradation between different time periods, we conduct a transfer learning test of our model. The results show that the pre-trained model can be fine-tuned with a small number of epochs to adapt to different time periods at the same location, which is cost-acceptable for bike-sharing operators.
The remainder of this article is organized as follows: Section 2 introduces the methodology of our proposed model. Section 3 presents the experiments and an analysis of the results. Section 5 provides our conclusions and directions for future work.
2. Methodology
In this section, we first present the feature analysis and construction. Then we propose a deep-learning-based scene recognition model. An overview of our model is shown in Figure 1. The observation data are first constructed into different feature vectors. These feature vectors are then fed into a multi-channel long short-term memory (MC-LSTM) network to identify five scenes (open area, high urban canyon, unilateral urban canyon, shade of tree, low urban canyon).
2.1. Feature Analysis and Construction
In this subsection, we first present the observation data and the information hidden in them. Then we introduce the satellite elevation and azimuth angles, which describe the geometric relationship between navigation satellites and receivers. Finally, we propose the feature vector definition for our model's input.
2.1.1. GNSS Observations
Global Navigation Satellite Systems (GNSS) typically consist of three segments: the space segment (the navigation constellation and its satellites), the ground control segment (master control station, monitoring stations and uploading stations) and the user terminals. Users receive carrier signals transmitted by the satellites through various terminals (receivers). The carrier signals are modulated with ranging codes and navigation messages, which can be processed by receivers into different observation data, including the pseudo range [23], carrier phase [24], Doppler frequency [25] and C/N0 [26].
Pseudo range: The pseudo range observation is an absolute measurement of the distance from the satellite to the receiver, obtained by measuring the signal propagation time delay, and is expressed in meters. The pseudo range includes atmospheric delays and clock biases, and its accuracy is typically at the meter level.
Carrier Phase: Carrier phase observations measure the phase difference between the satellite carrier signal and the reference carrier signal produced by the receiver's oscillator, making it a relative observation. It is expressed in cycles and includes atmospheric delays, clock biases and the integer-cycle ambiguity.
Doppler frequency: The frequency of the GNSS carrier signal received by the receiver differs from the frequency actually transmitted by the satellite. This difference is known as the Doppler shift, and its magnitude is related to the rate of change of the distance between the receiver and the satellite. Specifically, the Doppler observation equation for GNSS is as follows:

$$\lambda D_r^s = \dot{\rho}_r^s + c\left(\delta\dot{t}_r - \delta\dot{t}^s\right) - \dot{I}_r^s + \dot{T}_r^s + \varepsilon_D$$

where $D_r^s$ is the Doppler frequency, expressed in Hz; $\lambda$ is the wavelength of the carrier wave; $\dot{\rho}_r^s$ is the rate of change of the distance between the receiver (subscript $r$ denotes the receiver) and the satellite (superscript $s$ denotes the satellite); $c$ is the speed of light; $\delta\dot{t}_r$ and $\delta\dot{t}^s$ are the clock drifts of the receiver and the satellite; $\dot{I}_r^s$ and $\dot{T}_r^s$ are the time derivatives of the ionospheric and tropospheric delays; and $\varepsilon_D$ is the measurement noise.
C/N0: The carrier-to-noise density ratio (C/N0) is the ratio of the received carrier power to the noise power per unit bandwidth in the received GNSS signal. It is expressed in dB-Hz and reflects the quality of the RF signal received by the receiver.
We then consider the characteristics of the above observations in urban static positioning scenes. When there is no signal occlusion, the distance between the receiver and a satellite changes in a predictable manner, since the receiver is stationary and the satellite moves in a known orbit. In different obstruction scenes, the nature of the obstructions and the characteristics of multipath reflections cause the measured receiver-satellite distances to change at varying rates. Therefore, we believe that the temporal characteristics of the observations can serve as a basis for scene recognition.
All these observation data can be read directly from Receiver Independent Exchange Format (RINEX) files or from the receiver's output data stream without additional computation. This is favorable for our application scene, which only supports edge computing power. To unify the data scale, we normalize the raw observations as follows:

$$\tilde{d} = \frac{d - d_{\min}}{d_{\max} - d_{\min}}$$

where $d_{\max}$ and $d_{\min}$ are the maximum and minimum values of a specific type of observation data within a time step, and $d$ is the original observation value. The normalization requires only simple arithmetic operations but helps ensure network convergence [27,28].
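As a concrete illustration, the sketch below implements this per-window min-max scaling in Python; the function name and the (epochs × satellites) array layout with NaN placeholders for unobserved satellites are our assumptions, not details from the paper.

```python
import numpy as np

def minmax_normalize(window: np.ndarray) -> np.ndarray:
    """Min-max normalize one type of observation over a time-step window.

    Implements d_norm = (d - d_min) / (d_max - d_min). `window` is assumed
    to be shaped (epochs, satellites), with NaN marking satellites that are
    not observed at an epoch (our convention, not the paper's).
    """
    d_min = np.nanmin(window)
    d_max = np.nanmax(window)
    if d_max == d_min:  # constant window: avoid division by zero
        return np.zeros_like(window)
    return (window - d_min) / (d_max - d_min)
```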
2.1.2. Satellite Elevation and Azimuth
The satellite elevation angle is the angle between the vector from the user's location to the satellite and its projection onto the tangent plane of the Earth ellipsoid passing through the user's location. The satellite azimuth angle is the angle between this projection and the true-north axis on the tangent plane, with the counterclockwise direction considered positive. Both angles depend on the position of the user's receiver [29]. They reflect certain aspects of satellite signal quality, ranging accuracy and multipath effects. As shown in Figure 2, visible satellites exhibit different geometric distributions under different obstruction scenes. The satellite elevation and azimuth angles are calculated as follows [30]:

$$\begin{bmatrix} e \\ n \\ u \end{bmatrix} = H \begin{bmatrix} X^s - X_r \\ Y^s - Y_r \\ Z^s - Z_r \end{bmatrix}, \qquad H = \begin{bmatrix} -\sin\lambda & \cos\lambda & 0 \\ -\sin\varphi\cos\lambda & -\sin\varphi\sin\lambda & \cos\varphi \\ \cos\varphi\cos\lambda & \cos\varphi\sin\lambda & \sin\varphi \end{bmatrix}$$

where $(e, n, u)$ represents the satellite position in the local topocentric coordinate system (ENU); $(X^s, Y^s, Z^s)$ and $(X_r, Y_r, Z_r)$ represent the satellite position (calculated from the satellite ephemeris) and the station position in the Earth-Centered Earth-Fixed coordinate system (ECEF), respectively; $H$ is the transformation matrix between the ENU and ECEF systems; and $\varphi$ and $\lambda$ are the geodetic latitude and longitude of the receiver. The elevation ($el$) and azimuth ($az$) angles are then calculated as follows:

$$el = \arcsin\!\left(\frac{u}{\sqrt{e^2 + n^2 + u^2}}\right), \qquad az = \arctan\!\left(\frac{e}{n}\right)$$
From the physical definitions of the satellite elevation and azimuth angles, both have upper and lower bounds: the elevation angle ranges from 0 to 90 degrees, and the azimuth angle ranges from −180 degrees to 180 degrees. Similarly to the normalization method used for the raw observations, we normalize the satellite elevation and azimuth angles as follows:

$$\widetilde{el} = \frac{el}{90}, \qquad \widetilde{az} = \frac{az + 180}{360}$$

where the elevation and azimuth angles are both expressed in degrees.
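For readers who want to reproduce the geometry, a minimal Python sketch of the ENU transform, the angle formulas and the fixed-bound normalization above might look as follows; the function and variable names are ours.

```python
import numpy as np

def elevation_azimuth(sat_ecef, rcv_ecef, lat_deg, lon_deg):
    """Satellite elevation/azimuth (degrees) from ECEF positions."""
    phi, lam = np.radians(lat_deg), np.radians(lon_deg)
    # ECEF -> ENU rotation matrix H for a receiver at latitude phi, longitude lam
    H = np.array([
        [-np.sin(lam),                np.cos(lam),               0.0],
        [-np.sin(phi) * np.cos(lam), -np.sin(phi) * np.sin(lam), np.cos(phi)],
        [ np.cos(phi) * np.cos(lam),  np.cos(phi) * np.sin(lam), np.sin(phi)],
    ])
    e, n, u = H @ (np.asarray(sat_ecef) - np.asarray(rcv_ecef))
    el = np.degrees(np.arcsin(u / np.sqrt(e**2 + n**2 + u**2)))
    az = np.degrees(np.arctan2(e, n))  # in (-180, 180], 0 = true north
    return el, az

def normalize_angles(el, az):
    """Fixed-bound scaling of both angles to [0, 1]."""
    return el / 90.0, (az + 180.0) / 360.0
```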
2.1.3. Feature Vector Definition
The feature vector serves as the input of the recognition model. In this subsection, we introduce our feature vector definition, which considers both the original observation data and the geometric relationship between satellites and receivers. We use fixed-length sequential vectors to represent the constellation of a specific satellite navigation system. For example, a 32-dimensional vector is used to store the normalized pseudo range observations of GPS, because the pseudo-random noise (PRN) codes of its satellites range from 1 to 32 (G01∼G32).
Additionally, we regard the satellite elevation and azimuth angles as a combined feature, which represents the geometric distribution of the satellites the receiver can use. Instead of converting visible satellites into sky plots as in Figure 2, we use a 2N-dimensional vector to store the normalized angles (where N is the maximum PRN of the navigation satellite system). The different dimensions of the observation data form the basis of our multi-channel model, which is introduced in detail in Section 2.2. Our feature definitions are summarized in Table 1. It is worth noting that these vectors of different dimensions can be used both individually and in combination as the input of the multi-channel model.
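A small sketch of how such PRN-indexed vectors could be assembled for one epoch is shown below; zero-filling the entries of invisible satellites and the layout of the 2N vector (elevations first, azimuths second) are our assumptions for illustration.

```python
import numpy as np

def obs_feature_vector(obs_by_prn: dict, n_prn: int = 32) -> np.ndarray:
    """Fixed-length, PRN-indexed vector of normalized observations (one epoch)."""
    vec = np.zeros(n_prn)
    for prn, value in obs_by_prn.items():  # e.g. {3: 0.42, 17: 0.88} for G03, G17
        vec[prn - 1] = value
    return vec

def angle_feature_vector(el_by_prn: dict, az_by_prn: dict, n_prn: int = 32) -> np.ndarray:
    """Combined az/el feature: a 2N-dimensional vector per epoch."""
    vec = np.zeros(2 * n_prn)
    for prn, el in el_by_prn.items():
        vec[prn - 1] = el            # normalized elevation
    for prn, az in az_by_prn.items():
        vec[n_prn + prn - 1] = az    # normalized azimuth
    return vec
```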
2.2. A Multi-Channel Model for Scene Recognition
In this subsection, we first introduce the Long Short-Term Memory (LSTM) model, which serves as the fundamental building block of our model. After comparing the performance of different channels (different feature vectors), we propose our multi-channel model to integrate information from the different channels. Finally, we consider the transferability of the model and introduce the transfer learning strategy we use in the temporal dimension.
2.2.1. LSTM and Single-Channel Network
Long short-term memory (LSTM) networks are primarily used for learning and predicting temporal features in data sequences. In the field of GNSS positioning scene recognition, LSTM has proven to be the most effective model among numerous machine learning models [21,22,31]. An LSTM network is composed of LSTM cells (Figure 3), which determine whether information is useful. Each cell contains three gates: an input gate, a forget gate and an output gate. Additionally, a candidate memory cell is set in the LSTM cell to process the selected memory. Each cell outputs a hidden state $h_t$ and the memory $c_t$ passed to the next cell. The key expressions of the LSTM cell are as follows [32,33,34]:

$$\begin{aligned} i_t &= \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i) \\ f_t &= \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f) \\ o_t &= \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o) \\ \tilde{c}_t &= \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \\ c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \end{aligned}$$
where $\odot$ represents the element-wise (Hadamard) product and $\sigma$ represents the sigmoid activation function; $i_t$, $f_t$ and $o_t$ are the outputs of the input, forget and output gates; $W_{xi}$, $W_{hi}$; $W_{xf}$, $W_{hf}$; and $W_{xo}$, $W_{ho}$ are the weight matrices of the input, forget and output gates; $W_{xc}$, $W_{hc}$ are the weight matrices of the candidate memory cell; and $b_i$, $b_f$, $b_o$, $b_c$ are the biases. The hidden state $h_t$ is calculated as follows:

$$h_t = o_t \odot \tanh(c_t)$$
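To make the gate equations concrete, here is a minimal NumPy step of a single LSTM cell written directly from the expressions above; the dictionary-based weight naming is ours, for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell_step(x_t, h_prev, c_prev, W, b):
    """One LSTM cell step following the gate equations in the text."""
    i_t = sigmoid(W['xi'] @ x_t + W['hi'] @ h_prev + b['i'])      # input gate
    f_t = sigmoid(W['xf'] @ x_t + W['hf'] @ h_prev + b['f'])      # forget gate
    o_t = sigmoid(W['xo'] @ x_t + W['ho'] @ h_prev + b['o'])      # output gate
    c_tilde = np.tanh(W['xc'] @ x_t + W['hc'] @ h_prev + b['c'])  # candidate memory
    c_t = f_t * c_prev + i_t * c_tilde   # element-wise (Hadamard) products
    h_t = o_t * np.tanh(c_t)             # hidden state
    return h_t, c_t
```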
To convert our feature vectors into LSTM inputs, the original feature vector sequences are segmented with a sliding window: data vectors within a certain time step are combined into a sequence in chronological order, which becomes the input of a single channel. Taking the pseudo range feature as an example, if an N-dimensional vector represents the normalized pseudo range at time $t$, the input of a single LSTM channel has dimension $T \times N$, where $T$ is the time step of the sequence. An overview of our single-channel LSTM is shown in Figure 4.
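A sliding-window segmentation consistent with this description can be sketched as follows; the function signature is illustrative.

```python
import numpy as np

def to_sequences(features: np.ndarray, time_step: int = 10, stride: int = 1) -> np.ndarray:
    """Slice per-epoch feature vectors (shape (T_total, N)) into overlapping
    sequences of shape (num_windows, time_step, N) for an LSTM channel."""
    windows = [features[t:t + time_step]
               for t in range(0, len(features) - time_step + 1, stride)]
    return np.stack(windows)
```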
To ensure the real-time performance of the algorithm and meet practical application requirements, we use only three LSTM layers in a single channel, which processes a single type of feature vector. The time step is set to 10 s and the sliding-window stride to 1 s, so that 2990 sequences can be collected for each scene. As in most machine learning classification tasks, we use the cross-entropy loss function and optimize the network parameters with Adam. The network is trained on the training and validation datasets for approximately 500 epochs. The batch size is set to 32 and the learning rate is set to .
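Under these settings, a single channel could be sketched in PyTorch as below; the hidden size and the use of the last layer's final hidden state for classification are our assumptions, since the paper does not specify them here.

```python
import torch
import torch.nn as nn

class SingleChannelLSTM(nn.Module):
    """One channel: three stacked LSTM layers followed by a classifier."""
    def __init__(self, feat_dim: int, n_classes: int = 5, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, num_layers=3, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):          # x: (batch, time_step, feat_dim)
        _, (h_n, _) = self.lstm(x)
        return self.fc(h_n[-1])    # classify from the last hidden state

model = SingleChannelLSTM(feat_dim=32)            # e.g. the 32-dim pseudo range channel
criterion = nn.CrossEntropyLoss()                 # cross-entropy loss, as in the text
optimizer = torch.optim.Adam(model.parameters())  # Adam; lr left at its default here
```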
2.2.2. Scene Categories and Single Channel LSTM Performances
GNSS positioning scenes are usually divided into two main categories: outdoor and indoor [17,18,19]. Within urban environments, urban canyons and boulevards are also considered in scene recognition [22,35]. Given the specificity of our application scene (static positioning for shared-bike parking), we subdivide urban canyons into high urban canyon, low urban canyon and unilateral urban canyon. We therefore divide the positioning scenes into five categories: open area, shade of tree, high urban canyon, low urban canyon and unilateral urban canyon. Because of the bike-parking rules, we no longer consider indoor scenes, which are easily identifiable from the presence or absence of satellite signals. Figure 5 shows the real-world locations and street views where we collected data, and Figure 2 shows the distribution of satellites in the various scenes.
In our proposed algorithm, we encode the scene categories as one-hot vectors, as shown in Table 2, so the index of the maximum element of the output vector indicates the most probable scene. For example, if the model outputs the vector [0.1, 0.1, 0.5, 0.2, 0.1], the predicted scene is the unilateral urban canyon (see the decoding sketch after this paragraph). Under the training settings described in the previous subsection, we trained and validated the single-channel models; their performance is shown in Figure 6.
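Decoding the model output then reduces to an argmax over the five class scores; the category ordering below follows the scene list in Section 2 and is assumed to match Table 2.

```python
SCENES = ["open area", "high urban canyon", "unilateral urban canyon",
          "shade of tree", "low urban canyon"]  # assumed Table 2 order

output = [0.1, 0.1, 0.5, 0.2, 0.1]
predicted = SCENES[output.index(max(output))]
print(predicted)  # -> "unilateral urban canyon"
```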
From the confusion matrices in Figure 6, we find that different features have varying sensitivities for recognizing different scenes. The az/el and C/N0 features perform best across all categories, while the other features perform well only in specific categories (pseudo range in shade of tree, carrier phase + LLI in high urban canyon). Meanwhile, the convergence speeds of the individual single-channel models also differ significantly, as shown in Figure 7. The az/el channel converges fastest, while the phase channel converges most slowly. Moreover, phase + LLI outperforms phase alone in both accuracy and convergence speed, which prompted us to combine the carrier phase and the loss-of-lock indicator (LLI) into one feature in the subsequent research.
2.2.3. MC-LSTM Design
In the previous subsection, we analyzed the performance of the different feature vectors. To improve the recognition accuracy and convergence speed of our model across all scenes, we integrate information from different feature vectors through a multi-channel parallel design. An overview of the model is shown in Figure 8. The model consists of channel layers and a fusion classification layer. Each feature vector is input into its corresponding LSTM channel; the hidden states output by the LSTMs are then passed through fully connected layers that merge them and classify the scene. It is important to note that dropout layers are added to help prevent overfitting and enhance the model's ability to generalize [36,37].
In addition, our proposed multi-channel model allows flexible combinations of channels according to the scene and requirements: if the receiver cannot output a certain type of observation, the corresponding channel can simply be excluded.
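A minimal PyTorch sketch of this multi-channel design, with per-feature LSTM channels and a dropout-regularized fusion classifier, is given below; the layer sizes and dropout rate are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class MCLSTM(nn.Module):
    """Multi-channel LSTM: one LSTM per feature vector, fused by FC layers."""
    def __init__(self, channel_dims, n_classes=5, hidden=64, p_drop=0.5):
        super().__init__()
        self.channels = nn.ModuleList([
            nn.LSTM(d, hidden, num_layers=3, batch_first=True)
            for d in channel_dims])
        self.classifier = nn.Sequential(
            nn.Dropout(p_drop),
            nn.Linear(hidden * len(channel_dims), hidden),
            nn.ReLU(),
            nn.Dropout(p_drop),
            nn.Linear(hidden, n_classes))

    def forward(self, inputs):  # one (batch, time_step, d_i) tensor per channel
        # Take each channel's final hidden state and concatenate for fusion
        states = [lstm(x)[1][0][-1] for lstm, x in zip(self.channels, inputs)]
        return self.classifier(torch.cat(states, dim=-1))

# Channels combine flexibly: e.g. pseudo range (32-dim) plus az/el (64-dim);
# a channel is simply omitted if the receiver cannot output that observation.
model = MCLSTM(channel_dims=[32, 64])
```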
2.2.4. Transfer Learning Settings
A trained model can achieve high accuracy on its training and validation datasets but usually does not perform as well on another dataset. Classification models therefore need to be retrained when the feature space or the feature distribution changes [38]. In our application scene of recognizing shared-bike parking locations, the positions are generally fixed, but the time when users park is random, which makes it exceptionally costly to collect training data from all time periods. Therefore, the faster we can execute transfer learning between time periods, the stronger our model's generalization is: a bike-sharing operator can quickly deploy the pre-trained model to a different time period with simple fine-tuning, significantly reducing the time required for model training. To test the temporal transfer performance of our model, we collected data from different time periods and performed full-layer transfer. The distribution of the data over time is shown in Table 3.
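As a rough illustration of such full-layer temporal transfer, the pre-trained weights can be loaded and all layers briefly retrained on new-period data; the checkpoint name, epoch count, learning rate and data loader below are placeholders, not the paper's settings.

```python
import torch

model.load_state_dict(torch.load("pretrained_mc_lstm.pt"))  # hypothetical checkpoint
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)   # small lr for fine-tuning
criterion = torch.nn.CrossEntropyLoss()

model.train()
for epoch in range(20):                          # a small number of fine-tuning epochs
    for x_batch, y_batch in new_period_loader:   # DataLoader over new-period data (assumed)
        optimizer.zero_grad()
        loss = criterion(model(x_batch), y_batch)
        loss.backward()
        optimizer.step()
```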