Advancing Map-Matching and Route Prediction: Challenges, Methods, and Unified Solutions

Waksmundzki, Tomasz; Niewiadomska-Szynkiewicz, Ewa; Granat, Janusz

doi:10.3390/electronics14183608


by Tomasz Waksmundzki, Ewa Niewiadomska-Szynkiewicz * and Janusz Granat

Institute of Control and Computation Engineering, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland

* Author to whom correspondence should be addressed.

Electronics 2025, 14(18), 3608; https://doi.org/10.3390/electronics14183608

Submission received: 3 August 2025 / Revised: 30 August 2025 / Accepted: 9 September 2025 / Published: 11 September 2025

(This article belongs to the Special Issue Big Data Analytics and Information Technology for Smart Cities and Citizen Wellbeing)


Abstract

Map-matching involves aligning raw positioning data with actual road networks. It is a complex process due to measurement inaccuracies, ambiguous street layouts, and sensor noise. The paper explores the challenges in map-matching and vehicle route prediction and presents an overview of existing methods and algorithms. The solutions employing hidden Markov models (HMMs), where emission and transition probabilities are crucial in correctly matching positions to roads, are examined and evaluated. Machine Learning (ML) offers robust algorithms capable of managing complex urban environments and varied data sources. While HMMs have demonstrated their efficacy in capturing sequential dependencies, more advanced ML techniques, including deep learning, provide enhanced capabilities for learning spatial and temporal relationships. They improve prediction accuracy and adapt to evolving traffic conditions and diverse vehicle behaviours. Special attention is paid to a holistic solution, assuming a combination of map-matching and route prediction within a unified framework. It fosters more efficient route planning, real-time traffic management, and overall decision-making in intelligent transportation systems.

Keywords: map-matching; route prediction; intelligent transport; machine learning; hidden Markov model

1. Introduction

Rapid advancements in geolocation technologies have facilitated access to vast datasets describing vehicle trajectories. These data, collected using a wide range of devices from specialised on-board vehicle systems to standard mobile devices, constitute a crucial resource for numerous applications in Intelligent Transport Systems (ITS), mobility planning, traffic management, route optimisation, and safety analyses. Despite their significant potential, raw GPS data are seldom suitable for direct use in advanced studies. This is due to their inaccuracies and limitations, which can lead to erroneous conclusions and inefficient solutions [1]. The inaccuracy of GPS devices can stem from numerous factors that affect the precision of the determined position. One of the key sources of error involves signal propagation phenomena, such as the multi-path effect, where signals reflect off obstacles, particularly in dense urban environments, leading to distortions [2]. Additionally, the signal’s passage through the ionosphere and troposphere can alter its velocity, thereby affecting the accuracy of distance measurements [3]. Errors within the receivers themselves, including inaccuracies in their internal clocks, are also significant [3]. Other sources of error include satellite orbital inaccuracies, where deviations in the predicted satellite orbits introduce errors into positioning calculations, and receiver noise, interference introduced by the GNSS receiver, which affects measurement accuracy. All these elements contribute to the overall uncertainty of GPS measurements, which is particularly problematic in applications requiring high precision.

This paper presents an overview of the methods currently used for map-matching and vehicle route prediction, focusing on techniques that utilise artificial intelligence algorithms. These processes form the technological backbone of modern navigation services that are used daily, such as Google Maps and Waze, as well as advanced autonomous vehicle guidance systems. The work provides a critical review of existing solutions, highlighting their limitations. Based on this review, we propose directions for novel research that could improve the effectiveness of this type of solution. The problem of map-matching involves aligning observed geolocation data with the correct segments of a digital road network to reconstruct the vehicle’s actual route [4]. In this process, in addition to basic coordinates, supplementary information such as azimuth [5] and speed [6] is often utilised, and its interpretation takes into account the geometric and topological structure of the road network. The problem of vehicle trajectory prediction involves estimating an object’s future route based on its historical path and surrounding conditions. The application of map-matching algorithms is crucial for improving the quality of the input data required for building precise predictive models, as the enhanced accuracy and reliability of historical trajectories provide a more robust foundation for their development.

This article is structured in the following way. First, Section 2 sets out the formal formulation of the map-matching and route prediction problems. Then, Section 3 presents a literature review, focusing on established map-matching algorithms. It outlines the development of key approaches and the current state of the art; furthermore, a classification of these approaches is proposed to systematise the knowledge in this field. Section 4 is devoted to the analysis of algorithms used in the field of route prediction. It discusses key approaches and techniques for forecasting the future position of vehicles. Section 5 discusses multimodal data fusion, an approach that integrates geolocation data with information from a wide range of sensors to obtain a significantly more accurate and reliable situational awareness. Finally, Section 6 provides a summary, synthesises the main findings, and identifies potential avenues for future research.

2. Problem Formulation

This paper defines two research problems: map-matching and route prediction. Their formalisation requires introducing a set of core concepts that provide the foundation for the detailed definitions of both problems. Table 1 provides the variables, parameters, and their descriptions used in the article.

Definition 1. An observation $O^{t_{k}}$ represents the set of parameters recorded for a given vehicle at time instant $t_{k}$. It is a tuple defined as $O^{t_{k}} = (\mathit{lat}_{k}, \mathit{lon}_{k}, v_{k}, \phi_{k})$, where $\mathit{lat}_{k}$ denotes the latitude, $\mathit{lon}_{k}$ the longitude, $v_{k}$ the instantaneous speed, and $\phi_{k}$ the azimuth.

Definition 2. A vehicle trajectory $T$ is a sequence of observations $O^{t_{1}}, O^{t_{2}}, \dots, O^{t_{k}}$, where each observation $O^{t_{i}}$ (for $i = 1, \dots, k$) represents a set of recorded parameters at a discrete time instant $t_{i}$. In the context of real-time processing systems, a trajectory is defined as a potentially infinite stream of observations $(O^{t_{j}})_{j \geq 1}$, continuously populated with new data.

Definition 3. A road network is represented as a directed graph $G = (V, S)$, where $V$ is the set of vertices (e.g., intersections, junctions, segment boundary points), and $S$ is the set of edges corresponding to road segments.
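Definitions 1 to 3 map naturally onto simple data structures. The following is a minimal sketch of one possible representation; the class and field names (`Observation`, `Segment`, `RoadNetwork`) and the units noted in the comments are illustrative assumptions, not part of the formal definitions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Observation:
    """One observation O^{t_k} (Definition 1)."""
    t: float        # time instant t_k (seconds assumed)
    lat: float      # latitude (degrees assumed)
    lon: float      # longitude (degrees assumed)
    speed: float    # instantaneous speed v_k
    azimuth: float  # azimuth phi_k (degrees assumed)


@dataclass(frozen=True)
class Segment:
    """A directed edge s in S, running from vertex `start` to vertex `end`."""
    seg_id: str
    start: str
    end: str


@dataclass
class RoadNetwork:
    """Directed graph G = (V, S) (Definition 3)."""
    vertices: set = field(default_factory=set)
    segments: dict = field(default_factory=dict)  # seg_id -> Segment

    def add_segment(self, seg: Segment) -> None:
        # Adding an edge also registers its two end vertices.
        self.vertices.update((seg.start, seg.end))
        self.segments[seg.seg_id] = seg


# A trajectory T (Definition 2) is an ordered list of observations.
trajectory = [
    Observation(t=0.0, lat=52.2200, lon=21.0100, speed=8.0, azimuth=90.0),
    Observation(t=1.0, lat=52.2200, lon=21.0101, speed=8.2, azimuth=91.0),
]

network = RoadNetwork()
network.add_segment(Segment("s1", start="A", end="B"))
network.add_segment(Segment("s2", start="B", end="C"))
```

For the streaming case in Definition 2, the list could be replaced by a generator or queue that yields observations as they arrive.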

2.1. Problem Definition for Map-Matching

The map-matching problem can be formulated as follows: for a given road network $G = (V, S)$ and a vehicle trajectory $T$, the objective is to determine the sequence of segments $R = \langle s_{1}, s_{2}, \dots, s_{M} \rangle$, where $s_{i} \in S$ for each $i = 1, \dots, M$, which represents the most probable mapping of the trajectory $T$ onto the actual route within the road network. Furthermore, the sequence $R$ must satisfy the continuity constraint, defined as $\forall i \in \{1, \dots, M - 1\} : v_{end}(s_{i}) = v_{start}(s_{i+1})$, where $v_{end}(s_{i})$ is the end vertex of segment $s_{i}$, and $v_{start}(s_{i+1})$ is the start vertex of segment $s_{i+1}$. Figure 1 provides a graphical representation of the map-matching problem.
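The continuity constraint above can be checked mechanically. The sketch below assumes each segment is stored with its start and end vertex; the `Segment` tuple and the function name `is_continuous` are hypothetical helpers introduced only for illustration.

```python
from typing import NamedTuple, Sequence


class Segment(NamedTuple):
    seg_id: str
    start: str  # v_start(s)
    end: str    # v_end(s)


def is_continuous(route: Sequence[Segment]) -> bool:
    """Return True if v_end(s_i) == v_start(s_{i+1}) for every consecutive pair."""
    return all(a.end == b.start for a, b in zip(route, route[1:]))


good = [Segment("s1", "A", "B"), Segment("s2", "B", "C")]
bad = [Segment("s1", "A", "B"), Segment("s3", "C", "D")]
print(is_continuous(good), is_continuous(bad))  # True False
```

The same check applies unchanged to the concatenation of historical and predicted routes in the route prediction problem.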

2.2. Problem Definition for Route Prediction

The vehicle route prediction problem can be defined as follows: given a historical vehicle trajectory $T_{hist} = \langle O^{t_{1}}, O^{t_{2}}, \dots, O^{t_{k}} \rangle$, the objective is to determine a future sequence of observations $T_{pred} = \langle O^{t_{k+1}}, O^{t_{k+2}}, \dots, O^{t_{k+N}} \rangle$, which describes the most probable vehicle movement within a specific time horizon $N$. When incorporating information from the road network, represented as a graph $G = (V, S)$, vehicle route prediction involves determining the most probable future sequence of road segments $R_{pred} = \langle s_{n+1}, s_{n+2}, \dots, s_{n+p} \rangle$ based on the historical route $R_{hist} = \langle s_{1}, s_{2}, \dots, s_{n} \rangle$, where each segment $s_{j} \in S$. The sequence $R_{pred}$ represents the vehicle’s predicted route on the network within a specified horizon (e.g., temporal or until $n + p$ segments are reached). Furthermore, the sequences $R_{hist}$ and $R_{pred}$ must satisfy the continuity constraint, defined as $\forall j \in \{1, \dots, n + p - 1\} : v_{end}(s_{j}) = v_{start}(s_{j+1})$, where $v_{end}(s_{j})$ is the end vertex of segment $s_{j}$, and $v_{start}(s_{j+1})$ is the start vertex of segment $s_{j+1}$. Figure 2 provides a graphical representation of the route prediction problem.

3. Map-Matching Algorithms

The literature presents a multitude of algorithms for solving the map-matching problem. This section discusses the main approaches, beginning with how they are classified and then using the notation and problem definition introduced in Section 2.

3.1. Classification of Map-Matching Algorithms

The literature proposes various approaches to classifying map-matching methods, reflecting the specific nature of their applications and data processing characteristics. One of the fundamental divisions is the classification into online and offline algorithms [6]. Offline algorithms operate on pre-collected and integrated geolocation data, processing an entire trajectory over a given time horizon. In contrast, online algorithms determine solutions in real time by sequentially analysing incoming GPS data. In the paper [7], offline algorithms are termed global, while online algorithms are referred to as incremental.

The authors of [8] proposed dividing map-matching algorithms into four main categories: geometric, topological, probabilistic, and advanced. This classification has become a cornerstone for numerous subsequent studies, offering a clear framework for analysing the diverse approaches used to match GPS trajectories to a road network. The review in [9] highlights the evolution of this classification, particularly concerning the probabilistic and advanced categories, which in many scientific papers have been merged and are collectively treated as advanced methods. The authors of [10] note that most methods developed over the last decade can be classified as advanced due to their interdisciplinary nature and integration of various concepts. This underscores the trend towards hybrid algorithms that combine multiple techniques to maximise precision and efficiency and to enhance the methods’ versatility.

The category of geometric map-matching algorithms includes methods based on geometric information, such as the distances between GPS trajectory points and road network elements (i.e., road segments or vertices). Decisions are typically made based on the shortest point-to-point distance, point-to-line distance, or the minimum curve-to-curve distance. Topological algorithms strive to maintain the continuity of the trajectory’s mapping within the road network and focus on analysing the similarity between the shape of the trajectory and the actual layout of the road network. A drawback of both these approaches is their high sensitivity to measurement errors [6]. Probabilistic methods model trajectory uncertainty in order to determine the most probable travel route. The category of advanced methods includes more complex algorithms that utilise various techniques to solve the map-matching problem with greater precision.

The authors of [11] draw attention to the limitations of existing methods for classifying map-matching algorithms. They argue that geometric methods are losing relevance due to their poor performance. Furthermore, application-based classifications fail to adequately differentiate between methods, while a division based on the mathematical tools employed is impractical, as many algorithms combine various techniques. In response, they propose a new classification method that distinguishes four classes: similarity model, state-transition model, candidate-evolving model, and scoring model. The similarity model matches a trajectory to the geometrically or topologically nearest vertices or edges of the road network, focusing on the definition of proximity. The state-transition model uses a weighted topological graph to determine an optimal global path, where vertices represent states and edges represent the transitions between them. The candidate-evolving model manages a set of hypotheses that evolve as new data arrives, discarding irrelevant ones to find the segment with the most votes. Finally, the scoring model assigns scores to candidates based on a scoring function that considers features such as proximity, predicted location, reachability, and turn intention, enabling lane-level trajectory matching.

Within this paper, the authors propose a modified classification approach for map-matching methods based on computational paradigms. This approach categorises algorithms according to the computational strategy adopted to solve the map-matching problem. It focuses on how algorithms process data and make decisions, rather than exclusively on the type of data used or the specific mathematical tools employed. The authors distinguish three groups of methods:

Spatio-Temporal Constraint Approach,
Inference Model-Based Approach,
Data-Driven/Learning Approach.

3.2. Spatio-Temporal Constraint Approach

Spatio-temporal constraint analysis is a fundamental strategy used in map-matching problems. At its core, this approach defines the problem as finding a vehicle’s path based on a predefined set of explicit rules—namely, temporal, geometric, and topological constraints. An algorithm of this type is deterministic, and its operation is defined by precise spatio-temporal criteria. The computational process involves operations such as checking distance and shape conditions, comparing spatial attributes, analysing the time between measurements, and verifying the validity of path connections based on the road network graph.

These algorithms represent a historically early approach that identifies the correct route by directly verifying its compliance with a predefined set of explicit rules. The primary geometric techniques include the following [12]:

Point-to-point matching—this method is sensitive to how the map was digitised,
Point-to-curve matching—although this approach considers the distance to a segment, it ignores the historical context and can consequently be unstable,
Curve-to-curve matching—this method involves comparing the shape of trajectory fragments with the road network and is susceptible to outliers.
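As an illustration of the second technique in the list above, the sketch below matches a single point to the nearest road segment by orthogonal projection. It assumes planar coordinates in metres (for example, after projecting latitude and longitude to a local frame); the function names and the two example segments are hypothetical.

```python
import math


def project_to_segment(p, a, b):
    """Closest point to p on the segment a-b, and its distance (planar coordinates)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        # Parameter of the orthogonal projection, clamped to stay on the segment.
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    qx, qy = ax + t * dx, ay + t * dy
    return (qx, qy), math.hypot(px - qx, py - qy)


def point_to_curve_match(p, segments):
    """Return the id of the segment with the smallest point-to-segment distance."""
    return min(segments, key=lambda sid: project_to_segment(p, *segments[sid])[1])


# Hypothetical segments in a local planar frame (metres).
segments = {"main_st": ((0, 0), (100, 0)), "side_st": ((50, 0), (50, 80))}
print(point_to_curve_match((48, 30), segments))  # side_st
```

Because every point is matched independently of its predecessors, a noisy point near a junction can jump between roads, which is the instability noted in the list.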

A significant modification is the use of topological constraints, which exploit the connection structure of the road network to narrow the search space. The authors of [12,13] indicate that such an approach significantly improves the results. This is confirmed, for instance, by Greenfield’s analyses of weighted topological algorithms [14], as presented in [13]. In this approach, basic temporal constraints are typically implemented implicitly through the sequential processing of points. More advanced variants may incorporate features such as driving direction or speed [13]. A drawback of these methods is their sensitivity to noise in the GPS data and to inaccuracies in the digital map. This can result in low effectiveness in complex urban environments, particularly at intersections.

Nevertheless, methods based on spatio-temporal constraints form the foundation for more advanced map-matching techniques.

3.3. Inference Model-Based Approach

Methods that utilise inference models rely on estimating the most probable or credible trajectory when input data are inherently noisy and uncertain. Probabilistic mathematical models explicitly represent and systematically manage this uncertainty within the matching process. The computational process begins with evaluating compatibility measures between observed positional data and hypothetical states within the adopted model. It then proceeds with the dynamic updating of credibility assessments for the alternative paths under consideration as new information becomes available. It concludes with identifying a globally optimal or most coherent sequence of matches based on aggregated evaluations. This process is realised by applying dedicated inference rules and optimisation algorithms specific to the modelling formalism. The operating principle of this approach is illustrated in Figure 3.

3.3.1. Particle Filters

Particle Filters (PF) are sequential Monte Carlo methods that use the recursive calculation of relevant probability distributions through importance sampling and the approximation of these distributions using discrete random measures [15]. The PF constitutes an effective method for solving problems of a non-linear nature, as exemplified by the map-matching problem.

In the study by [16], a probabilistic PF-based approach was applied to the map-matching problem. The system is modelled as a first-order Markov chain, in which the filter estimates the hidden states (the actual vehicle positions) based on noisy GPS data. Each particle represents a hypothesis of the vehicle’s state and is defined as a point on the road network, comprising a road segment identifier, the distance along it, and the direction of travel. The process is initialised by sampling particles from a Gaussian distribution around the first GPS point, constrained to the road network. The transition model utilises the distance between consecutive GPS points with added Gaussian noise, whilst the likelihood function (particle weights), also based on a Gaussian distribution, favours particles closer to the GPS measurement. The algorithm generates a set of possible routes and their assigned probability values, enabling an assessment of the matching certainty. Tests demonstrated the algorithm’s high accuracy for data with a high sampling frequency and its sensitivity to low-quality data.
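The predict, weight, and resample cycle described above can be sketched as follows. This is a deliberately simplified illustration, not the implementation from [16]: it assumes a single straight road, so that a particle reduces to a distance along that road, and the noise parameters `sigma_move` and `sigma_gps` are arbitrary assumed values.

```python
import math
import random


def gaussian_pdf(x, sigma):
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def pf_step(particles, moved, gps_pos, rng, sigma_move=2.0, sigma_gps=5.0):
    """One predict-weight-resample cycle for particles on a single straight road.

    particles: distances along the road (metres)
    moved:     distance between consecutive GPS points (metres)
    gps_pos:   GPS measurement expressed as a distance along the road (metres)
    """
    # Transition: advance each particle by the observed distance plus Gaussian noise.
    predicted = [p + moved + rng.gauss(0.0, sigma_move) for p in particles]
    # Likelihood: particles closer to the GPS measurement receive larger weights.
    weights = [gaussian_pdf(p - gps_pos, sigma_gps) for p in predicted]
    total = sum(weights)
    if total == 0.0:
        weights = [1.0 / len(predicted)] * len(predicted)  # fall back to uniform
    else:
        weights = [w / total for w in weights]
    # Resampling: draw a new particle set in proportion to the weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, weights


rng = random.Random(0)
# Initialisation: sample particles from a Gaussian around the first GPS point.
particles = [rng.gauss(0.0, 5.0) for _ in range(500)]
for moved, gps in [(10.0, 11.0), (10.0, 19.5), (10.0, 30.5)]:
    particles, weights = pf_step(particles, moved, gps, rng)
estimate = sum(particles) / len(particles)  # mean position after three steps
```

On a real network each particle would also carry a segment identifier and direction, and the transition would follow the graph at junctions.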

The algorithm presented in [17] utilises Sampling Importance Resampling (SIR), a numerical approximation of the Bayes filter, to estimate the vehicle’s state. A set of weighted particles represents this state, each constituting a hypothesis concerning the vehicle’s location. In addition to GPS data, this algorithm incorporates digital map information. The particle weights are modified according to two key map attributes: the conformity of the vehicle’s speed with the road’s functional class and the route’s topological consistency. Such an approach steers the estimation process towards more probable hypotheses, significantly enhancing the map-matching accuracy.

The authors of [18] also employ the PF as a numerical implementation of the Bayes filter to estimate the vehicle’s position. In their approach, the vehicle’s state is modelled as a two-dimensional vector of North–East coordinates $x_{k} = \begin{bmatrix} P_{k}^{N} & P_{k}^{E} \end{bmatrix}^{T}$. The described algorithm focuses on a dual-mode particle transition model, depending on whether the vehicle moves straight or turns. When driving straight, the particles follow the road geometry from the map, which reduces the cross-track error. During a turn, the particles move according to the Dead Reckoning (DR) equations:

$$\begin{bmatrix} P_{k+1}^{N} \\ P_{k+1}^{E} \end{bmatrix} = \begin{bmatrix} P_{k}^{N} \\ P_{k}^{E} \end{bmatrix} + V_{k} T_{k} \begin{bmatrix} \cos\psi_{k} \\ \sin\psi_{k} \end{bmatrix}, \tag{1}$$

where $V_{k}$ is the vehicle’s velocity at time instant $k$, $T_{k}$ is the time interval between step $k$ and $k + 1$, and $\psi_{k}$ is the heading angle from North at time instant $k$. Applying this method during turns allows for the effective reduction of the along-track error. Particle weights are updated based on a likelihood function, which considers the distance from the measured position and the difference in the vehicle’s heading.
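Equation (1) translates directly into code. The sketch below is a minimal illustration; the function name is hypothetical, and the heading is assumed to be supplied in radians.

```python
import math


def dead_reckoning_step(p_north, p_east, speed, dt, heading_rad):
    """Advance a North-East position by one dead-reckoning step, as in Equation (1).

    heading_rad is the heading angle measured from North, in radians.
    """
    return (
        p_north + speed * dt * math.cos(heading_rad),
        p_east + speed * dt * math.sin(heading_rad),
    )


# 10 m/s for 2 s, heading due East (90 degrees from North): the vehicle
# moves about 20 m East and essentially 0 m North.
north, east = dead_reckoning_step(0.0, 0.0, 10.0, 2.0, math.radians(90.0))
```

Because the North component uses the cosine and the East component the sine, a heading of zero moves the vehicle purely northwards.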

3.3.2. Fuzzy Logic

Fuzzy Logic (FL) constitutes an extension of classical logic, replacing its binary {true, false} system with a membership function that takes values from the continuous interval [0, 1] [19]. Its capacity for the formal representation and processing of uncertainty and ambiguity makes it applicable to solving problems in map-matching.

The analysed works follow a similar procedural pattern built around a Fuzzy Inference System (FIS). This system processes input data, such as heading error or distance from the road, through a set of rules to generate a likelihood score for candidate roads. Despite these similarities, the algorithms differ significantly in their details. In the paper [20], the credibility assessment was based on three criteria:

Heading consistency—the vehicle is more likely to be on a road whose direction is consistent with its current heading,
Proximity—the vehicle’s true position, despite positioning errors, lies near the GPS measurement. Consequently, the road closest to the measured point is considered the most probable,
Shape similarity—the more closely the road’s shape resembles the vehicle’s trajectory, the greater the likelihood that this road constitutes the best match.

The authors utilised DR to smooth the vehicle’s trajectory for the shape similarity rule. Other authors, such as those in [21,22], use more complex multi-stage approaches. In the study by [22], the process is as follows:

Fuzzification: the transformation of numerical data into imprecise linguistic categories known as fuzzy sets (e.g., small, large, low, high), using the following inputs:
– First stage (road identification): vehicle speed, heading error, perpendicular distance from the road, and the Horizontal Dilution of Precision (HDOP) factor,
– Second stage (tracking): speed, heading increment, and gyroscope reading,
Inference: the application of rules that combine the fuzzy input data to reach a conclusion,
Defuzzification: converting the linguistic conclusions from the rule base into specific numerical values, representing the likelihood that a given road segment matches the current input data.

The algorithm also utilises data on the road network topology and the vehicle’s historical trajectory to improve accuracy, particularly at intersections. An even more elaborate model is presented in [21], which delineates three distinct processes in which fuzzy logic is crucial for decision-making:

Initialisation—the fuzzy logic system evaluates candidate roads based on rules and traffic data (speed, heading error) to identify the initial correct segment,
Tracking along the segment—a separate set of fuzzy rules dynamically assesses whether the vehicle is continuing on the same road by analysing changes in its movement, e.g., heading increment or gyroscope data,
Matching at an intersection—at intersections, fuzzy logic assists in selecting a new segment, and its rules are enriched with topological criteria, such as connectivity with the previous road.

The algorithms also differ in the final stage of determining the vehicle’s position. The paper [20] applies a simple perpendicular projection onto the best-matched road, whereas [21,22] utilise optimal estimation. Furthermore, the authors of [21] indicate that, based on the tests conducted, their solution is characterised by high accuracy in various road networks with varying degrees of complexity.
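The fuzzification, inference, and defuzzification stages described in this subsection can be illustrated with a very small inference system. The sketch below is a generic example and does not reproduce the rule bases of [20,21,22]: the two inputs, the membership breakpoints, the four rules, and their output levels are all assumptions chosen for readability.

```python
def falling(x, full, zero):
    """Membership that equals 1 up to `full` and falls linearly to 0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def road_likelihood(heading_error_deg, distance_m):
    """Likelihood score in [0, 1] for a candidate road (illustrative rule base)."""
    # Fuzzification: degrees of membership in "small" and "large".
    he_small = falling(abs(heading_error_deg), 10.0, 45.0)
    d_small = falling(distance_m, 5.0, 30.0)
    he_large, d_large = 1.0 - he_small, 1.0 - d_small

    # Inference: each rule is (firing strength, output level); AND is taken as min.
    rules = [
        (min(he_small, d_small), 1.0),  # aligned and close    -> high
        (min(he_small, d_large), 0.5),  # aligned but far      -> medium
        (min(he_large, d_small), 0.3),  # close but misaligned -> fairly low
        (min(he_large, d_large), 0.0),  # misaligned and far   -> low
    ]
    # Defuzzification: weighted average of the rule output levels.
    total = sum(strength for strength, _ in rules)
    return sum(strength * level for strength, level in rules) / total


print(road_likelihood(0.0, 0.0), road_likelihood(90.0, 100.0))  # 1.0 0.0
```

A weighted-average defuzzifier was chosen here because it keeps the example short; the cited works may use different membership shapes and defuzzification schemes.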

3.3.3. Dempster–Shafer Theory

Dempster–Shafer Theory (DST) is a generalisation of Bayesian theory [23]. In this approach, the degree of belief is represented by a belief function, which assigns values to entire sets of possibilities rather than to single events. This allows for more flexible modelling of uncertainty and the explicit representation of ignorance, which is one of the key distinctions from the classical probabilistic approach.

The authors of [24] proposed a map-matching method that combines DST with an approach based on representing measurement uncertainty in geometric areas, rather than as single values. The algorithm integrates data from a GPS system and DR sensors. This approach assigns a degree of belief to each representation of the vehicle’s location following DST. The theory is utilised to fuse evidence from various sources, such as map topology and geometric similarity, to dynamically assess the likelihood of each possible road. The final selection of the most probable road is made using a DST decision rule, which permits the effective management of multiple hypotheses in ambiguous situations, such as at intersections.

The study by [25] presents an advanced topological map-matching algorithm that aims to enhance accuracy in dense road networks. The method utilises real-world GPS data, including geographical coordinates, speed, and heading. The algorithm bases its assessment on four key factors: the road speed limit, direction of motion, the proximity of the GPS point to the road, and spatial correlation. DST is applied to dynamically estimate the weight of each of these factors and to calculate an aggregated probability for all candidate points, thereby increasing the system’s flexibility. The process commences with data cleaning; subsequently, following the fusion of DST results, an optimisation method based on trajectory shape matching is used to filter out incorrect matches further. Compared to other algorithms described in the literature, the developed method provides higher performance and accuracy.
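The evidence fusion step that both studies rely on is Dempster’s rule of combination. The sketch below shows the rule for two evidence sources and two candidate roads; the mass values and the source names (`m_proximity`, `m_heading`) are hypothetical and are not taken from [24] or [25].

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to its mass.
    """
    combined = {}
    conflict = 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + ma * mb
        else:
            conflict += ma * mb  # mass falling on contradictory pairs
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Normalise by the non-conflicting mass.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


R1, R2 = frozenset({"road_1"}), frozenset({"road_2"})
EITHER = R1 | R2  # explicit ignorance: "one of the two roads"

# Hypothetical evidence from proximity and from heading.
m_proximity = {R1: 0.6, R2: 0.1, EITHER: 0.3}
m_heading = {R1: 0.5, R2: 0.2, EITHER: 0.3}
fused = combine(m_proximity, m_heading)
```

In this example the conflicting mass is 0.17, so the fused belief in `road_1` is 0.63 / 0.83, roughly 0.76, while some mass remains on the undecided set. That residual mass is how the theory represents ignorance explicitly.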

3.3.4. Hidden Markov Model

One of the most popular techniques for solving the map-matching problem is the Hidden Markov Model (HMM), in which observations are generated by an underlying hidden Markov process. The first algorithm to use an HMM for this purpose was described in the paper [26]. Its fundamental objective is to estimate the most probable sequence of hidden states that generated a given sequence of observations. The overall process, which combines emission and transition probabilities using the Viterbi algorithm, is illustrated in Figure 4.

The two fundamental component functions of the model are the emission probability and the transition probability. The emission probability $P_{E}(O^{t_{k}}, s_{i})$ defines the likelihood that an observation $O^{t_{k}}$ corresponds to the actual position of the vehicle on a given road segment $s_{i}$.

$$P_{E}(O^{t_{k}}, s_{i}) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(D_{s_{i}}^{k})^{2}}{2\sigma^{2}}\right), \tag{2}$$

where $s_{i}$ is the $i$-th segment in the road network, $D_{s_{i}}^{k}$ is the orthogonal distance between the observation at time step $t_{k}$ and the segment $s_{i}$, and $\sigma$ denotes the standard deviation of the GPS measurements. The second component, the transition probability $P_{T}(O^{t_{k}}, s_{j}, O^{t_{k+1}}, s_{l})$, defines the likelihood of transitioning from road segment $s_{j}$, associated with observation $O^{t_{k}}$, to road segment $s_{l}$, associated with observation $O^{t_{k+1}}$. This probability is modelled using an exponential distribution.

$$P_{T}(O^{t_{k}}, s_{j}, O^{t_{k+1}}, s_{l}) = \frac{1}{\lambda} \exp\left(-\frac{\left|D^{k} - D_{G(s_{j}, s_{l})}^{k}\right|}{\lambda}\right), \tag{3}$$

where $\lambda$ is the scale parameter of the exponential distribution, $D^{k}$ is the Euclidean distance between the consecutive observations $O^{t_{k}}$ and $O^{t_{k+1}}$, and $D_{G(s_{j}, s_{l})}^{k}$ is the distance along the road network between the orthogonal projections of the consecutive observations ($O^{t_{k}}$ and $O^{t_{k+1}}$) onto their associated segments ($s_{j}$, $s_{l}$). Figure 5 presents the geometric relationships utilised to determine the emission and transition probabilities.
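Equations (2) and (3) can be written as two short functions. The sketch below assumes the distances have already been computed in metres; the function and parameter names, and the example values of $\sigma$ and $\lambda$, are illustrative.

```python
import math


def emission_probability(dist_to_segment, sigma):
    """Equation (2): zero-mean Gaussian density in the orthogonal distance D."""
    return math.exp(-dist_to_segment ** 2 / (2.0 * sigma ** 2)) / (
        math.sqrt(2.0 * math.pi) * sigma
    )


def transition_probability(euclid_dist, network_dist, lam):
    """Equation (3): exponential density in |D^k - D_G^k|."""
    return math.exp(-abs(euclid_dist - network_dist) / lam) / lam


# A point 5 m from the road scores higher than one 20 m away (sigma = 5 m assumed).
near = emission_probability(5.0, sigma=5.0)
far = emission_probability(20.0, sigma=5.0)

# A transition whose network distance matches the straight-line distance is
# preferred over one that requires a long detour (lambda = 10 m assumed).
direct = transition_probability(100.0, 105.0, lam=10.0)
detour = transition_probability(100.0, 400.0, lam=10.0)
```

Both functions return density values rather than normalised probabilities, which is sufficient when candidates are only compared with one another.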

Based on the determined emission and transition probabilities, it is possible to estimate the most probable sequence of road segments $R$ (the sequence of hidden states) corresponding to the sequence of recorded observations. For this purpose, the Viterbi algorithm can be applied [27]. The algorithm consists of two stages. In the first stage, the maximum cumulative probabilities $\delta_{k}(O^{t_{k}})$ are calculated for all possible states at each time step $k$, as given in Equation (4), to determine the optimal transitions between states. On this basis, in the second stage, the most probable sequence of states is reconstructed backwards, which guarantees finding the path that maximises the global evaluation metric $S_{HMM}$ given in Equation (5).

$$\delta_{k}(O^{t_{k}}) = \max\left[\delta_{k-1}(O^{t_{k-1}}) \cdot P_{E}(O^{t_{k}}, s_{i}) \cdot P_{T}(O^{t_{k}}, s_{j}, O^{t_{k+1}}, s_{l})\right], \tag{4}$$

$$S_{HMM} = \sum_{k=1}^{N-1} P_{E}(O^{t_{k}}, s_{i}) \cdot P_{T}(O^{t_{k}}, s_{j}, O^{t_{k+1}}, s_{l}). \tag{5}$$
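The two stages of the Viterbi algorithm can be sketched as below. The sketch follows the textbook form of the recursion: it works in log space, indexes the cumulative score by candidate segment, and pairs the transition into step $k$ with the emission at step $k$, so its indexing may differ from Equation (4). The callables `log_emit` and `log_trans` and the toy probabilities are assumptions made for the example.

```python
import math


def viterbi(n_steps, candidates, log_emit, log_trans):
    """Most probable state sequence, computed in log space.

    candidates[k]      -> candidate segments for observation k
    log_emit(k, s)     -> log emission probability of segment s at step k
    log_trans(k, a, b) -> log probability of moving from a (step k-1) to b (step k)
    """
    delta = [{s: log_emit(0, s) for s in candidates[0]}]
    back = [{}]
    # Stage 1: forward pass, storing the best predecessor of every state.
    for k in range(1, n_steps):
        delta.append({})
        back.append({})
        for b in candidates[k]:
            best = max(candidates[k - 1],
                       key=lambda a: delta[k - 1][a] + log_trans(k, a, b))
            delta[k][b] = delta[k - 1][best] + log_trans(k, best, b) + log_emit(k, b)
            back[k][b] = best
    # Stage 2: backtrack from the best final state.
    state = max(delta[-1], key=delta[-1].get)
    path = [state]
    for k in range(n_steps - 1, 0, -1):
        state = back[k][state]
        path.append(state)
    return path[::-1]


# Toy example: two candidate roads per observation; switching roads is unlikely.
emit = [{"A": 0.8, "B": 0.2}, {"A": 0.4, "B": 0.6}, {"A": 0.9, "B": 0.1}]
path = viterbi(
    n_steps=3,
    candidates=[["A", "B"]] * 3,
    log_emit=lambda k, s: math.log(emit[k][s]),
    log_trans=lambda k, a, b: math.log(0.9 if a == b else 0.1),
)
print(path)  # ['A', 'A', 'A']
```

Although the second observation on its own favours road B, the transition term keeps the path on road A, which is the smoothing effect that motivates using an HMM. Working with logarithms avoids numerical underflow on long trajectories.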

A range of variants of map-matching algorithms employing HMM can be found in the literature, aimed at improving both operational quality and performance. The authors of the paper [28] propose an extension to the classic HMM matching algorithm that incorporates the vehicle’s movement trend.

$$S_{\mathit{TrendHMM}} = S_{HMM} + \sum_{k=1}^{N-1} W_{\mathit{trend}}(O^{t_{k}}), \tag{6}$$

where $W_{\mathit{trend}}(O^{t_{k}})$ is the weight accounting for the movement trend, based on the observation at time step $k$.

In the paper [7], a more sophisticated approach to determining the emission probability was proposed. By introducing a penalty function for exceeding the speed limit, the authors assumed that drivers would not significantly surpass the permitted speed.

$$P_{E}(O^{t_{k}}, s_{i}) = \frac{v_{s_{i}}}{\max(0, v_{k} - v_{s_{i}}) + v_{s_{i}}} \cdot \frac{1}{w_{s_{i}}} \int_{-w_{s_{i}}/2}^{w_{s_{i}}/2} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(l - D_{s_{i}}^{k})^{2}}{2\sigma^{2}}\right) dl, \tag{7}$$

where $w_{s_{i}}$ denotes the width of segment $s_{i}$, $v_{s_{i}}$ is the permitted speed for segment $s_{i}$, and $v_{k}$ is the vehicle’s speed at time step $t_{k}$. The definition of the transition probability was also modified to account for the change in the vehicle’s momentum between segments associated with consecutive observations $O^{t_{k}}$ and $O^{t_{k+1}}$.

To determine the most probable path, the Viterbi algorithm was employed in an online mode [29], making step-by-step decisions within the Markov chain without knowledge of future input data. A key element of this method is the Variable Sliding Window (VSW), the size of which adapts dynamically to the structure of the state space. The window expands while new observations are processed and narrows when a convergence point is found, that is, a point in the Markov chain where all current most probable paths converge, guaranteeing that any future path will contain the same sub-path. This approach ensures high accuracy. However, if the maximum window length is unbounded, the output may be delayed. The Bounded Variable Sliding Window (BVSW) method addresses this issue [7]: it reduces delays but may lead to sub-optimal results. Despite these limitations, the solution eliminates one of the main problems of the classic HMM method for map-matching: the need to have the entire trajectory available before the algorithm is applied.
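The integral in Equation (7) has a closed form in terms of the standard normal cumulative distribution function, which makes the emission probability cheap to evaluate. The sketch below is one way to compute it; the function names are hypothetical and the numerical values are assumed for illustration only.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def emission_with_width_and_speed(dist, sigma, width, speed, speed_limit):
    """Equation (7): speed penalty times the Gaussian averaged over the road width."""
    # Equals 1 at or below the limit and decays as the excess speed grows.
    speed_factor = speed_limit / (max(0.0, speed - speed_limit) + speed_limit)
    # Integral of the Gaussian centred at `dist` over l in [-width/2, width/2].
    mass = normal_cdf((width / 2.0 - dist) / sigma) - normal_cdf(
        (-width / 2.0 - dist) / sigma
    )
    return speed_factor * mass / width


# Same geometry, once at the limit and once at twice the limit (assumed values).
at_limit = emission_with_width_and_speed(3.0, 5.0, 7.0, speed=50.0, speed_limit=50.0)
speeding = emission_with_width_and_speed(3.0, 5.0, 7.0, speed=100.0, speed_limit=50.0)
```

Driving at twice the permitted speed halves the emission probability, since the speed factor becomes $v_{s_{i}} / (2 v_{s_{i}})$. The penalty is therefore soft: speeding makes a segment less likely without ruling it out.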

In the article [5], a modified formula (8) is proposed for determining the emission probability. This formula considers the distance of the observation $O^{t_{k}}$ from the segment $s_{i}$ and the difference in their azimuths.

P_{E}(O^{t_{k}}, s_{i}) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(D_{s_{i}}^{k})^{2}}{2\sigma^{2}}\right) \cdot |\cos(\phi_{k} - \phi_{s_{i}})| \cdot H(v_{k} - v_{s_{i}}),

(8)

where $\phi_{k}$ is the instantaneous azimuth of the observation at time step $t_{k}$, and $\phi_{s_{i}}$ is the azimuth of segment $s_{i}$. Furthermore, speed exceedance is accounted for using the Heaviside step function $H(x)$. As used in (8), this factor takes the value 0 when the vehicle’s instantaneous speed $v_{k}$ at time step $t_{k}$ exceeds the permitted speed $v_{s_{i}}$ for the road segment and the value 1 otherwise, so any candidate segment on which the vehicle would be speeding receives zero emission probability.
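A minimal Python sketch of the three factors in (8) is given below. It follows the verbal description above (the speed factor is 0 when the limit is exceeded and 1 otherwise); the function and parameter names are hypothetical and azimuths are assumed to be expressed in degrees.

```python
import math


def emission_probability_heading(dist_m: float, obs_azimuth_deg: float,
                                 seg_azimuth_deg: float, speed: float,
                                 speed_limit: float, sigma_m: float) -> float:
    """Distance, heading and speed factors of Eq. (8); names are illustrative."""
    # Gaussian term on the observation-to-segment distance.
    gaussian = math.exp(-dist_m ** 2 / (2.0 * sigma_m ** 2)) / (
        math.sqrt(2.0 * math.pi) * sigma_m)
    # Heading agreement: 1 for parallel directions, 0 for perpendicular ones.
    heading = abs(math.cos(math.radians(obs_azimuth_deg - seg_azimuth_deg)))
    # Step factor as described in the text: 0 when the limit is exceeded.
    speed_gate = 0.0 if speed > speed_limit else 1.0
    return gaussian * heading * speed_gate
```

Because of the absolute value, a segment traversed in the opposite direction scores the same as one traversed along its azimuth, so one-way restrictions would have to be handled elsewhere in the model.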

An additional component $P_{\mathrm{TIME}}$ was introduced to calculate the transition probability. This component is responsible for decreasing the transition probability when the travel time between observations would necessitate exceeding the permitted speed. Furthermore, a $P_{\mathrm{CHOICE}}$ component was included, which models the driver’s preferences concerning path selection. In the proposed approach, the transition probability $P_{T}$ is calculated as follows:

\begin{aligned} P_{T}(O^{t_{k}}, s_{j}, O^{t_{k+1}}, s_{l}) = {} & \frac{1}{\lambda} \exp\left(-\frac{|D^{k} - D_{G(s_{j}, s_{l})}^{k}|}{\lambda}\right) \\ & \cdot P_{\mathrm{TIME}}(O^{t_{k}}, O^{t_{k+1}}) \cdot P_{\mathrm{CHOICE}}(O^{t_{k}}, O^{t_{k+1}}). \end{aligned}

(9)
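The product form of (9) can be sketched in Python as follows. The sketch assumes that $D^{k}$ is the straight-line distance between consecutive observations and that $D_{G(s_{j}, s_{l})}^{k}$ is the corresponding distance along the road graph. Since the definitions of $P_{\mathrm{TIME}}$ and $P_{\mathrm{CHOICE}}$ are not reproduced here, they are passed in as precomputed values through the hypothetical parameters `p_time` and `p_choice`.

```python
import math


def transition_probability(obs_dist_m: float, route_dist_m: float, lam: float,
                           p_time: float = 1.0, p_choice: float = 1.0) -> float:
    """Exponential term of Eq. (9) multiplied by two externally supplied factors.

    p_time and p_choice are placeholders for P_TIME and P_CHOICE; how they are
    computed is outside the scope of this sketch.
    """
    # Penalise candidate pairs whose network distance differs from the
    # distance between the two raw observations.
    base = math.exp(-abs(obs_dist_m - route_dist_m) / lam) / lam
    return base * p_time * p_choice
```

A candidate pair whose route length matches the distance between the observations attains the maximum of the exponential term, $1/\lambda$, and any detour lowers the value.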

The paper [6] presents an interesting approach to real-time GPS trajectory mapping. It utilises a probabilistic route prediction model instead of awaiting future GPS data, thereby eliminating delays. The authors model the emission probability using the formula:

P_{E}(O^{t_{k}}, s_{i}) = \frac{1}{l_{s_{i}}} \int_{0}^{l_{s_{i}}} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(D_{s_{i}}^{k}(x))^{2}}{2\sigma^{2}}\right) dx,

(10)

where $l_{s_{i}}$ is the length of road segment $s_{i}$, and $D_{s_{i}}^{k}(x)$ is the distance between observation $O^{t_{k}}$ and position $x$ on road segment $s_{i}$. This method accounts for the uncertainty regarding the vehicle’s precise position within the segment.
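The averaging in (10) can be approximated numerically. The sketch below is an illustration rather than the procedure of [6]: it assumes a straight segment given by two end points in planar coordinates (metres) and uses the midpoint rule, with the hypothetical parameter `steps` controlling the resolution.

```python
import math


def emission_probability_along_segment(obs, seg_start, seg_end,
                                       sigma_m: float, steps: int = 200) -> float:
    """Midpoint-rule approximation of Eq. (10) for a straight segment."""
    (ox, oy), (ax, ay), (bx, by) = obs, seg_start, seg_end
    if math.hypot(bx - ax, by - ay) == 0.0:
        raise ValueError("segment must have non-zero length")
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma_m)
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) / steps  # relative position x / l along the segment
        px, py = ax + t * (bx - ax), ay + t * (by - ay)
        d = math.hypot(ox - px, oy - py)  # D(x): observation to position x
        total += norm * math.exp(-d * d / (2.0 * sigma_m ** 2))
    # (1 / l) * sum(f(x_i) * l / steps) simplifies to the mean of the samples.
    return total / steps
```

Averaging over the whole segment means that a long segment passing close to the observation at only one point scores lower than a short segment that is close along its entire length.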

The transition probability is based on the vehicle’s speed, which is modelled using a normal distribution. The mean speed and variance are estimated using a Kalman filter [30]. An essential component of the transition probability is the likelihood of transitioning between individual road segments, determined from historical data according to the formula:

P_{s_{i} \to s_{j}} = \frac{N_{s_{i} \to s_{j}} + 1}{\sum_{m} N_{s_{i} \to s_{m}} + N_{k}},

(11)

where $N_{k}$ is the number of road segments connected to segment $s_{i}$, $N_{s_{i} \to s_{j}}$ is the number of transitions from segment $s_{i}$ to segment $s_{j}$, and $\sum_{m} N_{s_{i} \to s_{m}}$ is the sum of all transitions from segment $s_{i}$ to all its adjacent segments. The added constants (1 in the numerator and $N_{k}$ in the denominator) correspond to add-one (Laplace) smoothing, so a transition never observed in the historical data still receives a non-zero probability. Bayesian filtering was utilised for online map-matching to iteratively estimate the probability of a route based on current and previous observations [6]. The process involves alternating between a prediction step, which calculates the prior probability of a route, and a filtering step, which updates this probability based on the observed data.
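Formula (11) can be sketched directly in Python. The function name and the data layout (a list of successor identifiers and a dictionary of historical counts) are assumptions made for illustration.

```python
def smoothed_transition_probabilities(successors, counts):
    """Apply Eq. (11) to every segment connected to the current one.

    successors: identifiers of the N_k segments connected to s_i.
    counts: historical transition counts N_{s_i -> s_j}; missing keys mean 0.
    """
    n_k = len(successors)
    total = sum(counts.get(s, 0) for s in successors)
    return {s: (counts.get(s, 0) + 1) / (total + n_k) for s in successors}


# Usage: three connected segments, one of which was never observed.
probabilities = smoothed_transition_probabilities(
    ["a", "b", "c"], {"a": 3, "b": 1})
```

With these counts the probabilities are 4/7, 2/7, and 1/7, so the unobserved successor remains a possible, though unlikely, continuation.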

An analysis of the literature on HMM-based map-matching methods indicates that the key differences between the approaches of various authors stem primarily from their differing ways of modelling emission and transition probabilities. Each approach focuses on different aspects, including spatial and temporal characteristics, driver preferences, or the specifics of the road network. The resulting algorithms are characterised by varying effectiveness and potential for practical application. Notably, some authors emphasise that their solutions achieve high accuracy even with low-frequency GPS sampling [5,31,32,33]. However, despite these variations, a common practical challenge for nearly all HMM-based approaches is the tuning of model parameters. The definition of these probabilities often relies on specific assumptions, such as a normal distribution for GPS measurement errors, which may not hold in all real-world scenarios, particularly in dense urban environments or areas with poor signal quality. Consequently, achieving optimal performance frequently requires significant empirical tuning or domain-specific expertise to calibrate the model correctly. Furthermore, some of the more advanced HMM variants increase their accuracy by incorporating highly detailed road network attributes, such as the width of a road segment [7] or the permitted speed for individual segments [5,7]. While effective, this creates another practical hurdle, as obtaining such granular and consistently accurate data for an entire road network can be problematic, particularly when relying on publicly available map sources. This reliance on careful parameterisation and the potential need for highly detailed map data represent significant practical considerations that can impact the scalability and adaptability of these methods.

3.4. Data-Driven/Learning Approach

The Data-Driven/Learning-Based Approach encompasses algorithms that employ machine learning techniques, including deep learning, to extract complex patterns and relationships directly from trajectory data and digital maps. In recent years, machine learning techniques have played a pivotal role in solving problems related to vehicle trajectory matching, primarily due to their ability to effectively analyse large datasets and capture intricate patterns in spatio-temporal data. Contemporary research in this field demonstrates the application of a broad spectrum of methods, including classical algorithms such as k-Nearest Neighbours (kNN) [34,35], advanced sequence-to-sequence (seq2seq) models [36,37,38], Graph Neural Networks (GNNs) [39], and Recurrent Neural Networks (RNNs) [40]. Regardless of the specific architecture, the fundamental computational process of these approaches involves training a model to map input data to the expected outputs. During the training phase, the model analyses extensive datasets, adjusting its parameters to minimise prediction errors and to represent the complex spatio-temporal patterns and relationships. Subsequently, in the inference phase, the trained model utilises its acquired knowledge to generate the most probable sequences of road segments for new, unseen GPS trajectories. A key feature of this approach is that the matching logic and decision criteria are learned mainly from the data, rather than being explicitly defined by predefined rules or mathematical models.

A notable and increasingly prevalent trend in the scientific literature is the combination of various techniques into more sophisticated solutions to achieve enhanced performance and accuracy. For instance, the paper [37] proposed DeepMM, a solution utilising a seq2seq architecture with an attention mechanism, which transforms a sequence of GPS points into a sequence of road segments. The innovation was to transfer the matching process into a latent space. GPS points and potential road segments are mapped to feature vectors, which increases robustness to noise, as points close to the actual road are also close to the corresponding segment in the latent space. The researchers employed stacked Long Short-Term Memory (LSTM) networks to encode and decode trajectories, thereby enabling the modelling of the sequential data constituting a trajectory. DeepMM was trained on large datasets containing historical trajectories from numerous vehicles, which allowed it to learn traffic patterns within the road network, such as preferred routes. A significant aspect of this paper was extensive data augmentation, achieved by generating synthetic trajectories based on real data while simulating errors such as random point omissions and spatial noise. This approach enhanced the model’s robustness to sparse sampling and significant errors. The experimental results demonstrated that DeepMM achieved a matching accuracy more than 10% higher than classical HMMs under various conditions, particularly with noisy and sparse data. Applying the attention mechanism improved the results by an additional 3%, confirming that neural networks can significantly enhance the quality of map matching compared to traditional methods.
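The augmentation idea described for DeepMM (random point omissions combined with spatial noise) can be sketched as follows. This is an illustrative assumption, not the procedure from [37]: the function name, the planar coordinates, the isotropic Gaussian noise, and the choice to always keep the first and last points are all hypothetical.

```python
import random


def augment_trajectory(points, drop_prob: float, noise_sigma: float,
                       rng: random.Random):
    """Create one synthetic trajectory by dropping and perturbing points.

    points: list of (x, y) tuples in planar coordinates (assumed, in metres).
    drop_prob: probability of omitting each interior point.
    noise_sigma: standard deviation of the added Gaussian noise.
    """
    augmented = []
    last = len(points) - 1
    for i, (x, y) in enumerate(points):
        # Keep the end points so that origin and destination are preserved
        # (an assumption of this sketch).
        if 0 < i < last and rng.random() < drop_prob:
            continue
        augmented.append((x + rng.gauss(0.0, noise_sigma),
                          y + rng.gauss(0.0, noise_sigma)))
    return augmented


# Usage: a seeded generator makes the synthetic sample reproducible.
trajectory = [(float(i) * 10.0, 0.0) for i in range(20)]
synthetic = augment_trajectory(trajectory, 0.3, 5.0, random.Random(42))
```

Each synthetic trajectory keeps the ground-truth segment sequence of the trajectory it was derived from, which is what allows the training set to be enlarged without additional labelling.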

The paper presented in [41] proposes the DST-MM algorithm, which uses deep learning to precisely match vehicle trajectories to a road network. This algorithm comprehensively analyses data by considering the spatial relationships between the trajectory and the road network, the predicted vehicle speed, and the temporal consistency of GPS observations to identify the most probable path efficiently. In the first step, the algorithm evaluates geometric and topological dependencies between GPS points and roads to filter out unrealistic paths. This approach is based on an analysis considering the road shape, node proximity, and intersection characteristics. At this stage, the algorithm employs Convolutional Neural Networks (CNNs) to analyse spatial dependencies between road segments and to extract features from the vehicle trajectory and the surrounding road network. The next stage is speed prediction, where a Bi-directional Long Short-Term Memory (BD-LSTM) model uses historical traffic data to predict the vehicle’s speed. This model then assesses how the predicted speed aligns with a selected road candidate, which is crucial for determining the most likely trajectory. Finally, the DST-MM algorithm performs a temporal analysis integrating the predicted speed with actual GPS data. A dynamic candidate graph is constructed, wherein a constrained set of possible road segments and the connections between them are established for consecutive GPS points, considering the predicted speed and travel time. This allows the algorithm to narrow the search space to realistic paths, significantly improving the matching accuracy. The experimental results show a considerable improvement in accuracy compared to other algorithms, especially for sparse GPS samples in dense urban networks.

The authors of the paper [42] developed the L2MM model, which applies deep learning to match trajectories derived from low-quality GPS measurements. GPS data are often characterised by the presence of noise, a low sampling frequency, and irregular time intervals, which complicates their analysis. The L2MM model employs a seq2seq approach that transforms low-frequency GPS data into more precise representations. Furthermore, trajectory representation enhancement techniques were introduced, including Gaussian Mixture Models, which allow the model to capture regularities in driver behaviour and the road network layout. An essential component of the algorithm is a decoder with an attention mechanism, which enables the identification of complex dependencies within the trajectory data, thereby improving the overall quality of the match.

The paper described in [36] presents a map-matching solution based on deep learning that utilises information about the network’s topology. The proposed algorithm employs a seq2seq model that incorporates this topological information. To achieve this, the authors developed dependency dictionaries based on a grid-based division of the map and road topology, considering adjacencies and connections. They then applied dictionary compression using the Byte Pair Encoding (BPE) method to constrain the output space. The model’s core is a bidirectional Gated Recurrent Unit (GRU) network with an attention mechanism (Bi-GRU+Attention), which facilitates learning trajectory representations and the selection of the most probable segment sequences. The GRU achieves accuracy comparable to that of the LSTM on moderately sized datasets while exhibiting greater computational efficiency. To enhance the training process, the authors augmented the data with synthetic trajectories generated using Gaussian noise within a specified Fréchet distance, which enabled the construction of a more diverse dataset for the model.

The paper presented in [43] utilised an artificial neural network as a pre-processing method for geolocation data originating from the GPS system. The proposed approach involves correcting the coordinates of raw GPS points by estimating and applying a horizontal offset vector before the data are fed into the map-matching algorithm. Integrating the Artificial Neural Network (ANN) module with an existing map-matching algorithm improved the road segment identification accuracy rate. This was particularly evident in areas with a complex network topology, where the original algorithm failed to yield satisfactory results. It should be noted, however, that these modifications resulted in a decrease in performance of approximately 15%.

Recent publications in map-matching are increasingly exploring the application of Reinforcement Learning (RL). An example of such an approach is the Deep Map Matching (DMM) solution proposed in the paper [40], which utilises data from cellular networks. Instead of traditional GPS data, DMM uses sequences of GSM base station locations as the source of location information. In this approach, an agent, represented by an RNN, learns to make decisions regarding selecting the most probable route, treating the map-matching task as a decision-making process. The reward is based on the consistency of the trajectory with observations from the GSM towers. Experiments confirmed that the model effectively matched trajectories after training over many episodes. Although this is a specific application using cellular data for localisation, the paper demonstrates the potential of Reinforcement Learning in map-matching, including its relevance to traditional GPS data.

While data-driven approaches, particularly deep learning models, have demonstrated superior performance in handling complex spatio-temporal patterns, their practical implementation comes with a distinct set of significant challenges. The most prominent of these is the profound dependency on large-scale, high-quality, and accurately labelled training data, the acquisition of which can be both costly and time-consuming. Furthermore, the effectiveness of these models is highly sensitive to the quality of the input GPS trajectories. Real-world data are often sparse or noisy, necessitating sophisticated data pre-processing and augmentation techniques to ensure model robustness. Another practical consideration is the high computational cost associated with training deep neural networks, often requiring specialised hardware. Finally, ensuring the model’s generalisation to new, unseen environments and maintaining its performance in the face of dynamic changes to the road network requires a continuous cycle of monitoring and retraining, which poses a significant operational challenge.

Table 2 summarises the techniques applied to the map-matching problem.

4. Route Prediction Algorithms

Vehicle route prediction involves estimating an object’s future path based on its previously recorded trajectory and surrounding conditions. Approaches to solving this problem commonly include statistical modelling, machine learning techniques, and Hidden Markov Models, which enable the incorporation of complex spatio-temporal dependencies. Maintaining high prediction accuracy alongside computational efficiency presents a key challenge, particularly in real-time systems.

4.1. Hidden Markov Models in Route Prediction

The paper [44] introduces a destination prediction algorithm that defines the destination as a hidden state within an HMM. The model treats key points, such as intersections and junctions extracted from the user’s historical GPS data, as its observations. The model’s training process involves estimating the parameters of two crucial distributions: the emission probability, which defines the likelihood of observing a specific point on the route for each potential destination, and the transition probability, which governs the dynamics of how the intended destination might change between consecutive observations. Once these parameters have been estimated, the sequence of observations from the current partial route is analysed to determine the most probable sequence of hidden states. The final state in this sequence is then interpreted as the prediction of the ultimate travel destination. Notably, this is achieved without the necessity of using digital terrain maps.

The authors of the paper [45] developed an algorithm for vehicle route prediction, the general steps of which are illustrated in Figure 6. It is based on the utilisation of aggregated data from a large number of vehicles to eliminate the cold-start problem. They applied the Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) algorithm to cluster road segments based on their prevailing traffic density. The clusters identified in this manner, representing popular travel patterns (routes), were subsequently used to define the hidden states and to train the model. In the prediction process, a single vehicle’s current partial journey is mapped to a sequence of segments, constituting the model’s observations. Based on these observations, the trained model employs the Viterbi algorithm to calculate the most probable complete route. The destination path can be predicted even from a short initial fragment of the journey.
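The decoding step of such HMM-based predictors can be illustrated with a generic log-space Viterbi routine. The sketch below is not the implementation from [45]: the two hidden states standing for route clusters, the segment identifiers, and all probabilities are invented for illustration.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state sequence for the observations."""
    def log(p):
        return math.log(p) if p > 0 else -math.inf

    # Initialise with the first observation.
    scores = {s: log(start_p[s]) + log(emit_p[s][observations[0]])
              for s in states}
    back_pointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            # Best predecessor of state s at this step.
            prev = max(states, key=lambda p: scores[p] + log(trans_p[p][s]))
            new_scores[s] = (scores[prev] + log(trans_p[prev][s])
                             + log(emit_p[s][obs]))
            pointers[s] = prev
        scores = new_scores
        back_pointers.append(pointers)
    # Backtrack from the best final state.
    path = [max(states, key=lambda s: scores[s])]
    for pointers in reversed(back_pointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: two route clusters emitting road-segment ids.
STATES = ["route_A", "route_B"]
START = {"route_A": 0.5, "route_B": 0.5}
TRANS = {"route_A": {"route_A": 0.9, "route_B": 0.1},
         "route_B": {"route_A": 0.1, "route_B": 0.9}}
EMIT = {"route_A": {"s1": 0.6, "s2": 0.3, "s3": 0.1},
        "route_B": {"s1": 0.1, "s2": 0.3, "s3": 0.6}}
decoded = viterbi(["s1", "s2", "s1"], STATES, START, TRANS, EMIT)
```

Working in log-space avoids numerical underflow on long observation sequences, which matters when a partial journey spans many segments.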

4.2. Artificial Intelligence Methods in Route Prediction

Neural networks are quite commonly employed to determine probable vehicle routes. They enable the effective utilisation of large datasets and facilitate a deep exploration of information, which allows for the detection of hidden dependencies and interactions among various data types. The article [46] discusses multiple neural network architectures used for vehicle route prediction. The most frequently utilised architectures for this purpose include: Feedforward Neural Networks (FNN), CNNs, RNNs, Transformers, and Attention Mechanisms. RNNs, such as LSTM and GRU, are utilised for processing sequential data and effectively retaining historical information [47,48,49]. CNNs are used to detect spatio-temporal dependencies [50,51]. GNNs, particularly Graph Convolutional Networks (GCNs), are increasingly being applied to model road network topology [50,52]. Transformer models incorporating attention mechanisms are highly effective at encoding interactions and road scenes [53,54,55].

In the paper [56], the authors compared the results of applying several machine learning methods to route prediction. For this purpose, they designed methods based on LSTM, GRU, and Stacked Autoencoders (SAEs). The input data for the models consisted of historical vehicle trajectory data, which were pre-processed with a Savitzky–Golay filter for noise reduction. The objective of the models was to predict the future position and velocity of the vehicles. A comparison of their performance demonstrated that the LSTM-based model predicted the velocity and position of the vehicle most accurately, outperforming both the GRU and SAE models.

The paper presented in [50] introduces a hybrid Graph CNN-LSTM architecture for both short-term and long-term traffic flow prediction. The model combines the properties of a Graph CNN, used for extracting information about the complex spatial structure of the road network, with an LSTM cell mechanism, which enables the modelling of temporal dynamics. The sparse GPS data obtained from vehicle journeys in two cities was utilised to train the model. A technique for input dimensionality reduction was proposed, based on selecting key road segments. The test results demonstrate the clear superiority of the proposed solution over classical LSTM networks and numerous other forecasting methods in traffic flow prediction tasks. The authors present experimental results where the traffic flow was forecast across time horizons ranging from 5 min to 4 h. The stability of the results, irrespective of the timescale applied, should be emphasised.

In the paper [57], the NetTraj model is proposed, wherein the traditional representation of a trajectory as a sequence of road segments $R = \langle s_{1}, s_{2}, \dots, s_{n} \rangle$, $s_{j} \in S$, is replaced by a direction-based approach. Each road segment is uniquely mapped to a pair $(s.\text{source-node}, s.\text{direction})$. The value of $s.\text{direction}$ depends on $\text{interval}$, which is calculated by discretising the continuous geographical heading of the segment ($s.\text{heading}$) into one of $K$ intervals (where $K = 8$ in the study) using the formula:

\text{interval} = \left\lfloor \frac{K \cdot s.\text{heading}}{360} \right\rfloor,

(12)

where the interval width is $W = 360 / K$. Typically, it is assumed that $s.\text{direction} = \text{interval}$. However, in the rare cases where more than one segment originating from the same node has an identical $\text{interval}$, the $s.\text{direction}$ value is adjusted to the nearest available interval to ensure uniqueness. This trajectory representation aims to reduce the dimensionality of the prediction problem. Instead of predicting one of thousands of segments, the model predicts one of only a few directions. The study employed a seq2seq architecture with an LSTM encoder–decoder, which processes trajectories represented as sequences of intersections and movement directions. The authors also utilised two attention mechanisms:

Spatial attention—for dynamically modelling spatial dependencies within the road network,
Temporal attention—for detecting short- and long-term patterns in trajectory data using a sliding window.

The experimental results demonstrated that NetTraj achieves higher prediction accuracy than other methods, confirming the effectiveness of the proposed trajectory representation and the applied attention mechanisms.
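The direction encoding of (12), together with the uniqueness adjustment, can be sketched in Python. The discretisation follows the formula; the conflict-resolution order (trying the intervals at increasing circular distance, clockwise first) is an assumption of this sketch, since the exact rule used in [57] is not reproduced here, and all names are hypothetical.

```python
import math


def heading_interval(heading_deg: float, k: int = 8) -> int:
    """Eq. (12): discretise a heading in degrees into one of k intervals."""
    return math.floor(k * (heading_deg % 360.0) / 360.0)


def assign_directions(headings_by_segment, k: int = 8):
    """Give every segment leaving one node a unique direction index.

    headings_by_segment: dict mapping segment id -> heading in degrees.
    Conflicts are resolved by moving to the nearest free interval
    (tie-breaking order is an assumption of this sketch).
    """
    directions, taken = {}, set()
    for segment, heading in headings_by_segment.items():
        base = heading_interval(heading, k)
        for offset in range(k):
            candidates = [(base + offset) % k, (base - offset) % k]
            free = [c for c in candidates if c not in taken]
            if free:
                directions[segment] = free[0]
                taken.add(free[0])
                break
        else:
            raise ValueError("more outgoing segments than direction intervals")
    return directions


# Usage: two segments fall into interval 0, so the second one is shifted.
node_directions = assign_directions({"a": 10.0, "b": 20.0, "c": 200.0})
```

With $K = 8$, each interval spans 45 degrees, so the prediction target at every node is one of at most eight classes instead of one of thousands of segment identifiers.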

In the paper [58], the authors describe a hybrid method, KGCN-LSTM, which combines contextual knowledge of the road infrastructure with the sequence of a vehicle’s historical trajectory points. GCN is employed for feature fusion and to extract knowledge about the spatial environment of the vehicle’s trajectory. Points of Interest (POI)—specific locations such as shops, hospitals, restaurants, or hotels—are used in this model to reflect potential travel destinations and driver intentions. The LSTM network is responsible for processing the sequential trajectory points over time, leveraging the spatial knowledge obtained from the GCN. The results show that the proposed KGCN-LSTM method provides higher accuracy and greater stability in route prediction than the baseline methods, particularly under conditions of incomplete data. Contextual knowledge proved crucial when historical GPS data did not capture the driver’s intent.

The paper [59] focuses on long-term vehicle route prediction while maintaining consistency with the road network topology. The authors proposed using a Transformer model augmented with situational context and a trajectory correction procedure termed link projection. The Transformer is utilised to process the motion sequence. At the same time, an additional encoding network facilitates the extraction of knowledge about road infrastructure objects that influence driver behaviour (e.g., speed cameras, traffic lights), thereby creating an abstract representation of the environment. The generated trajectory is then projected onto the nearest road segments (the link projection technique), which prevents the determination of trajectories outside the actual infrastructure and thus reduces error accumulation. Furthermore, the ABC (area-between-curves) measure was introduced to more accurately assess the shape conformity between the predicted and ground-truth trajectories. Applying the method to real-world data yielded superior results to a standard LSTM. The findings confirm that including contextual information and topological constraints significantly improves long-term trajectory forecasts.
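The geometric core of the link projection step, snapping a predicted position to the nearest location on the road network, can be sketched as follows. This is a simplified illustration and not the procedure from [59]: it assumes planar coordinates, straight segments given as pairs of end points, and hypothetical function names.

```python
import math


def project_onto_segment(point, start, end):
    """Return the closest location to `point` on the segment start-end."""
    (px, py), (ax, ay), (bx, by) = point, start, end
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return (float(ax), float(ay))
    # Relative position of the orthogonal projection, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def link_projection(point, segments):
    """Snap a predicted point to the nearest location on any road segment."""
    projections = (project_onto_segment(point, a, b) for a, b in segments)
    return min(projections, key=lambda q: math.dist(point, q))


# Usage: two parallel roads; the prediction lies slightly off the first one.
roads = [((0.0, 0.0), (2.0, 0.0)), ((0.0, 5.0), (2.0, 5.0))]
snapped = link_projection((1.0, 1.0), roads)
```

Applying such a projection after every prediction step keeps each subsequent step anchored to the road network, which is how the approach limits the accumulation of off-road error.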

The paper [51] proposes a hybrid CNN-LSTM method for urban traffic prediction based on GPS data. A CNN is utilised to extract spatial features from a grid-based matrix map prepared by the authors, which is crucial for understanding how road infrastructure influences traffic. The LSTM network, in turn, allows temporal dependencies to be captured. Combining these two methods enables the simultaneous analysis of both spatial and temporal aspects, significantly improving the prediction quality. The hybrid neural network proposed in the article comprises two sub-models. The first sub-model uses a classic CNN without a pooling layer to learn spatial features, as there is no need to reduce the resolution of the GPS data. The second sub-model also employs a CNN without a pooling layer to capture local dependencies and subsequently introduces an LSTM cell to retain long-term temporal information. The results of the experiments show that the hybrid CNN-LSTM model achieves superior performance compared to traditional methods. Additionally, the authors applied a training strategy based on a greedy policy, which allowed for an effective trade-off between higher accuracy and reasonable execution time.

The modelling of movement trajectories using RNNs is described in the paper [60]. The authors proposed two innovative models: Constrained State Space RNN (CSSRNN) and Latent Prediction Information RNN (LPIRNN). CSSRNN is an extension of traditional RNNs, designed to explicitly incorporate the road network’s topological constraints. Instead of learning the topology from data, this model incorporates it by replacing the standard output layer with a softmax function over a constrained state space. This mechanism is based on the application of a binary masking vector $M_{i}$ for each state (segment) $s_{i}$, where $s_{i} \in S$, determined as follows:

M_{ij} = \begin{cases} 1 & \text{if } s_{j} \text{ is a valid successor of } s_{i}, \\ 0 & \text{otherwise}. \end{cases}

(13)

The determined masking vector is then utilised in the final step to calculate the conditional probability of selecting the next route segment $s_{n+1}$:

p(s_{n+1} \mid s_{1:n}) = \frac{\exp(W h_{n} + b) \odot M_{i}}{\| \exp(W h_{n} + b) \odot M_{i} \|_{1}},

(14)

where $\odot$ is the Hadamard product, $h_{n}$ is the hidden state, $W$ is the weight matrix, $b$ is the bias vector, and $M_{i}$ is the masking vector of the current segment $s_{n} = s_{i}$. The main advantage of this approach is that it enables the model to focus solely on updating the weights associated with legal states. The second model, LPIRNN, utilises a multi-task learning approach. It consists of two phases that incorporate topological information through different tasks:

Shared Task Layer: an RNN layer generates universal information about the direction of the next segment based on the current partial trajectory,
Multiple Individual Tasks: for each segment $s_{i}$, a separate model is defined. Based on the shared task layer information, this model predicts the next segment $s_{i+1}$, which must be a legal transition.

The model learns homogeneous predictive information through this approach while simultaneously handling heterogeneous prediction tasks for each point in the network.
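The constrained softmax of (14) used by CSSRNN can be illustrated with a few lines of Python. The sketch operates on a precomputed vector of output scores (standing for $W h_{n} + b$) and a binary mask; the function name and the max-subtraction for numerical stability are additions of this sketch and do not change the resulting probabilities.

```python
import math


def constrained_softmax(scores, mask):
    """Softmax restricted to the entries whose mask value is 1 (Eq. (14))."""
    if len(scores) != len(mask):
        raise ValueError("scores and mask must have the same length")
    valid = [s for s, m in zip(scores, mask) if m]
    if not valid:
        raise ValueError("the current segment has no valid successor")
    peak = max(valid)  # subtracted only for numerical stability
    weights = [math.exp(s - peak) if m else 0.0
               for s, m in zip(scores, mask)]
    total = sum(weights)  # L1 norm of the masked exponentials
    return [w / total for w in weights]


# Usage: four segments, of which only the first and third may follow.
probabilities = constrained_softmax([2.0, 1.0, 0.5, 3.0], [1, 0, 1, 0])
```

Even though the fourth segment has the highest raw score, it receives zero probability because it is not a legal successor, so the probability mass is shared only among reachable segments.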

While artificial intelligence models excel at predicting vehicle routes by capturing complex dependencies, their practical application faces significant hurdles. Effective prediction demands not only massive historical trajectory datasets but also the fusion of rich contextual data, making data collection and maintenance a significant operational challenge. Furthermore, the immense number of potential future paths creates high computational demands, making long-term prediction accuracy a key unresolved challenge, as prediction errors accumulate rapidly due to the uncertainty of driver behaviour.

Table 3 summarises the techniques applied to the route prediction problem.

5. Multimodal Data Fusion for Enhanced Vehicle Positioning

This paper focuses on map-matching and trajectory prediction solutions, wherein GPS data constitute the primary source of location information. However, the availability of other data sources alongside GPS opens up the possibility of leveraging the dynamically developing trend of multimodal data fusion. This approach integrates geolocation data with information from a wide range of sensors to obtain a significantly more accurate and reliable situational awareness. Contemporary positioning systems often combine data from external sensors, such as vision cameras that enable the precise detection of road features like lane markings to determine the vehicle’s lateral position [61,62]; radar, capable of estimating the vehicle’s velocity and ego-motion from Doppler measurements [63]; or LiDAR (Light Detection and Ranging), with data from internal motion sensors, which include Inertial Navigation Systems (INS) based on Inertial Measurement Units (IMU) and odometers (DMI: Distance Measuring Instrument) [61,64]. The fusion of these diverse data streams can be realised using advanced algorithms, such as the Extended Kalman Filter (EKF) [64], Particle Filters (PF) [65], or factor graph optimisation [66]. These techniques allow for solving key challenges, for instance, by utilising machine learning algorithms like YOLO (You Only Look Once) to remove dynamic objects from LiDAR data [67] or by matching scans to reference maps using methods such as NDT (Normal Distributions Transform) or ICP (Iterative Closest Point) [61]. An additional crucial element of these systems is digital maps, from publicly available ones like OpenStreetMap (OSM) [63] to vector high-definition (HD) maps, which serve as an absolute reference point for correcting the cumulative errors typical of SLAM (Simultaneous Localisation and Mapping) systems [61,66]. A summary of selected applications using the discussed techniques is presented in Table 4.

6. Conclusions

This paper presents an overview of current methods and algorithms in map-matching and vehicle route prediction, emphasising artificial intelligence techniques. The authors systematise the existing knowledge by proposing a novel classification of map-matching algorithms based on computational paradigms. Hidden Markov Models, particle filters, and fuzzy logic are employed in map-matching to probabilistically match noisy GPS points to the most likely road segments. Conversely, advanced deep learning approaches, including RNNs and CNNs, learn complex spatio-temporal dependencies directly from the data to efficiently convert sequences of GPS observations into routes on the road network. Regarding trajectory prediction, Hidden Markov Models are utilised to forecast travel destinations or entire routes based on historical patterns. In contrast, deep learning models, such as RNNs and Transformers, predict a vehicle’s future movement by analysing its preceding path, often leveraging knowledge of the road network topology and contextual information to enhance precision.

Map-matching and vehicle route prediction are intrinsically linked tasks in analysing location data, since both processes operate within the same spatial context: the road network. Integrating map-matching with prediction has the potential to yield a more coherent and reliable outcome. In such an approach, navigation data would be filtered and aligned with the actual road infrastructure (map-matching), thereby enhancing the quality of the input data for predictive algorithms. The prediction algorithm could prompt a re-evaluation of the map-matching result if discrepancies arise while predicting future positions, such as anomalous transitions between road segments or the selection of routes inconsistent with the network’s topology. This would create a synergistic relationship in which both stages are mutually reinforcing: a more precise match to the road network would translate into improved route prediction, while analysing potential future paths would aid in detecting map-matching errors. This conceptual framework is illustrated in Figure 7. An integrated approach would not only result in higher accuracy and fluidity in vehicle positioning but also offer enhanced capabilities for traffic management and the support of ITS. Furthermore, the proposed integration has practical potential in real-time navigation, autonomous vehicle guidance, fleet optimisation, and smart city mobility services, where improved positioning and predictive accuracy can directly enhance safety, efficiency, and decision-making. Notably, relatively few academic studies have integrated these two domains, presenting a significant opportunity and a broad scope for future research and development.

Author Contributions

Conceptualisation, T.W., J.G., and E.N.-S.; methodology, T.W., J.G., and E.N.-S.; investigation, T.W.; writing—original draft preparation, T.W.; writing—review and editing, T.W., E.N.-S. and J.G.; supervision, E.N.-S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

HMM	Hidden Markov Model
ML	Machine Learning
ITS	Intelligent Transport System
PF	Particle Filters
SIR	Sampling Importance Resampling
DR	Dead Reckoning
FL	Fuzzy Logic
FIS	Fuzzy Inference System
HDOP	Horizontal Dilution of Precision
DST	Dempster–Shafer Theory
VSW	Variable Sliding Window
BVSW	Bounded Variable Sliding Window
kNN	k-Nearest Neighbours
seq2seq	sequence-to-sequence
GNN	Graph Neural Network
RNN	Recurrent Neural Network
LSTM	Long Short-Term Memory
CNN	Convolutional Neural Network
BD-LSTM	Bi-directional Long Short-Term Memory
GRU	Gated Recurrent Unit
ANN	Artificial Neural Network
RL	Reinforcement Learning
DMM	Deep Map Matching
HDBSCAN	Hierarchical Density-Based Spatial Clustering of Applications with Noise
FNN	Feedforward Neural Network
GCN	Graph Convolutional Network
SAEs	Stacked Autoencoders
POI	Points of Interest
CSSRNN	Constrained State Space RNN
LPIRNN	Latent Prediction Information RNN
GNSS	Global Navigation Satellite System
ACC	Accuracy
RMSE	Root Mean Square Error
SATLP	Situation-Aware Transformer with Link Projection
LiDAR	Light Detection and Ranging
IMU	Inertial Measurement Unit
SLAM	Simultaneous Localisation And Mapping
INS	Inertial Navigation System
DMI	Distance Measuring Instrument
NDT	Normal Distribution Transform
ICP	Iterative Closest Point
HD	high-definition
EKF	Extended Kalman Filter
OSM	OpenStreetMap

References

1. Hu, G.; Shao, J.; Liu, F.; Wang, Y.; Shen, H.T. IF-Matching: Towards Accurate Map-Matching with Information Fusion. IEEE Trans. Knowl. Data Eng. 2017, 29, 114–127.
2. Chen, W.; Li, Z.; Yu, M.; Chen, Y. Effects of Sensor Errors on the Performance of Map Matching. J. Navig. 2005, 58, 273–282.
3. Mohanty, A.; Gao, G. A survey of machine learning techniques for improving Global Navigation Satellite Systems. EURASIP J. Adv. Signal Process. 2024, 2024, 73.
4. Liu, Y.; Li, Z. A novel algorithm of low sampling rate GPS trajectories on map-matching. EURASIP J. Wirel. Commun. Netw. 2017, 30, 1653820.
5. Xiong, Z.; Li, B.; Liu, D. Map-Matching Using Hidden Markov Model and Path Choice Preferences under Sparse Trajectory. Sustainability 2021, 13, 12820.
6. Taguchi, S.; Koide, S.; Yoshimura, T. Online Map Matching With Route Prediction. IEEE Trans. Intell. Transp. Syst. 2019, 20, 338–347.
7. Goh, C.Y.; Dauwels, J.; Mitrovic, N.; Asif, M.T.; Oran, A.; Jaillet, P. Online map-matching based on Hidden Markov model for real-time traffic sensing applications. In Proceedings of the 2012 15th International IEEE Conference on Intelligent Transportation Systems, Anchorage, AK, USA, 16–19 September 2012; pp. 776–781.
8. Quddus, M.A.; Ochieng, W.Y.; Noland, R.B. Current map-matching algorithms for transport applications: State-of-the art and future research directions. Transp. Res. Part C-Emerg. Technol. 2007, 15, 312–328.
9. Singh, S.; Singh, J.; Goyal, S.B.; Barachi, M.E.; Kumar, M. Analytical Review of Map Matching Algorithms: Analyzing the Performance and Efficiency Using Road Dataset of the Indian Subcontinent. Arch. Comput. Methods Eng. 2023, 30, 4897–4916.
10. Kubicka, M.; Çela, A.; Mounier, H.; Niculescu, S. Comparative Study and Application-Oriented Classification of Vehicular Map-Matching Methods. IEEE Intell. Transp. Syst. Mag. 2018, 10, 150–166.
11. Chao, P.; Xu, Y.; Hua, W.; Zhou, X. A Survey on Map-Matching Algorithms. arXiv 2019, arXiv:1910.13065.
12. Bernstein, D.; Kornhauser, A. An Introduction to Map Matching for Personal Navigation Assistants; New Jersey TIDE Center: Lawrenceville, NJ, USA, 1996.
13. Quddus, M.A.; Ochieng, W.Y.; Zhao, L.; Noland, R.B. A general map matching algorithm for transport telematics applications. GPS Solut. 2003, 7, 157–167.
14. Greenfeld, J.S. Matching GPS observations to locations on a digital map. In Proceedings of the Transportation Research Board 81st Annual Meeting, Washington, DC, USA, 13–17 January 2002; Volume 22, pp. 576–582.
15. Djurić, P.M.; Kotecha, J.H.; Zhang, J.; Huang, Y.; Ghirmai, T.; Bugallo, M.F.; Míguez, J. Particle filtering. IEEE Signal Process. Mag. 2003, 20, 19–38.
16. Kempinska, K.; Davies, T.O.; Shawe-Taylor, J. Probabilistic map-matching using particle filters. arXiv 2016, arXiv:1611.09706.
17. Peker, A.U.; Tosun, O.; Acarman, T. Particle filter vehicle localization and map-matching using map topology. In Proceedings of the 2011 IEEE Intelligent Vehicles Symposium (IV), Baden-Baden, Germany, 5–9 June 2011; pp. 248–253.
18. Davidson, P.; Collin, J.; Takala, J.H. Application of particle filters to a map-matching algorithm. Gyroscopy Navig. 2011, 2, 285–292.
19. Zadeh, L.A. Fuzzy logic and approximate reasoning. Synthese 1975, 30, 407–428.
20. Yang, Y.; Ye, H.; Fei, S. Integrated map-matching algorithm based on fuzzy logic and dead reckoning. In Proceedings of the ICCAS 2010, Goyang, Republic of Korea, 27–30 October 2010; pp. 1139–1142.
21. Quddus, M.A.; Noland, R.B.; Ochieng, W.Y. A High Accuracy Fuzzy Logic Based Map Matching Algorithm for Road Transport. J. Intell. Transp. Syst. 2006, 10, 103–115.
22. Zhang, Y.; Gao, Y. A Fuzzy Logic Map Matching Algorithm. In Proceedings of the 2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery, Jinan, China, 18–20 October 2008; Volume 3, pp. 132–136.
23. Denœux, T. 40 years of Dempster-Shafer theory. Int. J. Approx. Reason. 2016, 79, 1–6.
24. Nassreddine, G.; Abdallah, F.; Denoeux, T. Map matching algorithm using interval analysis and Dempster-Shafer theory. In Proceedings of the 2009 IEEE Intelligent Vehicles Symposium, Xi’an, China, 3–5 June 2009; pp. 494–499.
25. Zhao, X.; Cheng, X.; Zhou, J.; Xu, Z.; Dey, N.; Ashour, A.S.; Satapathy, S.C. Advanced Topological Map Matching Algorithm Based on D–S Theory. Arab. J. Sci. Eng. 2017, 43, 3863–3874.
26. Hummel, B. Map matching for vehicle guidance. In Proceedings of the Dynamic and Mobile GIS; CRC Press: Boca Raton, FL, USA, 2006; pp. 211–222.
27. Forney, G. The Viterbi Algorithm. Proc. IEEE 1973, 61, 268–278.
28. Song, H.Y.; Lee, J.H. A map matching algorithm based on modified hidden Markov model considering time series dependency over larger time span. Heliyon 2023, 9, e21368.
29. Bloit, J.; Rodet, X. Short-time Viterbi for online HMM decoding: Evaluation on a real-time phone recognition task. In Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, NV, USA, 31 March–4 April 2008; pp. 2121–2124.
30. Maybeck, P.S. The Kalman Filter: An Introduction to Concepts. In Autonomous Robot Vehicles; Springer: Berlin/Heidelberg, Germany, 1990; pp. 194–204.
31. Newson, P.; Krumm, J. Hidden Markov map matching through noise and sparseness. In Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Seattle, WA, USA, 4–6 November 2009; pp. 336–343.
32. Lou, Y.; Zhang, C.; Zheng, Y.; Xie, X.; Wang, W.; Huang, Y. Map-matching for low-sampling-rate GPS trajectories. In Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Seattle, WA, USA, 4–6 November 2009; pp. 352–361.
33. Liao, J. Optimization of Map Matching Algorithm in Various Road Conditions. Highlights Sci. Eng. Technol. 2023, 78, 59–66.
34. Hashemi, M. Reusability of the Output of Map-Matching Algorithms Across Space and Time Through Machine Learning. IEEE Trans. Intell. Transp. Syst. 2017, 18, 3017–3026.
35. Liu, T.; Chen, Z.; Chen, C.; Duan, Z.; Zhao, B. A Dynamic K-nearest Neighbor Map Matching Method Combined with Neural Network. In Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 27–30 October 2019; pp. 3573–3578.
36. Bai, Y.; Li, G.; Lu, T.; Wu, Y.; Zhang, W.; Feng, Y. Map Matching Based on Seq2Seq with Topology Information. Appl. Sci. 2023, 13, 12920.
37. Feng, J.; Li, Y.; Zhao, K.; Xu, Z.; Xia, T.; Zhang, J.; Jin, D. DeepMM: Deep Learning Based Map Matching with Data Augmentation. IEEE Trans. Mob. Comput. 2022, 21, 2372–2384.
38. Ren, H.; Ruan, S.; Li, Y.; Bao, J.; Meng, C.; Li, R.; Zheng, Y. MTrajRec: Map-Constrained Trajectory Recovery via Seq2Seq Multi-task Learning. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Singapore, 14–18 August 2021; pp. 1410–1419.
39. Liu, Y.; Ge, Q.; Luo, W.; Huang, Q.; Zou, L.; Wang, H.; Li, X.; Liu, C. GraphMM: Graph-Based Vehicular Map Matching by Leveraging Trajectory and Road Correlations. IEEE Trans. Knowl. Data Eng. 2024, 36, 184–198.
40. Shen, Z.; Yang, K.; Zhao, X.; Zou, J.; Du, W.; Wu, J. DMM: A Deep Reinforcement Learning Based Map Matching Framework for Cellular Data. IEEE Trans. Knowl. Data Eng. 2024, 36, 5120–5137.
41. Liu, Z.; Fang, J.; Tong, Y.; Xu, M. Deep learning enabled vehicle trajectory map-matching method with advanced spatial–temporal analysis. IET Intell. Transp. Syst. 2020, 14, 2052–2063.
42. Jiang, L.; Chen, C.; Chen, C. L2MM: Learning to Map Matching with Deep Models for Low-Quality GPS Trajectory Data. ACM Trans. Knowl. Discov. Data 2022, 17, 1–25.
43. Hashemi, M.; Karimi, H.A. A Machine Learning Approach to Improve the Accuracy of GPS-Based Map-Matching Algorithms (Invited Paper). In Proceedings of the 2016 IEEE 17th International Conference on Information Reuse and Integration (IRI), Pittsburgh, PA, USA, 28–30 July 2016; pp. 77–86.
44. Álvarez-García, J.A.; Ortega, J.A.; Abril, L.G.; Morente, F. Trip destination prediction based on past GPS log using a Hidden Markov Model. Expert Syst. Appl. 2010, 37, 8166–8171.
45. Chawuthai, R.; Kawachakul, K.; Boonrod, K.; Threepak, T. Route Prediction from GPS Trajectory and Road Data. In Proceedings of the 2023 15th International Conference on Computer and Automation Engineering (ICCAE), Sydney, Australia, 3–5 March 2023; pp. 65–69.
46. Yin, C.; Cecotti, M.; Auger, D.J.; Fotouhi, A.; Jiang, H. Deep-learning-based vehicle trajectory prediction: A review. IET Intell. Transp. Syst. 2025, 19, e70001.
47. Jiang, R.; Xu, H.; Gong, G.; Kuang, Y.; Liu, Z. Spatial-Temporal Attentive LSTM for Vehicle-Trajectory Prediction. ISPRS Int. J. Geo Inf. 2022, 11, 354.
48. Qiao, S.; Gao, F.; Wu, J.; Zhao, R. An Enhanced Vehicle Trajectory Prediction Model Leveraging LSTM and Social-Attention Mechanisms. IEEE Access 2024, 12, 1718–1726.
49. Altché, F.; de La Fortelle, A. An LSTM network for highway trajectory prediction. In Proceedings of the 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), Yokohama, Japan, 16–19 October 2017; pp. 353–359.
50. Bogaerts, T.; Masegosa, A.D.; Angarita-Zapata, J.S.; Onieva, E.; Hellinckx, P. A graph CNN-LSTM neural network for short and long-term traffic forecasting based on trajectory data. Transp. Res. Part C-Emerg. Technol. 2020, 112, 62–77.
51. Duan, Z.; Yang, Y.; Zhang, K.; Ni, Y.; Bajgain, S. Improved Deep Hybrid Networks for Urban Traffic Flow Prediction Using Trajectory Data. IEEE Access 2018, 6, 31820–31827.
52. Gong, S.; Liu, J.; Yang, Y.; Cai, J.; Xu, G.; Cao, R.; Jing, C.; Liu, Y. Self-paced Gaussian-based graph convolutional network: Predicting travel flow and unravelling spatial interactions through GPS trajectory data. Int. J. Digit. Earth 2024, 17, 2353123.
53. Quintanar, A.; Llorca, D.F.; Parra, I.; Izquierdo, R.; Sotelo, M.Á. Predicting Vehicles Trajectories in Urban Scenarios with Transformer Networks and Augmented Information. In Proceedings of the 2021 IEEE Intelligent Vehicles Symposium (IV), Nagoya, Japan, 11–17 July 2021; pp. 1051–1056.
54. Lee, N.; Choi, W.; Vernaza, P.; Choy, C.B.; Torr, P.H.S.; Chandraker, M. DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 336–345.
55. Wang, B.; He, L.; Song, L.; Niu, R.; Cheng, M. Attention-Linear Trajectory Prediction. Sensors 2024, 24, 6636.
56. Jiang, H.; Chang, L.; Li, Q.; Chen, D. Trajectory Prediction of Vehicles Based on Deep Learning. In Proceedings of the 2019 4th International Conference on Intelligent Transportation Engineering (ICITE), Singapore, 5–7 September 2019; pp. 190–195.
57. Liang, Y.; Zhao, Z. NetTraj: A Network-Based Vehicle Trajectory Prediction Model With Directional Representation and Spatiotemporal Attention Mechanisms. IEEE Trans. Intell. Transp. Syst. 2021, 23, 14470–14481.
58. Chen, J.; Fan, D.; Qian, X.; Mei, L. KGCN-LSTM: A graph convolutional network considering knowledge fusion of point of interest for vehicle trajectory prediction. IET Intell. Transp. Syst. 2023, 17, 1087–1103.
59. Kim, M.; Kwak, B.I.; Hou, J.U.; Kim, T. Robust Long-Term Vehicle Trajectory Prediction Using Link Projection and a Situation-Aware Transformer. Sensors 2024, 24, 2398.
60. Wu, H.; Chen, Z.; Sun, W.; Zheng, B.; Wang, W. Modeling Trajectories with Recurrent Neural Networks. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia, 19–25 August 2017; pp. 3083–3090.
61. Zhu, F.; Zhou, R.; Chen, W.; Yu, M.; Zhang, X. Fusing Information From Multi-Sensors and High-Definition Maps for Continuous and Precise Positioning in Autonomous Driving Services. IEEE Trans. Intell. Transp. Syst. 2025, 1–19.
62. Sadli, R.; Afkir, M.; Hadid, A.; Rivenq, A.; Taleb-Ahmed, A. Map-Matching-Based Localization Using Camera and Low-Cost GPS for Lane-Level Accuracy. Sensors 2022, 22, 2434.
63. Elkholy, M.; Elsheikh, M.; El-Sheimy, N. Radar/INS Integration and Map Matching for Land Vehicle Navigation in Urban Environments. Sensors 2023, 23, 5119.
64. Mounier, E.; Elhabiby, M.; Korenberg, M.; Noureldin, A. LiDAR-Based Multisensor Fusion With 3-D Digital Maps for High-Precision Positioning. IEEE Internet Things J. 2025, 12, 7209–7224.
65. Zhang, H.; Qian, C.; Li, W.; Li, B.; Liu, H. Tightly coupled integration of vector HD map, LiDAR, GNSS, and INS for precise vehicle navigation in GNSS-challenging environment. Geo-Spat. Inf. Sci. 2025, 28, 1341–1358.
66. Zhu, J.; Zhou, H.; Wang, Z.; Yang, S. Improved Multi-Sensor Fusion Positioning System Based on GNSS/LiDAR/Vision/IMU With Semi-Tight Coupling and Graph Optimization in GNSS Challenging Environments. IEEE Access 2023, 11, 95711–95723.
67. Jeong, S.; Shin, H.; Kim, M.J.; Kang, D.; Lee, S.W.; Oh, S. Enhancing LiDAR Mapping with YOLO-Based Potential Dynamic Object Removal in Autonomous Driving. Sensors 2024, 24, 7578.

Figure 1. Graphical representation of the map-matching problem.

Figure 2. Graphical representation of the route prediction problem.

Figure 3. Inference model-based approach diagram.

Figure 4. A flowchart of the HMM map-matching process using the Viterbi algorithm.

Figure 5. Graphical representation of geometric relationships used to determine emission and transition probabilities.

Figure 6. System architecture for route prediction using HDBSCAN and HMM.

Figure 7. A conceptual diagram of the proposed integrated framework for map-matching and route prediction.

Table 1. Notations.

Notation	Description
$t_{k}$	the k-th time step
$O^{t_{k}}$	an observation at time step $t_{k}$
$lat_{k}$	the latitude of observation $O^{t_{k}}$
$lon_{k}$	the longitude of observation $O^{t_{k}}$
$v_{k}$	the vehicle’s speed from observation $O^{t_{k}}$
$ϕ_{k}$	the azimuth of observation $O^{t_{k}}$
T	a trajectory (a sequence of observations)
G	a road network (a directed graph)
V	a set of vertices (e.g., intersections, junctions, segment boundary points)
S	a set of edges (road segments)
$s_{i}$	the i-th road segment
R	a route (a sequence of road segments)
$v_{start}(s_{i})$	the start vertex of segment $s_{i}$
$v_{end}(s_{i})$	the end vertex of segment $s_{i}$
$T_{hist}$	a historical trajectory (a sequence of past observations)
$T_{pred}$	a future trajectory (a sequence of future observations)
$R_{hist}$	a historical route (a sequence of past road segments)
$R_{pred}$	a future route (a sequence of future road segments)
$σ$	the standard deviation of the GPS measurements
$D_{s_{i}}^{k}$	the orthogonal distance between observation $O^{t_{k}}$ and segment $s_{i}$
$P_{E}(O^{t_{k}}, s_{i})$	the emission probability for observation $O^{t_{k}}$ and road segment $s_{i}$
$λ$	the scale parameter of the exponential distribution
$D^{k}$	the Euclidean distance between consecutive observations $O^{t_{k}}$ and $O^{t_{k+1}}$
$D_{G(s_{j}, s_{l})}^{k}$	the distance along the road network between the projections of observations $O^{t_{k}}$ and $O^{t_{k+1}}$ onto their respective segments $s_{j}$ and $s_{l}$
$P_{T}(O^{t_{k}}, s_{j}, O^{t_{k+1}}, s_{l})$	the transition probability from observation $O^{t_{k}}$ on segment $s_{j}$ to observation $O^{t_{k+1}}$ on segment $s_{l}$
$S_{HMM}$	the global evaluation metric (HMM)
$δ_{k}(O^{t_{k}})$	the maximum cumulative probability for observation $O^{t_{k}}$ (HMM)
$W_{trend}(O^{t_{k}})$	the weight accounting for the movement trend for observation $O^{t_{k}}$
$S_{TrendHMM}$	the global evaluation metric including the movement trend
$w_{s_{i}}$	the width of segment $s_{i}$
$v_{s_{i}}$	the permitted speed on segment $s_{i}$
$ϕ_{s_{i}}$	the azimuth of segment $s_{i}$
$H(x)$	the Heaviside step function
$P_{TIME}(O^{t_{k}}, O^{t_{k+1}})$	the probability determining whether travel between observations is feasible without exceeding the permitted speed
$P_{CHOICE}(O^{t_{k}}, O^{t_{k+1}})$	the probability determining the driver’s preferences for path selection
$l_{s_{i}}$	the length of road segment $s_{i}$
$D_{s_{i}}^{k}(x)$	the distance between observation $O^{t_{k}}$ and a position x on road segment $s_{i}$
$N_{k}$	the number of road segments connected to the current segment
$N_{s_{i} \to s_{j}}$	the number of historical transitions from segment $s_{i}$ to segment $s_{j}$
$P_{s_{i} \to s_{j}}$	the transition probability from segment $s_{i}$ to segment $s_{j}$, determined from historical data
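
To show how several of the quantities in Table 1 fit together, the following Python sketch decodes a short trajectory with the Viterbi algorithm [27]. It assumes the commonly used Gaussian emission density in $D_{s_{i}}^{k}$ (with standard deviation $σ$) and exponential transition density in the difference between $D^{k}$ and the network distance (with scale $λ$), following the formulation of Newson and Krumm [31]; the exact expressions used elsewhere in the article may differ. The function names, the `network_dist` callback, and the default parameter values are illustrative assumptions.

```python
import math


def log_emission(dist_m, sigma):
    """Log of a Gaussian emission density in the orthogonal distance D_{s_i}^k."""
    return -0.5 * (dist_m / sigma) ** 2 - math.log(math.sqrt(2 * math.pi) * sigma)


def log_transition(euclid_m, network_m, lam):
    """Log of an exponential transition density in |D^k - D_G^k|."""
    return -abs(euclid_m - network_m) / lam - math.log(lam)


def viterbi_match(candidates, euclid, network_dist, sigma=5.0, lam=10.0):
    """Return the most likely segment sequence for a trajectory.

    candidates[k]: {segment id: orthogonal distance of observation k, metres}
    euclid[k]: straight-line distance between observations k and k+1, metres
    network_dist(k, s_j, s_l): hypothetical callback giving the on-network
        distance between the projections of observations k and k+1
    """
    delta = {s: log_emission(d, sigma) for s, d in candidates[0].items()}
    backpointers = []
    for k in range(1, len(candidates)):
        new_delta, pointers = {}, {}
        for s_l, d in candidates[k].items():
            # Best predecessor for segment s_l at step k.
            s_best, score = max(
                ((s_j, prev + log_transition(euclid[k - 1],
                                             network_dist(k - 1, s_j, s_l), lam))
                 for s_j, prev in delta.items()),
                key=lambda item: item[1],
            )
            new_delta[s_l] = score + log_emission(d, sigma)
            pointers[s_l] = s_best
        delta = new_delta
        backpointers.append(pointers)
    # Backtrack from the best final state.
    route = [max(delta, key=delta.get)]
    for pointers in reversed(backpointers):
        route.append(pointers[route[-1]])
    return route[::-1]


# Toy case: two parallel roads "A" and "B"; the middle fix is noisy and
# lies slightly closer to "B", but switching roads needs a 200 m detour.
candidates = [{"A": 3.0, "B": 8.0}, {"A": 6.0, "B": 5.0}, {"A": 2.0, "B": 9.0}]
euclid = [20.0, 20.0]


def network_dist(k, s_j, s_l):
    return 20.0 if s_j == s_l else 200.0


print(viterbi_match(candidates, euclid, network_dist))  # ['A', 'A', 'A']
```

Matching each fix to its nearest segment would give A, B, A here. The transition term penalises the implausible detour and keeps the route on road A. Working in the log domain avoids numerical underflow on long trajectories.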

Table 2. Summary of techniques for the map-matching problem.

Technique	Description	Advantages	Disadvantages	References
Geometric and topological methods	Deterministic algorithms based on explicit predefined rules.	- a foundation for advanced techniques, - use of network topology improves matching	- sensitivity to GPS noise and map errors, - low effectiveness in urban environments	[12,13,14]
Particle Filters	A sequential Monte Carlo method; particles represent hypotheses about the vehicle’s state.	- effective for non-linear problems, - high accuracy with frequent sampling	- sensitivity to low-quality data	[16,17,18]
Fuzzy Logic	Utilises fuzzy logic (FIS) and linguistic rules to evaluate candidate roads.	- formal representation and processing of uncertainty, - improved accuracy using topology and history	- complexity of defining rules and membership functions	[20,21,22]
Dempster–Shafer Theory	A generalisation of Bayesian theory; combines evidence from multiple sources to evaluate hypotheses.	- flexible modelling of uncertainty, - effective management of multiple hypotheses	- high computational complexity	[24,25]
Hidden Markov Model	Estimates a sequence of roads based on GPS points (emission and transition probabilities).	- a popular and effective method, - online versions are available	- the offline version requires the entire trajectory, - online versions may yield sub-optimal results	[5,6,7,26,28,29]
seq2seq	Transforms GPS sequences into road sequences; learns spatio-temporal dependencies.	- high accuracy (better than HMM), - robustness to noise and sparse sampling	- requires large training datasets	[36,37,38,42]
CNN + RNN	Combines CNN (spatial analysis) with RNN (temporal analysis) for route prediction.	- significant accuracy improvement under challenging conditions, - comprehensive data analysis	- high model complexity, - requires large datasets	[41]
GNN	Models that incorporate the graph structure of the road network in the learning process.	- better modelling of road network topology	- high computational complexity	[39]
Reinforcement Learning	An agent learns route selection through interaction and a reward system.	- potentially high effectiveness after training	- requires lengthy training, - complex implementation	[40]
Other ML algorithms (kNN)	Standard ML algorithms for road segment identification.	- simplicity of implementation (compared to DL models)	- less effective with complex patterns	[34,35]

Table 3. Summary of techniques for route prediction. The Test Results column contains the results of experiments published by the authors in the cited works.

Technique	Description	Key Feature	Data	Test Results	References
HMM	Predicts the trip destination (HMM’s hidden state) based on observations of key road infrastructure objects.	Trip destination is the HMM hidden state; prediction is without digital maps.	User’s historical GPS data with extracted road infrastructure objects.	Prediction accuracy improved as the trip progressed, from 36.1% (at 25% of trip) to 94.6% (at 90%).	[44]
HMM + HDBSCAN	Clusters road segments into popular routes using HDBSCAN, which become the hidden states of an HMM for route prediction.	Defining HMM states through density-based clustering (HDBSCAN) eliminates the cold start problem.	Aggregated data from many vehicles and the current, partial trip as an observation sequence.	Using aggregated vehicle data, the model’s Hit@3 of 0.895 greatly outperformed the baseline HMM’s 0.163 on partially completed (25%) trips.	[45]
LSTM/GRU/SAE	A comparative analysis of LSTM, GRU, and SAE models for predicting a vehicle’s future position and speed.	Evaluation of different neural network architectures; LSTM achieved the highest accuracy.	Historical vehicle trajectory data, filtered using a Savitzky–Golay filter.	For highway velocity prediction, the LSTM model’s RMSE of 1.69 was significantly better than GRU (3.35) and SAEs (4.66).	[56]
Graph CNN-LSTM	A hybrid architecture combining Graph CNN (spatial analysis) and LSTM (temporal analysis) for traffic flow forecasting.	Fusion of Graph CNN and LSTM models to simultaneously model network topology and temporal dynamics.	GPS data of limited density; dimensionality reduction was applied by selecting key segments.	For 5 min speed prediction on ride-hailing data, the model’s rush-hour RMSE of 4.08 km/h beat the baseline LSTM (5.12), SVM (5.24), and k-NN (5.51).	[50]
NetTraj (seq2seq + Attention Mechanism)	A seq2seq (LSTM) model with attention mechanisms that predicts sequences of movement directions instead of road segments.	Direction-based trajectory representation to reduce the problem’s dimensionality.	Trajectories represented as sequences of intersections and discrete movement directions.	Predicting five-segment taxi trajectories, NetTraj achieved a higher AMR of 65.8%, compared to the best baseline (62.5%).	[57]
KGCN-LSTM	A hybrid of a KGCN (analysis of road network context, e.g., POIs) and an LSTM (analysis of trajectory sequence).	Incorporation of contextual knowledge about the surroundings (POIs) into the model.	A sequence of historical vehicle trajectory points and contextual knowledge about infrastructure (POIs).	By integrating POI data, the KGCN-LSTM model’s RMSE of 0.0184 showed higher robustness than the baselines (0.0200–0.0222).	[58]
Transformer	A Transformer model that considers situational context (e.g., speed cameras) and corrects the trajectory to align with the map.	Trajectory correction (link projection) to ensure consistency with the road network topology.	Movement sequences and information about road infrastructure objects.	For long-term bus trajectory prediction, the proposed SATLP model cut the RMSE to 0.0701 m, a 65.7% improvement over the vanilla Transformer (0.2046 m).	[59]
CNN-LSTM	A hybrid of CNN (spatial features from a grid map) and LSTM (temporal dependencies) for motion prediction.	Input for the CNN as a grid map from GPS data; training with a greedy strategy.	GPS data from urban trips, converted into a grid-based map.	Using a greedy training strategy, the improved model lowered the RMSE to 11.15 compared to 14.15 for the standard CNN-LSTM.	[51]
RNN	Two RNN models: CSSRNN explicitly incorporates network topology, while LPIRNN uses multi-task learning.	CSSRNN: output layer masking to incorporate topology. LPIRNN: a multi-task learning approach.	Traffic trajectories on a road network.	By incorporating road network topology, the proposed models (CSSRNN/LPIRNN) achieved an accuracy (ACC) of 94.1%, outperforming the standard RNN (93.6%).	[60]
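
The Test Results column relies on a small set of standard metrics. As a reading aid, the Python sketch below gives the usual definitions of RMSE, relative improvement, and Hit@k. These are assumed textbook definitions, and the cited works may compute them differently. The only value from the table that is reproduced is the 65.7% improvement reported for [59]; the other inputs are made-up toy data.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def relative_improvement(baseline, proposed):
    """Fractional error reduction of the proposed model over the baseline."""
    return (baseline - proposed) / baseline


def hit_at_k(ranked_predictions, truths, k=3):
    """Share of cases whose true route appears among the top-k predictions."""
    hits = sum(truth in ranked[:k] for ranked, truth in zip(ranked_predictions, truths))
    return hits / len(truths)


# RMSE values reported for the vanilla Transformer and SATLP in Table 3.
print(round(100 * relative_improvement(0.2046, 0.0701), 1))  # 65.7

# Toy data, not taken from any cited study.
print(hit_at_k([["r1", "r2", "r3", "r4"], ["r2", "r1", "r5", "r3"]], ["r3", "r4"]))  # 0.5
```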

Table 4. Summary of selected applications using data fusion for enhanced vehicle positioning.

Sensors	Core Problem	Methodology	Test Results	Test Environment	References
GNSS, LiDAR, Camera, IMU	GNSS unreliability in urban canyons; SLAM drift.	Tightly-coupled LiDAR/Vision/IMU with loosely-coupled GNSS via factor graph optimisation.	93% RMSE reduction vs. GNSS-only in urban tests (9.593 m RMSE).	Rural and Urban	[66]
Camera, GNSS, INS, DMI, HD Map	Commercial HD map offsets/errors; INS drift in GNSS-denied areas.	Online HD map offset calibration using GNSS. Tightly-coupled EKF for INS/DMI/Lane-observations.	Centimeter-level lateral accuracy during 200 s GNSS outage in a tunnel.	Urban, Tunnel	[61]
LiDAR, GNSS, INS, Vector HD Map	High computational cost of point-cloud maps; INS drift.	PF matching LiDAR scans to simulated scans from a lightweight vector map.	>75% position improvement vs. standard GNSS/INS.	Simulated GNSS-challenging	[65]
LiDAR, Camera	Dynamic objects corrupting LiDAR map-matching.	YOLOv4 on camera data to detect and remove dynamic objects from LiDAR point clouds before NDT matching.	Urban RMSE reduced from 1.3874 m to 1.1217 m.	Open and Urban	[67]
LiDAR, IMU, Odometer, 3D Digital Map	LiDAR odometry drift; GNSS-denial.	EKF fusion of LiDAR-to-map registration (with deskewing and multi-scan aggregation) and onboard motion sensors.	Avg. RMSE: 20 cm horizontal, 13 cm vertical.	Urban, Indoor Parking	[64]
Radar, INS, OSM	GNSS outages.	Fusing radar-based ego-motion with INS via EKF; correcting position with OSM matching.	<1% position error of distance traveled during 3-min GNSS outage.	Urban	[63]
Camera, Low-cost GPS, Digital Map	High cost of LiDAR; low accuracy of cheap GPS.	Combining camera-based relative lane positioning with rough GPS position matched to a reference map.	Reduced mean deviation from lane center from 49.3 cm to 29.5 cm.	Test Track	[62]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
