1. Introduction
Trajectory data hold rich information about spatio-temporal behavior [1]. Analyzing trajectory data is fundamental to understanding urban mobility and human activity [2]. The underlying patterns, or topics, within trajectory data are not static. Trajectory topics often correlate with temporal cycles. For instance, commute flows differ significantly between weekdays and non-workdays. An understanding of dynamic changes is essential to applications in traffic management, public service optimization, and urban planning. A clear need, therefore, exists for methods to effectively model and interpret trajectory topic evolution.
Modeling trajectories to effectively capture features is a fundamental step for analyzing dynamic topics. Existing modeling approaches often use representation learning to generate high-dimensional embeddings for trajectory sequences. RNN-based methods [3] employ a recurrent structure to process sequences. However, for long trajectories, the recurrent structure struggles to propagate information from the start to the end of a sequence [4,5,6,7]. Transformer-based methods [8] offer an alternative but require supervised training for specific tasks. Consequently, the resulting trajectory representations become task-dependent [9,10,11,12,13]. A new trajectory embedding strategy is required to produce effective semantic representations that preserve spatial context.
Many existing topic modeling methods lack flexibility for analyzing topic dynamics across different temporal scales. These methods often rely on fixed time windows for analysis. An analysis based on fixed windows limits the scope of inquiry. A predefined interval may be too coarse to capture short-term events or too fine to reveal long-term trends. Such limitations prevent a comprehensive, multi-scale understanding of movement dynamics. The dynamic topic model (DTM) [14], a dynamic extension of Latent Dirichlet Allocation (LDA) [15], exemplifies these challenges. While DTM allows topics to evolve over time, the model requires a fixed number of time slices to be set before analysis. This requirement reduces flexibility, especially for temporally uneven data. Any adjustment to the modeling goal in DTM requires a full retraining of the model with a new time slice configuration. Such retraining is computationally expensive. DTM also requires pre-specifying the number of topics and offers limited semantic modeling capabilities. A clear need, therefore, exists for a more flexible topic modeling approach. Such an approach should permit dynamic adjustment of the temporal scope to match different behavioral patterns and analytical goals.
Visual analytics systems use interactive interfaces to help users explore and interpret dynamic trajectory topics [16]. Existing visualization tools for trajectory analysis often focus on presenting statistical information [17,18,19]. A key limitation is the lack of support for exploring the various components of a topic to understand its underlying meaning. Effective visual analysis of trajectory topics thus requires intuitive components designed for interactively exploring multi-level semantics.
To address these challenges, this paper presents a new method for constructing dynamic topic models from trajectory data. The proposed method consists of three main components. First, the method introduces a new embedding strategy to capture complex trajectory semantics while preserving spatial context. This strategy uses a modified RoBERTa model and a word-level tokenizer to generate context-aware embeddings from Morton-coded trajectories. Second, the method employs BERTopic [20], a data-driven neural topic model, for a more flexible modeling of dynamic topics. BERTopic clusters the trajectory embeddings and then generates topic representations using a class-based TF-IDF procedure. Third, a visualization system centered on trajectory topic words and features is developed for the interactive exploration of dynamic movement topics. The system translates abstract model outputs into an intuitive interface, enabling the direct exploration and interpretation of evolving trajectory topics. The proposed method applies established text processing methods to the challenge of topic modeling for unstructured spatio-temporal data. The main contributions of this work are as follows:
(1) A domain-specific trajectory embedding method to effectively capture both spatial context and sequential patterns. This addresses the limitations of information decay in RNN-based approaches and task-dependent representations in standard Transformer models.
(2) A flexible dynamic topic modeling method using an adjustable time window to support multi-scale temporal analysis. This overcomes the constraint of fixed time slices found in traditional dynamic topic models.
(3) An interactive visualization system to translate abstract model outputs into an intuitive interface for the direct exploration and interpretation of evolving trajectory topics.
The rest of the paper is organized as follows. Section 2 reviews related work on modeling dynamic topics of trajectories. Section 3 details the proposed method for dynamic topic modeling of trajectories. Section 4 reports extensive experiments. Section 5 presents a discussion and outlines future work. Finally, Section 6 concludes the paper.
2. Related Work
In this section, we briefly introduce related works. Specifically, we introduce works on the dynamic topics of trajectories first, and then, we introduce works of modeling trajectories. Finally, we detail related work about trajectory visualization.
Methods for modeling dynamic trajectory topics are generally classified into static and dynamic approaches. Static topic models such as LDA [15], NMF [21], and BTM [22] can identify global movement patterns across an entire dataset [23,24,25,26,27,28]. A primary limitation of static models is the inability to capture temporal topic dynamics. Dynamic extensions like Dynamic NMF [29] and dynamic topic models (DTMs) [14] were developed to explicitly model topic evolution. However, a common limitation is the requirement to fix the number and duration of time slices prior to model training. Furthermore, traditional topic models including LDA and DTMs employ a bag-of-words assumption. This assumption ignores the sequential order of trajectory points, preventing the capture of complex spatial movement semantics. BERTopic [20] presents a modern deep learning paradigm for this problem. The method adapts text modeling techniques to capture both sequential patterns and spatial context in trajectory data. Moreover, BERTopic enhances analytical flexibility through a modeling process that supports adjustable time windows. Unlike traditional DTMs, which require computationally expensive retraining whenever time slices are altered, this approach allows for dynamic window adjustments during the inference phase without retraining, thereby offering superior computational efficiency for multi-scale analysis. Representation learning is a key component within BERTopic for capturing sequence features. The following section details relevant methods for trajectory sequence modeling.
Effective trajectory modeling forms the basis for dynamic topic analysis. Existing methods for learning trajectory representations [30] are generally classified into two categories: RNN-based [3] and Transformer-based [8] approaches. RNN-based methods [4,5,6,7] use a recurrent structure to process sequential information. For long trajectory sequences, however, this structure struggles to propagate information from the start to the end of a sequence, resulting in information decay. Standard Transformer-based methods [9,10,11,12,13] typically require supervised training on downstream tasks. The resulting trajectory representations are, therefore, task-specific. Consequently, applying these methods to trajectory topic modeling introduces the risks of information decay and task-dependent feature representations. To address these limitations, the proposed method leverages RoBERTa [31]. The approach combines RoBERTa with structural encoding and a word-level tokenizer [32] to learn contextual trajectory embeddings while preserving the semantic independence of individual locations.
Converting abstract trajectory topics into interpretable insights depends on effective visualization. Visual analytics is a common approach for exploring features from trajectory data [17,18,19]. Many existing methods use visual components to analyze traffic trajectories. For instance, He et al. [33] developed a visual filter, a form of coordination between views, to explore the spatio-temporal evolution of topics. Gao et al. [34] employed visualization to guide the trajectory clustering process. However, a comprehensive interpretation of dynamic topics requires integrating a topic’s multi-level semantics, from individual topic words and trajectory segments to the overall theme. Existing approaches often lack this deep integration needed to explain dynamic topic behavior. The proposed visualization system is designed based on the core attributes of trajectories. The approach presents trajectory topics by centering on two semantic levels: topic words and trajectory segments. A system of coordinated views displays topic features across different time slices. This design yields clear visual representations of topic characteristics and allows analysts to explore dynamic topic evolution from macro- to micro-scales.
3. Methodology
The primary research problem is to capture dynamic semantics from unstructured trajectory data. The main goal is to build a flexible framework for analyzing topic evolution. Based on this goal, we introduce the details of our dynamic topic modeling method for trajectory data in this section.
Figure 1 illustrates the pipeline of the proposed method, which consists of three steps. First, trajectory embeddings are generated with a domain-specific embedding strategy. Second, the adjustable topic modeling method produces dynamic trajectory topics from these embeddings. Third, a visualization system supports the interactive analysis of the details of the dynamic trajectory topics.
3.1. Trajectory Embedding with Spatial-Context Feature
3.1.1. Morton Encoding for Spatial Discretization
A trajectory is a series of coordinates in sequential order, so the trajectory data structure needs to be simplified before modeling. The trajectory coordinates are encoded by Morton codes [35]. The Z-order curve of Morton codes is defined as

$$M(x, y) = \sum_{i=0}^{l-1} \left( x_i \cdot 2^{2i} + y_i \cdot 2^{2i+1} \right)$$

where $x$ and $y$ are the longitude and latitude coordinates in binary format, $x_i$ and $y_i$ represent the bit at position $i$, and $l$ is the number of binary digits of the coordinates. Morton encoding simplifies the structure of trajectory data by mapping a multi-dimensional space to a single-dimensional space. It follows the Z-order curve to fill the space, which retains the spatial information. The length of a Morton code remains 12 bits with an accuracy of 10 m. Additionally, Morton encoding preserves spatial locality: nearby GPS coordinates are encoded into nearby values in the code space, which improves the calculation efficiency. We convert all trajectory coordinates, expressed as longitude and latitude, into string words with Morton encoding, which bridges the trajectory sequence and the subsequent RoBERTa model.
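The bit interleaving behind the Z-order curve can be illustrated with a minimal sketch. It assumes that bit $i$ of the longitude index is placed at position $2i$ and bit $i$ of the latitude index at position $2i+1$; the helper names (`quantize`, `morton_encode`, `trajectory_to_words`), the bounding box, and the decimal string format are hypothetical choices for illustration and are not taken from the implementation used in this work.

```python
def quantize(value, lo, hi, bits):
    """Map a coordinate in [lo, hi] to an integer cell index in [0, 2**bits - 1]."""
    cells = (1 << bits) - 1
    ratio = (value - lo) / (hi - lo)
    return max(0, min(cells, int(ratio * cells)))


def morton_encode(x, y, bits):
    """Interleave the bits of x and y (assumed order: x_i -> bit 2i, y_i -> bit 2i + 1)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def morton_decode(code, bits):
    """Invert morton_encode by de-interleaving the bits."""
    x = y = 0
    for i in range(bits):
        x |= ((code >> (2 * i)) & 1) << i
        y |= ((code >> (2 * i + 1)) & 1) << i
    return x, y


def trajectory_to_words(points, bbox, bits=12):
    """Turn (lon, lat) points into a whitespace-separated string of Morton words."""
    lon_min, lat_min, lon_max, lat_max = bbox
    words = []
    for lon, lat in points:
        x = quantize(lon, lon_min, lon_max, bits)
        y = quantize(lat, lat_min, lat_max, bits)
        words.append(str(morton_encode(x, y, bits)))
    return " ".join(words)
```

Each GPS point becomes one word, so a trajectory reads like a sentence that a language model tokenizer can consume directly.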
3.1.2. Tokenizer and Vocabulary Design
The tokenizer converts the trajectory sequence into a discrete token sequence. The WordLevel tokenizer is suitable for the task of trajectory embedding because it preserves the integrity of Morton identifiers and domain terms. Additionally, the WordLevel tokenizer avoids the semantic damage caused by sub-word splitting and preserves the spatial encoding granularity.
The training process of the WordLevel tokenizer is presented in Algorithm 1. Let the trajectory be represented as a sequence $T = (m_1, m_2, \ldots, m_n)$, where each $m_i$ denotes a Morton code. Algorithm 1 aims to construct a WordLevel tokenizer $\mathcal{T}$ that satisfies

$$\mathcal{T}(T) = (t_1, t_2, \ldots, t_{L_{\max}}), \quad t_j \in \{0, 1, \ldots, |V| - 1\}$$

where the maximum sequence length is represented as $L_{\max}$. The training pipeline includes three phases. First, we construct the vocabulary $V$ by combining the special tokens $S$ with the unique Morton code tokens $U$ extracted from all trajectories. Then, with the configurations of whitespace splitting at Morton code boundaries and the length limitation $L_{\max}$, the mapping function $f: V \to \{0, 1, \ldots, |V| - 1\}$ is constructed to initialize the WordLevel tokenizer $\mathcal{T}$.
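Algorithm 1 WordLevel tokenizer training.
Require: Trajectory set $\mathcal{D}$, Max trajectory length $L_{\max}$, Vocab size $N$
Ensure: Tokenizer $\mathcal{T}$
1: $U \leftarrow \emptyset$
2: for each $T \in \mathcal{D}$ do
3: $U \leftarrow U \cup \{m_i \mid m_i \in T\}$
4: end for
5: $S \leftarrow$ special tokens ▹ Fixed IDs 0–4
6: $V \leftarrow S \oplus U$ ▹ Preserve ordering
7: Build vocabulary mapping:
8: $f: V \to \{0, 1, \ldots, |V| - 1\}$, where $f(v_k) = k$ for $k = 0, 1, \ldots, |V| - 1$
9: Initialize $\mathcal{T}$ with: Tokenization function: $f$; Preprocessing: Whitespace splitting; Truncation/Padding: $L_{\max}$
10: return $\mathcal{T}$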
The WordLevel tokenizer is better adapted to trajectory data. Compared with the BPE tokenizer, the WordLevel tokenizer keeps each Morton code in the trajectory sequence as one complete token, and complete Morton code tokens maintain the local spatial pattern. In contrast, the BPE tokenizer splits a Morton code into sub-word fragments, which makes the trajectory pattern harder to learn. In the experimental section, we compare the modeling performance of two topic modeling approaches, one equipped with a BPE tokenizer and the other with a word-level tokenizer.
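The vocabulary construction and whole-word lookup can be sketched in plain Python as follows. This is an illustration of the idea rather than the tokenizer library used in this work: the special token names, the first-seen ordering of Morton words, and the function names `build_vocab` and `encode` are assumptions made for the example.

```python
# Assumed special token names occupying the fixed IDs 0-4.
SPECIAL_TOKENS = ["<s>", "<pad>", "</s>", "<unk>", "<mask>"]


def build_vocab(trajectories, special_tokens=SPECIAL_TOKENS):
    """Assign IDs to special tokens first, then to Morton words in first-seen order."""
    vocab = {tok: i for i, tok in enumerate(special_tokens)}
    for text in trajectories:
        for word in text.split():  # whitespace splitting at Morton code boundaries
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode(text, vocab, max_len):
    """Map each whole Morton word to its ID, then truncate or pad to max_len."""
    ids = [vocab.get(word, vocab["<unk>"]) for word in text.split()]
    ids = ids[:max_len]
    ids += [vocab["<pad>"]] * (max_len - len(ids))
    return ids
```

Because every Morton word maps to exactly one ID, no code is ever split into fragments, which is the property that distinguishes this scheme from sub-word tokenization.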
3.1.3. Domain-Specific RoBERTa Training
RoBERTa is a Transformer-based language model and an optimized iteration of BERT. RoBERTa benefits from the self-attention mechanism and generates a contextualized representation of words. To capture the spatial context of a trajectory, we train RoBERTa with the WordLevel tokenizer. We use a trajectory dataset to train RoBERTa and aggregate the outputs from RoBERTa by mean pooling to generate trajectory embeddings. Unlike general-purpose models, this retraining strategy allows the model to learn domain-specific features. The retrained RoBERTa model adapts to the vocabulary distribution of trajectory text.
The RoBERTa model is trained with the Masked Language Modeling (MLM) strategy. When training the RoBERTa model, we randomly mask a trajectory coordinate with a special token [MASK] and make the RoBERTa model predict the masked trajectory coordinate. The MLM training strategy makes the RoBERTa model learn the sequence pattern and does not need any annotation. Additionally, the task of mask prediction forces the model to explore the context of the trajectory and strengthens the understanding of the trajectory sequence. In addition, the random mask acts as trajectory noise during training with the MLM strategy, increasing the robustness of model encoding. For a trajectory token sequence $T = (m_1, m_2, \ldots, m_n)$, the MLM loss is defined as

$$\mathcal{L}_{\mathrm{MLM}} = -\sum_{i \in M} \log P(m_i \mid h_i)$$

where $M$ is the set of masked token indices, $m_i$ is the correct trajectory token, and $h_i$ denotes the output vector corresponding to the masked position.
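The masking step and the loss over masked positions can be sketched with the standard library only. The sketch assumes that the model outputs one vector of vocabulary logits per position and that the loss is the summed negative log-likelihood of the true tokens at the masked positions; training frameworks commonly average over the masked positions instead. The function names and the masking probability argument are hypothetical.

```python
import math
import random


def mask_tokens(ids, mask_id, prob, rng, special_ids=()):
    """Replace non-special tokens by mask_id with probability prob."""
    masked, positions = list(ids), []
    for i, tok in enumerate(ids):
        if tok not in special_ids and rng.random() < prob:
            masked[i] = mask_id
            positions.append(i)
    return masked, positions


def mlm_loss(logits, targets, positions):
    """Sum of -log softmax(logits[i])[targets[i]] over the masked positions."""
    total = 0.0
    for i in positions:
        row = logits[i]
        peak = max(row)  # subtract the maximum for numerical stability
        log_z = peak + math.log(sum(math.exp(v - peak) for v in row))
        total += log_z - row[targets[i]]
    return total
```

Only the masked positions contribute to the loss, so the model is rewarded for recovering a hidden Morton word from its neighbors in the trajectory.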
RoBERTa captures the spatial relationship between nearby trajectory coordinates and learns the semantics from the whole trajectory, which offers trajectory embeddings with rich semantic information. Additionally, the position encoding of RoBERTa can capture the order of trajectory coordinates, complementing the spatial proximity of Morton codes. In the Experiments Section, the topic modeling performance of two kinds of RoBERTa models, a RoBERTa model trained with trajectory data and a RoBERTa model fine-tuned with trajectory data, is compared.
3.2. Adjustable Dynamical Topic Modeling for Trajectory Data
To overcome the limitations of the bag-of-words model and fixed time windows, we construct a flexible method for the dynamic topic modeling of trajectories based on BERTopic. During the inference phase, the capability for dynamic time window adjustment expands the boundaries of trajectory topic modeling. The core phases of the adjustable dynamic topic modeling method include embedding, dimensionality reduction, clustering, and topic representation. Let $D = \{d_1, d_2, \ldots, d_N\}$ represent the document collection of trajectories. A document $d$ can be split into the set of trajectory sequences $\{s_1, s_2, \ldots, s_{n_d}\}$, where $n_d$ is the sentence count of document $d$. We detail the adjustable dynamic topic modeling method in the following subsections.
3.2.1. UMAP Dimensionality Reduction
Modeling topics directly on the raw trajectory embeddings from RoBERTa is hindered by their dimensionality: the high-dimensional embedding space introduces semantic sparsity and makes computation intractable. UMAP performs non-linear dimensionality reduction based on manifold learning, which makes it suitable for reducing the dimension of trajectory embeddings. UMAP can be divided into two main steps: it first learns the manifold structure in the high-dimensional data space and then constructs a low-dimensional representation of it. As a result, the inherent manifold learning preserves the local and global patterns of the trajectory embeddings.
3.2.2. HDBSCAN Density-Based Clustering
HDBSCAN performs hierarchical clustering on trajectory embeddings, yielding clusters of similar trajectory embeddings [36]. Let $z_i$ denote a trajectory embedding generated from UMAP reduction. HDBSCAN processes the trajectory embedding $z_i$ and preserves the principal topological structures within the embedding space. HDBSCAN generates the cluster labels $y_i \in \{-1, 0, 1, \ldots, K - 1\}$, where $-1$ denotes noise. HDBSCAN determines the number of clusters from the data, so this number does not have to be specified in advance. This reveals the core advantage of using HDBSCAN for trajectory topic clustering: it establishes membership relationships between trajectory-embedded documents and their corresponding clusters. Additionally, HDBSCAN detects semantically coherent trajectory topics by identifying core points to locate high-density regions and marking low-density areas as noise. HDBSCAN identifies arbitrarily shaped clusters of trajectory embeddings using density-reachable principles, effectively adapting to dynamically changing movement patterns.
3.2.3. CountVectorizer Frequency Quantization
CountVectorizer is an essential step in trajectory topic calculation. It links the trajectory clustering results with keywords. Given the trajectory documents $D$ and vocabulary $V$, CountVectorizer constructs a sparse matrix $X \in \mathbb{N}^{|D| \times |V|}$, where $X_{ij}$ denotes the frequency of word $w_j$ in document $d_i$.
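The counting step can be illustrated without any external library. The sketch below builds the document-by-word frequency matrix as one dictionary per document, which mirrors a sparse row layout; the function name `count_matrix` and the alphabetical vocabulary order are assumptions for illustration, and a library vectorizer may apply additional tokenization rules.

```python
from collections import Counter


def count_matrix(documents):
    """Return (vocabulary, rows) where rows[i][j] is the count of word j in document i."""
    vocab = sorted({word for doc in documents for word in doc.split()})
    index = {word: j for j, word in enumerate(vocab)}
    rows = []
    for doc in documents:
        counts = Counter(doc.split())
        # Store only non-zero entries, as a sparse matrix would.
        rows.append({index[word]: n for word, n in counts.items()})
    return vocab, rows
```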
3.2.4. c-TF-IDF Topic Representation
c-TF-IDF calculates the weight of trajectory words at the cluster level. c-TF-IDF takes as input the clusters $\{C_1, C_2, \ldots, C_K\}$ from the output of HDBSCAN, where each cluster $C_k$ is a document collection. Then, term frequency $\mathrm{tf}_{w,k}$ and inverse document frequency $\mathrm{idf}_{w}$ for the trajectory words are computed, yielding the importance score $W_{w,k}$ for word $w$ in the cluster:

$$W_{w,k} = \mathrm{tf}_{w,k} \cdot \mathrm{idf}_{w}$$

c-TF-IDF achieves semantic consistency and enhances discriminability through $W_{w,k}$. For trajectory embeddings, the method applies normalization to eliminate the effect of length variation, enabling fair cross-topic comparisons. Additionally, c-TF-IDF employs $\mathrm{idf}_{w}$ to suppress high-frequency geographical noise.
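A minimal sketch of the cluster-level weighting follows. It assumes the formulation used by BERTopic, in which the term frequency is normalized by the number of words in the cluster and the inverse frequency is $\log(1 + A / f_w)$, with $A$ the average number of words per cluster and $f_w$ the frequency of word $w$ across all clusters. The function name `c_tf_idf` and the dictionary layout are hypothetical.

```python
import math
from collections import Counter


def c_tf_idf(cluster_docs):
    """cluster_docs: {cluster_id: [document strings]} -> {cluster_id: {word: score}}."""
    # Treat all documents of a cluster as one concatenated class document.
    counts = {c: Counter(w for doc in docs for w in doc.split())
              for c, docs in cluster_docs.items()}
    total = Counter()
    for counter in counts.values():
        total.update(counter)
    avg_words = sum(total.values()) / len(counts)  # A: average words per cluster
    scores = {}
    for c, counter in counts.items():
        size = sum(counter.values())
        scores[c] = {w: (n / size) * math.log(1 + avg_words / total[w])
                     for w, n in counter.items()}
    return scores
```

Words that are frequent in one cluster but rare elsewhere receive the highest scores, while words shared by all clusters are down-weighted by the logarithmic term.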
3.2.5. KeyBERTInspired Contextual Topic Representation
KeyBERTInspired operates as an essential plug-in and a re-ranker for c-TF-IDF results to enhance the consistency of trajectory semantics. It ensures that the selected topic words closely represent the cluster centroid. Let $D_t$ denote the representative documents of the $t$-th topic and $K_t = \{k_{t,1}, k_{t,2}, \ldots, k_{t,n}\}$ be candidate keywords, where $k_{t,j}$ is the $j$-th keyword in the $t$-th topic. KeyBERTInspired conducts the semantic ranking of candidate keywords $K_t$ by computing the spatial similarity between representative documents $D_t$ and keyword embeddings. Let $E(\cdot)$ denote the encoder from RoBERTa; the centroid of the trajectory topic is calculated with $\mu_t = \frac{1}{|D_t|} \sum_{d \in D_t} E(d)$. Subsequently, the cosine similarity between keyword embeddings and the centroid is computed with

$$s_{t,j} = \frac{e_{t,j} \cdot \mu_t}{\lVert e_{t,j} \rVert \, \lVert \mu_t \rVert}$$

where $e_{t,j} = E(k_{t,j})$. KeyBERTInspired selects keywords by maximizing the similarity scores $s_{t,j}$, enforcing directional alignment between trajectory words and the trajectory topic. The collaboration between KeyBERTInspired and c-TF-IDF improves trajectory topic modeling while maintaining computational efficiency and semantic precision.
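The re-ranking step can be sketched as follows, assuming that the embeddings of the representative documents and of the candidate keywords have already been computed by the encoder. The function names and the plain-list vector format are hypothetical choices for illustration.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of equally sized vectors (the topic centroid)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rerank_keywords(doc_embeddings, keyword_embeddings, top_n):
    """Rank candidate keywords by cosine similarity to the topic centroid."""
    centroid = mean_vector(doc_embeddings)
    scored = [(word, cosine(vec, centroid))
              for word, vec in keyword_embeddings.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]
```

Because cosine similarity ignores vector length, a keyword is favored when its embedding points in the same direction as the centroid, regardless of how often the word occurs.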
3.2.6. Dynamic Time-Window Parameterization
Unlike traditional approaches, our topic modeling method dynamically adjusts the temporal range for analysis, so that time windows can be matched to different behavioral patterns. The time window $\mathcal{W}(t_s, \Delta t)$ is defined as

$$\mathcal{W}(t_s, \Delta t) = \{\, d_i \mid t_s \le t_i < t_s + \Delta t \,\}$$

where $d_i$ represents a trajectory document, $t_i$ is its associated time, $\Delta t$ denotes the size of the time window, and $t_s$ is the start time of the window. This formula defines a time window $\mathcal{W}$ containing all event points $d_i$ meeting specific temporal conditions. Specifically, each $t_i$ must be included in $[t_s, t_s + \Delta t)$, where $t_s$ denotes the start time and $\Delta t$ the interval duration. Adjustable time windows enable multi-scale temporal analysis, balancing short-term events with long-term trends through flexible scaling.
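Selecting the documents of a window can be sketched as a simple filter over timestamped documents. The sketch assumes a half-open interval, so that consecutive windows do not share documents; the function names and the `(document, timestamp)` pair layout are hypothetical.

```python
from datetime import datetime, timedelta


def time_window(documents, start, size):
    """Return the documents whose timestamp lies in [start, start + size)."""
    end = start + size
    return [doc for doc, ts in documents if start <= ts < end]


def sliding_windows(documents, start, size, count):
    """Split the timeline into `count` consecutive windows of equal size."""
    return [time_window(documents, start + k * size, size) for k in range(count)]
```

Changing `size` from one day to twelve hours only changes which documents are grouped together; the embeddings and cluster assignments computed earlier can be reused as they are.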
3.3. Interactive Visualization for Trajectory Dynamic Topic
To provide an intuitive dynamic analysis of trajectory topics, we develop an interactive visualization system.
Figure 2 illustrates the user interface of the proposed system, which presents the dynamics of trajectory topics in both spatial and temporal dimensions. Specifically, the system consists of three main components: a trajectory topic map, a topic frequency trend view, and a topic word evolution view. We describe each component in detail below.
3.3.1. Trajectory Topic Map
The interactive trajectory map view is designed to support the geospatial exploration and analysis of the semantic information embedded in trajectory topics. As illustrated in Figure 2A, this component visualizes the spatial distribution of trajectory topics. Specifically, after a user selects topics from a target time interval using the control panel on the left of the map view, the map view renders the corresponding GPS sequences onto the basemap. The GPS sequences represent the most frequent trajectories for the selected time interval and topic. Each sequence is rendered as a colored line, and the line color identifies the trajectory topic. To avoid visual clutter, our design follows Tufte’s principle of maximizing the data–ink ratio. Consequently, we use a grayscale basemap to ensure that the colored trajectories stand out. For visual representation, each trajectory topic is assigned a distinct color, creating a clear visual hierarchy. Furthermore, the topic frequency, a quantitative value, is mapped to a sequential color scale within the same hue, where higher topic frequency corresponds to a darker and more saturated color. The map supports standard interactions such as panning, zooming, and filtering to facilitate interactive exploration.
3.3.2. Topic Frequency Trend
To complement the spatial analysis of trajectory topics, we designed a topic frequency trend chart for revealing the temporal dynamics of different trajectory topics. This chart serves as a supplement to the map view, providing users with a linked component for coordinated spatio-temporal analysis. To support the goals of identifying periodic patterns and comparing temporal dynamics, the chart’s visual encoding is organized along the time dimension. Specifically, we compute the frequency of each topic over time to generate multiple time-series plots for topic frequency. Figure 2B displays the topic frequency trend as a line chart, where the x-axis represents time and the y-axis represents the topic frequency. Furthermore, to better illustrate the trend of frequency changes over time, we compute and plot the average changes in topic frequency, as shown in Figure 2C. To ensure consistency across views, the color encoding is kept consistent with that of the map view. Finally, to address the visual challenges of displaying multiple trend lines, we provide interactive features such as highlighting on mouse hover and displaying details on demand, which help users focus on information of interest.
3.3.3. Topic Word Evolution
To refine the exploration granularity from the topic level to the individual topic word level, a topic word evolution view is proposed. The view supports analysis of the evolutionary trends of topic words over time. Specifically, the component is a composite view consisting of a temporal heatmap and an aggregated bar chart. These two coordinated views reveal both the overall dynamics and the detailed composition of topic word frequencies. Figure 2D,E present the evolution for all topic words, while Figure 2F,G illustrate the evolution for a single topic word. As shown in Figure 2D,F, the temporal frequency heatmap displays the frequency of each word within every time interval at fine granularity. Frequency values are encoded using a color channel. The heatmap is intended for identifying periodic and trend-based patterns and discerning co-occurrence relationships among trajectory words. Figure 2E,G show the aggregated bar chart. The x-axis of the aggregated bar chart in Figure 2E is aligned with the x-axis of the heatmap in Figure 2D, with both axes representing topic words. The length of each bar in the aggregated bar chart encodes the total frequency of a topic word summed across all time intervals. The aggregated bar chart provides an overview of the frequency intensity for each topic word. The temporal heatmap and the aggregated bar chart are tightly coupled through strict alignment and a shared x-axis. Such coupling enables efficient visual correlation analysis. Additionally, the two views implement a classic focus and context design. The temporal frequency heatmap serves as the high-resolution focus view, displaying details of interest, while the underlying aggregated bar chart acts as a summarized context view. A focus and context design allows for a balanced exploration between microscopic details and macroscopic trends.
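The data behind the two coordinated views can be sketched as one aggregation pass. The sketch assumes that each record pairs a time interval label with the topic words observed in that interval; the function name `word_time_frequencies` and the record layout are hypothetical and do not describe the system's actual data model.

```python
from collections import Counter, defaultdict


def word_time_frequencies(records):
    """records: iterable of (interval_label, [topic words]).

    Returns (heatmap, totals): heatmap[interval][word] is the per-interval
    frequency shown in the heatmap, and totals[word] is the frequency summed
    across all intervals, which sets the bar length.
    """
    heatmap = defaultdict(Counter)
    for interval, words in records:
        heatmap[interval].update(words)
    totals = Counter()
    for counter in heatmap.values():
        totals.update(counter)
    return dict(heatmap), totals
```

Since both structures are keyed by the same topic words, the heatmap columns and the bars can share one axis ordering.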
5. Discussion
The proposed method holds significant value for the analysis of trajectory data. By combining a domain-specific language model with a flexible topic modeling method, this work provides a powerful analytical tool for discovering dynamic movement patterns from raw GPS sequences. The method has substantial potential impact across multiple domains. Urban planners and traffic engineers can use the system to understand the evolution of commute patterns and identify traffic characteristics. In tourism and commerce, the approach enables the identification of changing visitor flows and the analysis of activity hotspots around commercial centers. Furthermore, the integration of such analytics with IoT platforms could further transform automation in these sectors [40].
Furthermore, the interactive visualization system allows stakeholders without extensive data science expertise to perform complex spatio-temporal analysis, bridging the gap between advanced data mining and practical decision making. The unsupervised nature of the method also reduces dependency on manually labeled data, increasing the applicability of the method to diverse trajectory datasets.
While the proposed method shows strong performance, several areas offer opportunities for improvement. First, the current trajectory representation relies solely on geospatial coordinates encoded via Morton codes. Future work could incorporate richer contextual information and multimodal data, such as timestamps or external factors, to obtain more specific topic representations and to add greater semantic depth to the generated topics. Second, the evaluation is conducted on a single large-scale taxi trajectory dataset from Porto. Validating the method on diverse trajectory datasets is necessary to ensure generalization. The characteristics of taxi movement, being demand-driven, may differ from other forms of mobility like public transportation, pedestrian movement, or logistics. Extending and validating the method on datasets from these other domains remains an important direction for future research. Finally, training the RoBERTa model is computationally intensive. Future work will investigate strategies like model pruning or quantization to reduce training costs. Additionally, we plan to use optimization algorithms, such as MFO, SOA, or HBA, to tune hyperparameters for better efficiency. Future work should, therefore, consider the trade-off between performance and efficiency for practical implementations.