FedTULGAC: A Federated Learning Method for Trajectory User Linking Based on Graph Attention and Clustering
Abstract
1. Introduction
- These traditional deep learning-based trajectory user linking methods primarily focus on capturing short-term temporal dependencies in user movement trajectories. This hinders their ability to accurately model the evolution of user behavior over extended periods and to capture complex movement patterns.
- We propose a novel federated learning-based TUL method named FedTULGAC to address the challenge of privacy preservation in trajectory user linking. Our method enables distributed data processing by exchanging model parameters instead of sharing raw trajectory data. Under the coordination of a central server, the FedTULGAC framework enables multiple clients to collaboratively optimize a global model.
- We design the GATULER model and the FL-DBSCAN algorithm and integrate them into FedTULGAC. The GATULER model uses a dynamic graph attention mechanism to capture the complex features of trajectory data. The FL-DBSCAN algorithm optimizes the aggregation of model parameters across clients to mitigate the impact of the non-IID characteristics and spatiotemporal distribution heterogeneity of client-side data.
- We conducted extensive experiments on the synthesized Gowalla and Foursquare datasets. The results indicate that the FedTULGAC method achieves the highest prediction accuracy compared to baseline models while safeguarding user privacy, and it maintains the training model’s stability and robustness.
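The parameter-exchange workflow described above can be illustrated with a minimal FedAvg-style round. This is a sketch under stated assumptions: `local_update` is a hypothetical stand-in for client-side training, and the plain averaging on the server stands in for the cluster-aware FL-DBSCAN aggregation introduced later.

```python
import numpy as np

def local_update(weights, data, lr=0.01):
    # Placeholder for client-side training: nudge the weights toward the
    # client's data mean (stands in for SGD on the local trajectory set).
    return weights + lr * (data.mean(axis=0) - weights)

def federated_round(global_w, client_data, lr=0.01):
    # Each client trains locally and returns only model parameters;
    # raw trajectory data never leaves the client.
    updates = [local_update(global_w, d, lr) for d in client_data]
    # The server aggregates parameters (plain FedAvg here; FedTULGAC
    # replaces this step with cluster-aware aggregation).
    return np.mean(updates, axis=0)

rng = np.random.default_rng(0)
global_w = np.zeros(4)
clients = [rng.normal(size=(20, 4)) for _ in range(5)]
for _ in range(3):
    global_w = federated_round(global_w, clients)
```

The key property is that only the parameter vectors cross the network, while each client's check-in data stays local.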
2. Preliminary Preparation
2.1. Trajectory-User Linking Problem
2.2. Federated Learning of Trajectory User Linking
3. Methods
3.1. Model: GATULER
- Trajectory Graph Construction: Two types of trajectory graphs are constructed: an undirected graph, which represents interactions or movement relationships between users, and a directed graph, which further captures the directionality of those interactions or movements. These two graphs are then fused to obtain the final trajectory graph, which incorporates both geographical proximity and temporal movement order.
- Node Graph Embedding and Graph Attention Computation: The trajectory graph is used to construct initial embeddings for users and locations through their respective embedding layers. The resulting user and location embeddings are subsequently transformed into one-dimensional vectors using the "Flatten" operation.
- Trajectory Vector Representation: For a given user trajectory, this phase first retrieves the embeddings corresponding to each check-in point in the trajectory and concatenates them, node by node, to form the trajectory's vector representation.
- Trajectory-User Linking: The model leverages the features extracted and integrated in the preceding stages to predict the association probability between each user and a given trajectory.
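The four stages above can be sketched end-to-end as follows. This is a simplified illustration, not the exact GATULER architecture: the equal-weight graph fusion, the single softmax-attention aggregation step, and the final linear classifier are all assumptions standing in for the paper's dynamic graph attention layers.

```python
import numpy as np

rng = np.random.default_rng(1)
n_locs, n_users, dim = 6, 3, 4

# 1) Trajectory graph construction: fuse an undirected proximity graph
#    with a directed transition graph (equal-weight sum assumed here).
A_und = rng.random((n_locs, n_locs))
A_und = (A_und + A_und.T) / 2          # symmetrize -> undirected
A_dir = rng.random((n_locs, n_locs))   # asymmetric -> directed
A_fused = 0.5 * A_und + 0.5 * A_dir

# 2) Node embedding + one attention-weighted aggregation step
#    (a stand-in for the dynamic graph attention computation).
loc_emb = rng.normal(size=(n_locs, dim))
attn = np.exp(A_fused) / np.exp(A_fused).sum(axis=1, keepdims=True)
loc_repr = attn @ loc_emb              # each node attends over neighbours

# 3) Trajectory vector: concatenate ("flatten") the representations of
#    the visited check-in points.
trajectory = [0, 2, 5]
traj_vec = np.concatenate([loc_repr[i] for i in trajectory])

# 4) Trajectory-user linking: a linear layer plus softmax over users.
W = rng.normal(size=(n_users, traj_vec.size))
logits = W @ traj_vec
probs = np.exp(logits - logits.max())
probs /= probs.sum()
pred_user = int(np.argmax(probs))      # user with highest link probability
```

The softmax output is a probability distribution over candidate users, from which the linked user is taken as the argmax.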
3.2. FL-DBSCAN
3.2.1. Design of Implementation Algorithm
Algorithm 1: FL-DBSCAN
Input: total number of clients, client selection ratio, number of communication rounds, number of local epochs, batch size
Output: final global model weights
Algorithm 2: Subroutine CalculateDistanceMatrix
Input: set of selected clients
Output: pairwise cosine distance matrix
Algorithm 3: Subroutine DBSCANClustering
Input: distance matrix, neighborhood radius ε, minimum number of points MinPts
Output: GroupList
    visited ← InitializeVisitedSet()
    for each client i do
        if client i is not visited then
            mark i as visited
            N_i ← FindNeighbors(i, ε)
            if |N_i| ≥ MinPts then
                form and expand a cluster from i
            else
                mark i as noise
            end if
        end if
    end for
    return GroupList
Algorithm 4: Subroutine UpdateClientModels
Input: current global model weights at the current time step, list of clusters GroupList, distance matrix, performance metrics
Output: updated client model weights
    for each client in all clients do
        for each cluster in GroupList do
            if the cluster is not noise and contains the client then
                average the client's model with the cluster centroid, weighted by its performance metric
            end if
        end for
    end for
3.2.2. Computational Complexity Analysis
- The CalculateDistanceMatrix subroutine is responsible for computing the pairwise cosine distance matrix, which serves as the foundation for the FL-DBSCAN algorithm. This subroutine initializes an N × N matrix, where N is the number of selected clients, and employs a nested loop structure: the outer loop iterates N times, while the inner loop iterates over the remaining clients, yielding N(N − 1)/2 unique client pairs. For each pair, it retrieves two model vectors of size d and computes the cosine distance between them, dist(w_i, w_j) = 1 − (w_i · w_j)/(‖w_i‖ ‖w_j‖), which involves a dot product and two norm calculations, each an O(d) operation. Therefore, the time complexity of this subroutine is O(N²·d): quadratic in the number of clients N and linear in the model dimensionality d.
- The DBSCANClustering subroutine performs density-based clustering on the clients using the precomputed distance matrix. It iterates through each of the N clients; for each unvisited client, it performs a query to find all neighbors within a radius ε. Each query scans one row of the matrix for entries no greater than ε, which is an O(N) operation. In the worst case, every client is a core point whose neighborhood must be expanded, potentially requiring a query for every client. The worst-case time complexity of this subroutine is therefore O(N²), which is independent of the model dimensionality d because all distances are precomputed.
- The UpdateClientModels subroutine contains a nested loop structure: the outer loop iterates over all N clients, and for each client the inner loop iterates over the K clusters in the GroupList to identify the client's assignment. Once the correct cluster is found, the subroutine calculates the cluster centroid, computes the distance from the client's model to the centroid, and performs a weighted average of the client's model and the cluster centroid, each an O(d) operation per model vector involved. The subroutine's time complexity is thus O(N·K·d). In the worst case, where each client forms its own cluster (K = N), the complexity becomes O(N²·d); in practice, however, K is expected to be much smaller than N.
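The cluster-aware update described above can be sketched as follows. The mixing coefficient `alpha` and the use of per-client performance metrics as centroid weights are assumptions for illustration, not the paper's exact formula; noise clients (label −1) keep their own weights.

```python
import numpy as np

def update_client_models(weights, labels, metrics, alpha=0.5):
    # Sketch of Algorithm 4: pull each clustered client's model toward a
    # performance-weighted centroid of its cluster.
    weights = np.asarray(weights, dtype=float)
    metrics = np.asarray(metrics, dtype=float)
    out = weights.copy()
    for c in set(labels) - {-1}:              # skip the noise label
        members = np.where(labels == c)[0]
        w = metrics[members] / metrics[members].sum()
        centroid = (w[:, None] * weights[members]).sum(axis=0)
        # Weighted average of each member's model and the centroid.
        out[members] = alpha * weights[members] + (1 - alpha) * centroid
    return out

labels = np.array([0, 0, 1, -1])
weights = np.array([[1.0, 1.0], [3.0, 3.0], [5.0, 5.0], [9.0, 9.0]])
metrics = np.array([1.0, 1.0, 1.0, 1.0])      # equal performance weights
new_w = update_client_models(weights, labels, metrics)
```

With equal metrics, the first two clients are each pulled halfway toward their shared centroid [2, 2], the singleton cluster is unchanged, and the noise client keeps its original weights.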
4. Results and Discussion
4.1. Experimental Environment Settings
4.1.1. Experimental Environment
4.1.2. Experimental Dataset
- The local learning rate was set to 0.01, a standard value for SGD-based optimizers when training deep learning models on relatively homogeneous data subsets. Coupled with the chosen batch size, this configuration ensures stable local updates without excessive noise, and the chosen number of local epochs allows sufficient progress on each client's dataset between communication rounds.
- The number of global training rounds was chosen to support the experiments that follow (the model performance comparison and the variation with sampling rates), which provide insight into the proposed method's stability and efficiency during training.
4.2. Experimental Results and Discussion
4.2.1. GATULER Model Performance
- Comparison between GATULER and Traditional TUL Models
- GATULER achieved significant improvements with the highest accuracy (50.56%) and the lowest standard deviation (0.38) on the Gowalla dataset. Although GATULER's accuracy (60.21%) on the Foursquare dataset is nearly on par with TULMGAT's highest accuracy (60.22%), it achieved a notably lower standard deviation (0.36 vs. 0.43). This underscores the efficacy of GATULER's graph attention mechanism in capturing key information in user trajectory data. Both GATULER and TULMGAT use graph-based embeddings and attention mechanisms to achieve high prediction accuracy, but GATULER adopts a dynamic attention mechanism that does not rely on random walks or sampling to calculate attention values.
- Compared with all baseline models, GATULER achieved the top precision on both synthesized datasets, with the highest Prec of 44.22% and 58.29%, respectively. However, its Recall and F1-score are lower than those of TULMGAT and TULAM. These results reveal a precision–recall trade-off in GATULER: its attention mechanism and embedded history favor high-confidence predictions, at the cost of missing some samples.
- Ablation Analysis
4.2.2. FL-DBSCAN Algorithm Performance
- Comparison between FedTULGAC and Typical Federated Learning Methods
- Analysis of Hyperparameters
- Accuracy variation with clustering factor
- Loss variation with client selection ratios
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
Appendix A.1
Algorithm A1: Subroutine RandomSelect
Input: number of clients to be selected, total number of clients
Output: set of selected clients
    selected ← ∅
    while |selected| < number of clients to be selected do
        randomly select an index
        if the indexed client is not already in selected then
            add it to selected
        end if
    end while
    return selected
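Algorithm A1's rejection-sampling loop can be sketched as follows; the function and parameter names are illustrative, and the loop is equivalent to sampling without replacement.

```python
import random

def random_select(m, n, seed=None):
    # Algorithm A1 sketch: repeatedly draw client indices until m
    # distinct clients from {0, ..., n-1} have been selected.
    rng = random.Random(seed)
    selected = set()
    while len(selected) < m:
        idx = rng.randrange(n)
        if idx not in selected:   # skip clients already chosen
            selected.add(idx)
    return sorted(selected)

chosen = random_select(3, 10, seed=42)
```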
Appendix A.2
Algorithm A2: Subroutine LocalTraining
Input: local dataset, global model weights, local learning rate, local epochs, batch size
Output: locally updated model weights
    for epoch = 1 to (number of local epochs) do
        for each batch in the local dataset do
            update the model weights by one gradient step on the batch
        end for
    end for
    return the locally updated model weights
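A minimal sketch of Algorithm A2's epoch/batch loop, using mini-batch SGD on a linear least-squares model as a stand-in for the client's actual GATULER update; all names and hyperparameter values here are illustrative.

```python
import numpy as np

def local_training(data, targets, w, lr=0.01, epochs=3, batch_size=4):
    # Algorithm A2 sketch: for each local epoch, sweep the local dataset
    # in mini-batches and take one gradient step per batch.
    n = len(data)
    for _ in range(epochs):
        for start in range(0, n, batch_size):
            X = data[start:start + batch_size]
            y = targets[start:start + batch_size]
            grad = 2 * X.T @ (X @ w - y) / len(X)  # MSE gradient
            w = w - lr * grad
    return w

rng = np.random.default_rng(3)
X = rng.normal(size=(32, 5))
true_w = np.arange(5, dtype=float)
y = X @ true_w
w = local_training(X, y, np.zeros(5), lr=0.05, epochs=50, batch_size=8)
```

Only the updated weight vector is returned to the server; the local data never leaves the client.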
Appendix A.3
Algorithm A3: Subroutine TestAccuracy
Input: locally updated model weights, local test set
Output: test accuracy
    load the local test set
    for each sample in the test set do
        predict with the updated model and record whether the prediction is correct
    end for
    return the fraction of correctly predicted samples
Appendix A.4
Algorithm A4: Subroutine AdjustGlobalModelWeights
Input: updated client model weights, performance metric of each client
Output: adjusted global model weights
    for each client do
        weight the client's model contribution by its performance metric
    end for
    return the adjusted global model weights
References
- Wei, X.; Qian, Y.; Sun, C.; Sun, J.; Liu, Y. A survey of location-based social networks: Problems, methods, and future research directions. Geoinformatica 2022, 26, 159–199.
- Deng, L.; Sun, H.; Zhao, Y.; Liu, S.; Zheng, K. S2TUL: A Semi-Supervised Framework for Trajectory-User Linking. In Proceedings of the 16th ACM International Conference on Web Search and Data Mining, Singapore, 27 February–3 March 2023.
- Wu, H.; Xue, M.; Cao, J.; Karras, P.; Ng, W.S.; Koo, K.K. Fuzzy Trajectory Linking. In Proceedings of the IEEE 32nd International Conference on Data Engineering (ICDE), Helsinki, Finland, 22 June 2016.
- Chen, W.; Huang, C.; Yu, Y.; Jiang, Y.; Dong, J. Trajectory-User Linking via Hierarchical Spatio-Temporal Attention Networks. ACM Trans. Knowl. Discov. Data 2024, 18, 22.
- Gao, Q.; Zhou, F.; Zhang, K.; Trajcevski, G.; Luo, X.; Zhang, F. Identifying human mobility via trajectory embeddings. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia, 19–25 August 2017.
- Zhou, F.; Gao, Q.; Trajcevski, G.; Zhang, K.; Zhong, T.; Zhang, F. Trajectory-user linking via variational autoencoder. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 13–19 July 2018.
- Zhou, F.; Liu, X.; Zhang, K.; Trajcevski, G. Toward discriminating and synthesizing motion traces using deep probabilistic generative models. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 2401–2414.
- Zhou, F.; Yin, R.; Trajcevski, G.; Zhang, K.; Wu, J.; Khokhar, A. Improving human mobility identification with trajectory augmentation. GeoInformatica 2021, 25, 453–483.
- Miao, C.; Wang, J.; Yu, H.; Zhang, W.; Qi, Y. Trajectory-user linking with attentive recurrent network. In Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems, Auckland, New Zealand, 9–13 May 2020.
- Sun, T.; Xu, Y.; Wang, F.; Wu, L.; Qian, T.; Shao, Z. Trajectory-user link with attention recurrent networks. In Proceedings of the 2020 25th International Conference on Pattern Recognition, Milan, Italy, 10–15 January 2021.
- Zhou, F.; Chen, S.; Wu, J.; Cao, C.; Zhang, S. Trajectory-user linking via graph neural network. In Proceedings of the ICC 2021—IEEE International Conference on Communications, Montreal, QC, Canada, 14–23 June 2021.
- Li, H.; Cao, S.; Chen, Y.; Zhang, M.; Feng, D. TULAM: Trajectory-user linking via attention mechanism. Sci. China Inf. Sci. 2024, 67, 112103.
- Li, Y.; Sun, T.; Shao, Z.; Zhen, Y.; Xu, Y.; Wang, F. Trajectory-user linking via multi-scale graph attention network. Pattern Recognit. 2025, 158, 110978.
- Shi, H.; He, D.; Jin, F.; Hua, W.; Kim, J.; Wang, Q.; Zhou, X. A survey and experimental study on neural trajectory-user linking models. IEEE Trans. Knowl. Data Eng. 2025, 37, 6782–6798.
- Chen, H. A Comprehensive Review of Machine Learning Privacy. In Proceedings of the 2024 6th International Conference on Machine Learning, Big Data and Business Intelligence, Hangzhou, China, 1–3 November 2024.
- Tesfay, W.B.; Hofmann, P.; Nakamura, T.; Kiyomoto, S.; Serna, J. PrivacyGuide: Towards an Implementation of the EU GDPR on Internet Privacy Policy Evaluation. In Proceedings of the Fourth ACM International Workshop on Security and Privacy Analytics, Tempe, AZ, USA, 21 March 2018.
- McMahan, B.; Moore, E.; Ramage, D.; Hampson, S. Communication-Efficient Learning of Deep Networks from Decentralized Data. arXiv 2017.
- Khan, R.; Saeed, U.; Koo, I. FedLSTM: A Federated Learning Framework for Sensor Fault Detection in Wireless Sensor Networks. Electronics 2024, 13, 4907.
- Kong, X.; Lu, L. Vehicle trajectory federated embedding learning and clustering under privacy protection. J. Nanjing Norm. Univ. 2022, 22, 80–86.
- Wang, C.; Chen, X.; Wang, J.; Wang, H. ATPFL: Automatic Trajectory Prediction Model Design Under Federated Learning Framework. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–24 June 2022.
- Zhu, H.; Xu, J.; Liu, S.; Jin, Y. Federated learning on non-IID data: A survey. Neurocomputing 2021, 465, 543–556.
- Nguyen, N.H.; Le Nguyen, P.; Nguyen, T.D.; Nguyen, T.T.; Nguyen, D.L.; Nguyen, T.H.; Pham, H.H.; Truong, T.N. FedDRL: Deep Reinforcement Learning-based Adaptive Aggregation for Non-IID Data in Federated Learning. In Proceedings of the 51st International Conference on Parallel Processing, Bordeaux, France, 29 August–1 September 2022.
- Brown, L.C.; Green, D.E. Federated Optimization in Heterogeneous Networks. Adv. Neural Inf. Process. Syst. 2023, 345, 45–67.
- Yan, J.; Cheng, Y.; Zhang, F.; Li, M.; Zhou, N.; Jin, B.; Wang, H.; Yang, H.; Zhang, W. Research on Multimodal Techniques for Arc Detection in Railway Systems with Limited Data. Struct. Health Monit. 2025, 5, 26.
- Gong, L.; Guo, S.; Han, X.; Lin, Y.; Lin, Y.; Liu, Y.; Lu, Y.; Wan, H.; Zhang, X. Mobility-LLM: Learning visiting intentions and travel preferences from human mobility data with large language models. In Proceedings of the 38th International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 9–15 December 2024.
- Wang, H.; Song, Y.; Yang, H.; Liu, Z. Generalized Koopman Neural Operator for Data-Driven Modeling of Electric Railway Pantograph-Catenary Systems. IEEE Trans. Transp. Electrif. 2025, 11, 14100–14112.
| Name | Version | Description |
|---|---|---|
| Ubuntu | 18.04.4 LTS | Operating System |
| Python | 3.11 | Programming Environment |
| TensorFlow | 2.4.1 | Deep Learning Framework |
| NumPy | 1.26.2 | Matrix Operations |
| Pandas | 2.1.3 | Data Preprocessing |
| Device | Configuration | Description |
|---|---|---|
| CPU | Model | 12th Gen Intel(R) Core(TM) i5-12450H |
| Cores/Number | 8 | |
| Threads/Number | 12 | |
| GPU | Model | NVIDIA GeForce RTX 3060 Laptop GPU |
| Memory/GB | 6 | |
| Quantity/Number | 1 |
| Evaluation Metric | Mean Value (Gowalla) | Mean Value (Foursquare) | Measurement Method |
|---|---|---|---|
| Client-side Area Coverage Rate | 82.3% | 78.6% | Kernel Density Estimation |
| Client-side Location Overlap Rate | 21.4% | 18.7% | Jaccard Index |
| Client-side KL Divergence | 0.89 | 0.92 | Relative Entropy Calculation |
| Top-5 Location Coincidence Rate | 2.3 | 1.8 | Set Intersection |
Mixed dataset (total users: 750):

| Clients No. | User Count | Percent (%) |
|---|---|---|
| 1 | 103 | 13.73 |
| 2 | 74 | 9.87 |
| 3 | 55 | 7.33 |
| 4 | 44 | 5.87 |
| 5 | 37 | 4.93 |
| 6 | 32 | 4.27 |
| 7 | 28 | 3.73 |
| 8 | 25 | 3.33 |
| 9 | 23 | 3.07 |
| 10 | 75 | 10.00 |
Gowalla:

| Model | Acc (%) | ±Std | Prec (%) | Rec (%) | F1 (%) |
|---|---|---|---|---|---|
| TULER | 40.22 | 0.58 | 31.70 | 28.45 | 29.99 |
| TULVAE | 43.40 | 0.51 | 33.43 | 34.31 | 33.91 |
| STULIG | 44.84 | 0.49 | 34.13 | 35.25 | 34.68 |
| GTUL | 46.01 | 0.46 | 36.86 | 36.17 | 36.51 |
| TULMGAT | 50.09 | 0.42 | 42.95 | 42.65 | 40.25 |
| TULAM | 49.85 | 0.44 | 43.15 | 38.55 | 39.85 |
| GATULER | 50.56 | 0.38 | 44.22 | 41.96 | 39.58 |

Foursquare:

| Model | Acc (%) | ±Std | Prec (%) | Rec (%) | F1 (%) |
|---|---|---|---|---|---|
| TULER | 50.42 | 0.62 | 48.15 | 44.49 | 47.51 |
| TULVAE | 53.78 | 0.57 | 49.27 | 43.87 | 48.33 |
| STULIG | 56.12 | 0.53 | 53.36 | 46.82 | 50.21 |
| GTUL | 58.23 | 0.48 | 54.76 | 52.47 | 52.12 |
| TULMGAT | 60.22 | 0.43 | 56.47 | 56.57 | 54.01 |
| TULAM | 59.22 | 0.45 | 57.27 | 54.73 | 56.11 |
| GATULER | 60.21 | 0.36 | 58.29 | 55.18 | 55.33 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhang, H.; Xu, Y.; Jiang, H.; Liu, Y.; Wang, W.; Li, Y.; Luo, Y.; Ge, Y. FedTULGAC: A Federated Learning Method for Trajectory User Linking Based on Graph Attention and Clustering. Electronics 2025, 14, 4975. https://doi.org/10.3390/electronics14244975