# Trajectory Generation of Ultra-Low-Frequency Travel Routes in Large-Scale Complex Road Networks


## Abstract


## 1. Introduction

- A significant portion of the ultra-low-frequency routes in the original trajectories do not appear in the synthetic trajectories;
- When the number of generated trajectories is kept equal to that of the original trajectories, the number of journeys assigned to each learned route changes, which destroys the original distribution of journeys over routes.
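Both problems listed above can be checked by simple counting. The following is a minimal sketch, not the authors' evaluation code: it assumes each trajectory is reduced to its route, represented as a tuple of link IDs, and the names `route_coverage`, `journey_count_shift`, and `max_freq` are hypothetical.

```python
from collections import Counter

def route_coverage(original, synthetic, max_freq=None):
    """Share of distinct original routes that also occur in the synthetic set.

    If max_freq is given, only original routes with at most max_freq
    journeys are considered (an assumed way to isolate rare routes).
    """
    orig_counts = Counter(original)
    targets = {r for r, n in orig_counts.items()
               if max_freq is None or n <= max_freq}
    if not targets:
        return 0.0
    return len(targets & set(synthetic)) / len(targets)

def journey_count_shift(original, synthetic):
    """Per-route change in the number of journeys (synthetic minus original)."""
    orig_counts = Counter(original)
    syn_counts = Counter(synthetic)
    return {r: syn_counts.get(r, 0) - n for r, n in orig_counts.items()}

# Toy data: three routes with 5, 2, and 1 journeys in the original set.
original = [("a", "b", "c")] * 5 + [("a", "d", "c")] * 2 + [("a", "e", "c")]
# The synthetic set has the same size, but the rarest route is missing.
synthetic = [("a", "b", "c")] * 7 + [("a", "d", "c")]

coverage_all = route_coverage(original, synthetic)               # 2 of 3 routes
coverage_rare = route_coverage(original, synthetic, max_freq=1)  # 0 of 1 routes
shift = journey_count_shift(original, synthetic)
```

In the toy data the single-journey route is lost, and the journeys that would have gone to it are absorbed by the most frequent route, which is the pattern both bullets describe.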

- The problem of generating a trajectory dataset under imbalanced learning in a large-scale complex road network is addressed here for the first time, and the ULF-TrajGAIL framework provides a fixed, complete process for solving it;
- A method for measuring the imbalance degree of a trajectory dataset, a method for judging the generation difficulty of a trajectory group, and a data augmentation method oriented to the distribution of routes and their corresponding numbers of journeys are proposed for the high-quality trajectory-generation task;
- A more comprehensive metric system covering routes, links, and OD pairs is proposed to evaluate the quality of the synthetic trajectories from multiple perspectives. It focuses on the ability to generate ultra-low-frequency routes and also analyzes the impact of each augmentation method on the correspondence between routes and journey frequency.

## 2. Literature Review

#### 2.1. Trajectory Generation

#### 2.2. Data-Level Approaches

## 3. Methodology

#### 3.1. Definitions

#### 3.2. ULF-TrajGAIL Framework

#### 3.2.1. Confirmation of the Imbalance Degree

#### 3.2.2. Theory and Application of TrajGAIL

#### 3.2.3. Difficulty Degree and Augmentation Method

## 4. Experiments

#### 4.1. Description and Augmentation of the Original Trajectory Dataset

#### 4.2. Experiments and Evaluations

#### 4.2.1. Descriptions of Experiments

#### 4.2.2. Evaluations

## 5. Conclusions and Discussions

## Author Contributions

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest


**Figure 4.** The study area of trajectory generation. The bold red solid lines mark the boundaries of the study area, while the blue, green, yellow, and purple solid lines represent the high-frequency origin and destination links of trajectories from the northeast, southeast, southwest, and northwest directions, respectively.

**Figure 5.** Distribution of the number of routes and journeys and the imbalance degree of each original expert trajectory dataset.

**Figure 7.** Calculation results of the difficulty degree of each journey frequency trajectory in $E^{(0)}$.

**Figure 8.** Links covered by ultra-low-frequency trajectories. All such links are outlined in solid green lines.

**Table 1.** Basic information of trajectory-generation experiments. The proposed method and corresponding results are shown in bold.

No. | Processing Method | $\eta$ | Number of Routes of E | Number of Travels of E | Number of Travels of L |
---|---|---|---|---|---|
1 | – | 93.56 | 503 | 10,499 | 10,499 |
2 | Undersampling | 73.92 | 503 | 6267 | 10,499 |
3 | Hybrid Sampling | 74.00 | 503 | 10,595 | 10,499 |
4 | Targeted Expansion to $k=2$ | 74.25 | 503 | 10,667 | 10,499 |
5 | Targeted Expansion to $k=3$ | 65.25 | 503 | 10,919 | 10,499 |
6 | Targeted Expansion to $k=4$ | 59.54 | 503 | 11,207 | 10,499 |
7 | Targeted Expansion to $k=5$ | 55.43 | 503 | 11,528 | 10,499 |
8 | Undifferentiated Expansion | 93.56 | 503 | 20,998 | 10,499 |
9 | Combined Expansion | 74.25 | 503 | 21,334 | 10,499 |
10 | Extra Combined Expansion | 65.25 | 503 | 21,838 | 10,499 |
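The journey totals in Table 1 follow simple arithmetic: Undifferentiated Expansion has exactly twice the journeys of the original dataset (20,998 = 2 × 10,499), and the two combined variants have exactly twice the journeys of Targeted Expansion to $k=2$ and $k=3$ (21,334 = 2 × 10,667 and 21,838 = 2 × 10,919). The sketch below shows one reading that reproduces this pattern on toy data. It is an assumption inferred from the totals, not the authors' stated procedure: targeted expansion is taken to raise every route with fewer than $k$ journeys up to $k$, and undifferentiated expansion to duplicate every journey. The function names are hypothetical.

```python
from collections import Counter

def targeted_expansion(route_counts, k):
    """Assumed rule: raise every route with fewer than k journeys up to k."""
    return Counter({route: max(n, k) for route, n in route_counts.items()})

def undifferentiated_expansion(route_counts, factor=2):
    """Assumed rule: replicate every journey, whatever its route frequency."""
    return Counter({route: n * factor for route, n in route_counts.items()})

# Toy journey counts per route (not the paper's data).
counts = Counter({"r1": 40, "r2": 3, "r3": 1, "r4": 1})

expanded = targeted_expansion(counts, k=2)       # r3 and r4 go from 1 to 2
doubled = undifferentiated_expansion(counts)     # every count doubles
combined = undifferentiated_expansion(expanded)  # targeted first, then doubled

print(sum(counts.values()), sum(expanded.values()),
      sum(doubled.values()), sum(combined.values()))  # 45 47 90 94
```

Under this reading, targeted expansion changes the shape of the journey distribution while leaving frequent routes untouched, whereas undifferentiated expansion scales the dataset without changing relative frequencies, which matches the unchanged $\eta$ of 93.56 in row 8.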

No. | Processing Method | $P_{\mathrm{route}}$ | $P_{\mathrm{traj}}$ | $P_{\mathrm{route}}^{\mathrm{ultra}}$ | $J_{\mathrm{route}}$ | $P_{\mathrm{link}}$ | $J_{\mathrm{link}}$ | $P_{\mathrm{OD}}$ | $J_{\mathrm{OD}}$ |
---|---|---|---|---|---|---|---|---|---|
1 | – | 0.5070 | 0.9333 | 0.2083 | 0.2694 | 0.8861 | 0.1268 | 0.6266 | 0.2543 |
2 | Undersampling | 0.5726 | 0.9517 | 0.3155 | 0.3100 | 0.9234 | 0.2459 | 0.7110 | 0.2917 |
3 | Hybrid Sampling | 0.4692 | 0.9302 | 0.2619 | 0.2885 | 0.9234 | 0.1253 | 0.7110 | 0.2703 |
4 | Targeted Expansion to $k=2$ | 0.6302 | 0.9565 | 0.4464 | 0.2670 | 0.9689 | 0.1239 | 0.7852 | 0.2510 |
5 | Targeted Expansion to $k=3$ | 0.6083 | 0.9477 | 0.4464 | 0.2891 | 0.9689 | 0.1300 | 0.7724 | 0.2686 |
6 | Targeted Expansion to $k=4$ | 0.5726 | 0.9517 | 0.3155 | 0.2707 | 0.9710 | 0.1420 | 0.8900 | 0.2517 |
7 | Targeted Expansion to $k=5$ | 0.6203 | 0.9444 | 0.5238 | 0.3063 | 0.9607 | 0.1473 | 0.8875 | 0.2828 |
8 | Undifferentiated Expansion | 0.6183 | 0.9636 | 0.3036 | 0.2445 | 0.8923 | 0.1205 | 0.6880 | 0.2334 |
9 | Combined Expansion | 0.7217 | 0.9707 | 0.5833 | 0.2501 | 0.9731 | 0.1268 | 0.8031 | 0.2376 |
10 | Extra Combined Expansion | 0.7594 | 0.9721 | 0.6607 | 0.2617 | 0.9814 | 0.1431 | 0.8875 | 0.2476 |

**Table 3.** The impact degree of the correspondence between routes and the number of journeys. The proposed method and corresponding results are shown in bold.

No. | Processing Method | Accuracy | Weighted-Precision | Weighted-Recall | Weighted-F1-Score |
---|---|---|---|---|---|
1 | – | 0.46 | 0.55 | 0.46 | 0.49 |
2 | Undersampling | 0.42 | 0.56 | 0.42 | 0.47 |
3 | Hybrid Sampling | 0.40 | 0.51 | 0.40 | 0.44 |
4 | Targeted Expansion to $k=2$ | 0.50 | 0.59 | 0.50 | 0.53 |
5 | Targeted Expansion to $k=3$ | 0.46 | 0.58 | 0.46 | 0.50 |
6 | Targeted Expansion to $k=4$ | 0.42 | 0.56 | 0.42 | 0.47 |
7 | Targeted Expansion to $k=5$ | 0.36 | 0.58 | 0.36 | 0.42 |
8 | Undifferentiated Expansion | 0.56 | 0.64 | 0.56 | 0.59 |
9 | Combined Expansion | 0.58 | 0.64 | 0.58 | 0.60 |
10 | Extra Combined Expansion | 0.48 | 0.61 | 0.48 | 0.52 |
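The Accuracy and Weighted-Recall columns of Table 3 are identical in every row. This is expected: when per-class recall is weighted by class support, the weights cancel and the result equals overall accuracy. The sketch below computes the four scores from scratch on toy labels. How the authors define the classes is not stated here; the sketch assumes, purely for illustration, that each route carries a journey-frequency class in the original data (`y_true`) and in the synthetic data (`y_pred`). The function name `weighted_scores` is hypothetical.

```python
from collections import Counter

def weighted_scores(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    total = len(y_true)
    support = Counter(y_true)  # number of true samples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / total

    precision = recall = f1 = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / n
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = n / total  # class weight = share of true samples
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return accuracy, precision, recall, f1

# Toy labels: journey-frequency class of six routes (assumed setup).
y_true = [1, 1, 1, 2, 2, 3]
y_pred = [1, 1, 2, 2, 3, 3]

accuracy, precision, recall, f1 = weighted_scores(y_true, y_pred)
```

On these labels, four of six routes keep their class, so accuracy and weighted recall are both 4/6, while weighted precision is 0.75.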


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Li, J.; Zhao, W.
Trajectory Generation of Ultra-Low-Frequency Travel Routes in Large-Scale Complex Road Networks. *Systems* **2023**, *11*, 61.
https://doi.org/10.3390/systems11020061
