Car Tourist Trajectory Prediction Based on Bidirectional LSTM Neural Network

COVID-19 has greatly affected the tourist industry and ways of travel. According to UNWTO predictions, the number of international tourist arrivals will grow slowly by the end of 2021. One of the ways to keep tourists safe during travel is to use a personal car or car-sharing service. The sensor-based information collected from the tourist’s smartphone during the trip allows the tourist’s behaviour to be analysed. For this purpose, we propose to use the Internet of Things with ambient intelligence technologies, which allows information processing using the surrounding devices. The paper describes a solution to car tourist trajectory prediction, which has been an in-demand subject of research in recent years. We present an approach based on a bidirectional LSTM neural network model. We show the reference model of the tourist support system for car-based attraction-visiting trips. The sensor data acquisition process and the bidirectional LSTM model construction, training and evaluation are demonstrated. We propose a system architecture that uses the tourist’s smartphone for data acquisition as well as more powerful surrounding devices for information processing. The obtained results can be used for tourist trip behaviour analysis.


Introduction
The Coronavirus Disease 2019 (COVID-19) outbreak and the quarantine measures around the world have greatly affected the whole tourism industry. According to the United Nations World Tourism Organization (UNWTO) reports (https://www.unwto.org/covid-19-and-tourism-2020 (accessed on 6 June 2021)), international tourist arrivals decreased from 1.5 billion in 2019 to 381 million in 2020. About 27% of all destinations worldwide have kept their borders completely closed for international tourism. The total loss in tourism exports was $1.3 trillion. Most experts in the tourism industry do not expect international tourism to return to pre-COVID levels before 2023.
Domestic tourism has begun to increase due to the growing numbers of vaccinated people and the still closed borders. According to Airbnb statistics (https://news.airbnb.com/2021-travel/ (accessed on 6 June 2021)), more and more travellers prefer a domestic or local destination as the goal of the tourist route. Around 20% of tourists want their destination to be within driving distance of home. One of the safest ways to travel within the country during the COVID-19 pandemic is to use a personal car or rental-car-sharing services. Protective measures such as wearing masks when travelling with unfamiliar people, maintaining physical distance, sanitizing surfaces and periodically ventilating the cabin can reduce the risk of COVID-19 infection when travelling by car.
For tourists travelling by car, it is important to use smart travel services [1] to enhance and enrich the travel experience. Active use of a smartphone with Internet access allows tourists to remain mobile and receive information from tourist support services directly. Tourist interaction with smart services, direct actions such as generating audio/video content or attraction reviews, and sensor data together form a large amount of information [2] that can be studied and analysed using different machine learning techniques [3].
Based on historical tourist data, researchers can analyse and predict tourist behaviour [4] using machine learning instruments. The tourist behaviour analysis allows researchers and business stakeholders to better understand tourist intentions, look for patterns in the actions of tourists in the region and improve tourist smart services. Internet of Things (IoT) sensors from smart cities [5–7] can provide historical data about tourist actions.
The paper solves the car-based tourist trajectory prediction problem, which is a demanding task across the science community. Existing solutions are based on neural network usage and focus on pedestrian route prediction based on the camera view, which limits the behaviour analysis process in the scope of the tourist region. The authors' solution provides functionality to predict tourist routes and extract possible visited points of interest (POIs) and can easily be modified by providing the tourist and trajectory context. The given approach can be useful for research and tourist industry analytics.
This paper presents an approach based on bidirectional long short-term memory (LSTM) neural network usage [8]. Neural network analysis based on IoT sensors is widely used in different tasks [9]. The key aspect of LSTM is the ability to memorize previous states in the inner neuron cells that can help to memorize route trajectory changes over time. Bidirectional LSTM can preserve information from past and future, which helps to better understand the trajectory context information. As an input, the proposed bidirectional LSTM model works with the global world coordinates rather than data from the road cameras over the region. Some context parameters such as tourist trip weekday also are passed to the neural network model.
The rest of the paper is structured as follows. Section 2 analyses the current research state in the scope of trajectory prediction. Section 3 proposes the tourist support system, which gathers tourist data and analyses the behavioural patterns. Section 4 describes the tourist data acquisition from the device sensors. Section 5 shows the LSTM-model construction and training and Section 6 presents the evaluation process. Section 7 discusses the gathered results and the limitations of the proposed approach and Section 8 provides our conclusions.

Related Work
The selected papers analyse different tourist behaviour aspects such as tourist arrival, trajectory prediction, destination prediction, etc. Behavioural aspect identification can improve smart-tourism service usage for further travel experience improvement. One of the ways to obtain the necessary information about the actions of tourists in a smart city is the usage of various IoT sensors and information processing by ambient intelligence technologies [10]. Table 1 presents information about related work in the scope of tourist behaviour analysis. The first column contains the paper citation, the second column describes the paper problem, the third column shows the machine learning task type and the last column provides information about the problem solution.

Table 1. Related work review.

| Paper | Problem | Machine Learning Task | Solution |
|---|---|---|---|
| [11] | Inbound tourist arrival forecasting for the tourist region | Time series event prediction | Least-squares support vector regression with genetic algorithm |
| [12] | Real-time driver destination prediction | Time series event prediction | Attention-aware LSTM |
| [13] | Urban routing simulation | Time series event prediction | Iterative framework based on the learnheuristic solution, local search and artificial neural network (ANN) |
| [14] | Tourist sequential pattern analysis for pedestrian- and car-based trips | POI route construction | Convolutional neural network (CNN), LSTM neural network |
| [15] | Locally optimal tourist route construction | POI route construction | Shortest tree path construction |
| [16] | Temporal-spatial tourist behaviour analysis on micro-scale distances based on Global Positioning System (GPS) data | Clustering | DBSCAN |
| [17] | Tourist arrival forecasting | Time series event prediction | Extreme learning machine as predictor, self-adaptive method for empirical decomposition of tourist arrival |
| [18] | Urban vehicle trajectory prediction based on Bluetooth data | Trajectory prediction | Attention-based recurrent neural networks (RNNs) |
| [19] | Trajectory labelling | Classification | Fuzzy rule classifiers |
| [20] | Geo-semantic tourist analysis for studying relationship between traffic interaction and urban functions at the road segment based on GPS data | Classification | CNN, Skip-gram Word2Vec model |
| [21] | Camera-based pedestrian trajectory prediction | Trajectory prediction | Social generative adversarial network (SGAN) with attention mechanism |
| [22] | Trajectory prediction for heterogeneous traffic-agents based on camera frame data | Trajectory prediction | LSTM |

The reviewed papers solve different machine learning tasks such as classification [19,20], clustering [16], event prediction [11–13,17], POI route construction [14,15] and trajectory prediction [18,21,22].
To solve the assigned tasks, the presented works use various neural networks such as LSTM or CNNs. However, the presented works do not practically use the contextual information of tourists to solve their stated problems and do not describe the data gathering process, which limits their application.
In the scope of trajectory prediction, the selected papers [21,22] mostly work with camera frame data, which limits the models' applicability at the tourism region level. The authors of [18] operate with raw data gathered by Bluetooth sensors, which is similar to the authors' approach, but work with an RNN model, which tends to perform only short-term prediction and can have memorization problems on long sequences. The proposed approach shows a sensor-driven solution for data gathering by describing the tourist support system, describes the supported machine learning tasks and presents the bidirectional LSTM-based solution for car-based tourist trajectory prediction.

Reference Model
The tourist support system reference model is presented in Figure 1. The suggested tourist support system aims to propose a POI route within the tourist region by taking into consideration context and historical data about the tourist, region and POI. The context describes the current situation around the tourist support system subject/object and historical data contains information about actions and events of system subjects from the past. The POI route construction is divided into 2 tasks: POI visitation list formation based on the tourist POI preferences and route planning among selected POIs.
The tourist interacts with the system using an electronic device such as a smartphone. The smartphone allows tourists to interact with the tourist system, displays a map of the area with surrounding points of interest and displays multimodal information about attractions. From the perspective of the proposed tourist support system, the tourist can be characterized by the following attributes: actions, sensor data, preferences and context information. The actions describe the tourist activities such as photo/video content generation, POI review, etc. The raw sensor data describe tourist movement by gathering information from smartphone sensors such as GPS, accelerometer, magnetometer, etc.
Preferences declare the attractiveness of certain POI types. The context information represents additional parameters that describe the situation around tourist actions. The region represents a city and its surroundings with located attractions, which tourists can visit. The region manages smart-city sensors whose task is to monitor the actions of tourists. Traffic situations and weather conditions describe the region's context situation. The POIs represent places the tourist wants to visit during a trip. Each POI is characterized by its type, expert and tourist assessments, as well as contextual information such as the cost of a visit, opening hours, etc.
Information about tourist, region and the POI forms the digital pattern of life [23]. The digital pattern of life represents the tourist and the world around them in the virtual world at the concept level. The proposed concepts describe the tourist and the world with which the tourist interacts. All gathered information is stored in data lake storage without structural changes. The data lake approach allows extracting the current tourist situation and the historical data simultaneously, which creates an opportunity to perform event prediction tasks.
The artificial neural network (ANN) models allow researchers to learn patterns in tourist behaviour by working with the historical data from the digital pattern of life. The approach of preserving the original data structure in the data lake provides flexibility for the ANN models' data preparation and training. In the proposed tourist support system the ANN models are used for the following behaviour analysis tasks: route classification, tourist clustering, POI visiting prediction and tourist route trajectory prediction. Each task requires both historical and context data to work properly.
The POI route construction takes into account the tourist restrictions (time, locations, etc.) and works with the current state and the historical changes of the region. The route construction is enhanced by using the results of the ANN models. The POI visiting prediction, tourist trajectory prediction and tourist clustering influence the recommendation system choices of attraction set creation. The route classification is used for evaluating the quality of the generated route.
In the scope of this paper, the authors describe the process of solving the tourist trajectory prediction task, one of the typical behaviour analysis tasks defined above. The existing tourist trajectory is described by GPS sensor values gathered from the smartphone, which are retrieved as historical data from the data lake based on the digital pattern of life concepts. The prediction results are intended to improve tourism services. As an example, the predicted routes can be analysed for the frequency of visiting certain POIs and used as an additional factor for the tourist recommendation services.

Digital Pattern of Life Formation
We propose to use the earlier developed Drive Safely system [24,25] for the digital pattern of life formation. The Drive Safely system is aimed at supporting drivers by warning of critical events monitored by the smartphone camera and sensors. Different events such as distraction, drowsiness, seat belt not fastened, eating/drinking and smoking are monitored by the driver's electronic device, such as a smartphone (Figure 2). The data gathering process is presented in Figure 3. Each car has an electronic device with a camera and a set of sensors that track the movement of the car. The camera saves the critical event video files, and sensors such as GPS, accelerometer and gyroscope measure the changes in the global position and speed/acceleration. The Drive Safely system can process data from different drivers simultaneously. Driver data, sampled ten times per second, are sent to the Representational State Transfer Application Programming Interface (REST-API) web service during the trip. The Python programming language is used for the backend implementation of the web service. The asynchronous package Aiohttp is used for the HyperText Transfer Protocol (HTTP) server implementation, which simplifies the processing of sensor data from different drivers at the same time. Nginx is used as the reverse proxy server for the Python-based backend, managed by the Gunicorn application server, which runs several backend instances for parallel processing of data from the different sources. All data processed by the backend instances are stored in the PostgreSQL database.
The processed sensor data are presented in Table 2. The table describes the tourist state gathered by sensors in the given period. The first column notes the row index, the second column represents the data, the third column declares the measurement unit, the fourth column represents the sensor type used for the data gathering and the fifth column provides the sensor data example. The GPS, accelerometer, gyroscope and magnetometer sensors are used for gathering raw data. Sensor data are collected 10 times per second, and the electronic device sends them to the Drive Safely backend instance in timestamped batches every 1.5 s. All data gathered during the tourist car trip can be treated as historical data according to the reference model. The selected set of sensors can be used for restoring trip routes and mining driver behaviour patterns. The latitude, longitude and altitude represent the tourist position in the global space. The coordinates are recorded in the World Geodetic System 1984 (WGS84). Location accuracy defines the radius of the tourist position measurement error of the GPS sensor. The datetime stores a Unix timestamp in seconds that represents the sensor measurement time. Speed and acceleration represent the car speed and acceleration on the track at the given moment in time. Light level measures the illumination inside the car.
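As a rough illustration of the acquisition scheme described above, the following sketch models one sensor sample and groups samples into upload batches. The class `SensorRecord`, its field names and the helpers `batch_size` and `batches` are hypothetical names introduced only for this example; they are not the Drive Safely schema, and Table 2 remains the authoritative field list.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class SensorRecord:
    """One sensor sample; field names are illustrative assumptions."""
    timestamp: float     # Unix time in seconds
    latitude: float      # WGS84, degrees
    longitude: float     # WGS84, degrees
    altitude: float
    accuracy: float      # radius of the position error
    speed: float
    acceleration: float
    light_level: float


SAMPLE_RATE_HZ = 10      # samples per second, as stated in the text
UPLOAD_PERIOD_S = 1.5    # seconds between uploads, as stated in the text


def batch_size(sample_rate_hz: float, upload_period_s: float) -> int:
    """Number of samples accumulated between two uploads."""
    return round(sample_rate_hz * upload_period_s)


def batches(records: List[SensorRecord], size: int) -> Iterator[List[SensorRecord]]:
    """Split a list of samples into consecutive upload batches."""
    for start in range(0, len(records), size):
        yield records[start:start + size]
```

Under these assumptions each upload would carry `batch_size(SAMPLE_RATE_HZ, UPLOAD_PERIOD_S)` = 15 samples.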

LSTM-Based Trajectory Prediction Approach
Trajectory prediction is implemented based on information from the tourist's digital pattern of life. We analyse tourist routes and predict their future behaviour as part of the behaviour analysis tasks. The bidirectional LSTM-based approach was chosen as the car trajectory prediction solution. The LSTM is one of the possible ways to implement recurrent neural networks (RNNs), which are capable of maintaining data within a short period. LSTM models were designed to overcome the inability of plain RNNs to memorize long-period data by using an improved inner cell structure.
The general structure of an LSTM cell is presented in Figure 4. $x_t$ is the input sequence value at the given moment, $h_t$ represents the output value produced by the LSTM cell and $C_t$ represents the inner state of the LSTM cell. Based on the given input, the LSTM cell performs three main operations ("gates"):
• the forget gate decides which information will be removed from the cell state $C_{t-1}$ based on $h_{t-1}$ and $x_t$;
• the input gate determines the amount of information that will be stored in the LSTM cell based on $h_{t-1}$ and $x_t$;
• the output gate computes the new cell output $h_t$ based on the updated cell state $C_t$, which mixes the previous cell state with the output of the input gate.

In the LSTM models, the data flows through the LSTM cell only in the forward direction. The bidirectional LSTM (Figure 5) is an LSTM extension that provides the output level with information from both the previous and the future inner states. The input sequence is passed simultaneously to the forward and backward layers. The activation layer merges the processed data from these layers before passing it to the model output. The activation layer can sum the results (the values are simply added), multiply the results, concatenate the results (in which case the number of output cells doubles) or calculate the average of the given values. This ANN modification allows the model to work with the context information of the input sequence.
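For reference, the gate operations of the LSTM cell described above are commonly written as follows. This is the standard textbook formulation rather than a transcription of Figure 4; the weight matrices $W_f, W_i, W_C, W_o$, the biases $b_f, b_i, b_C, b_o$, the intermediate values $f_t, i_t, o_t, \tilde{C}_t$, the sigmoid $\sigma$ and the element-wise product $\odot$ are notation introduced here.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{(input gate)} \\
\tilde{C}_t &= \tanh\left(W_C\,[h_{t-1}, x_t] + b_C\right) && \text{(candidate state)} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{(new cell state)} \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{(output gate)} \\
h_t &= o_t \odot \tanh\left(C_t\right) && \text{(cell output)}
\end{aligned}
```

In a bidirectional layer the same recurrence is run once over the sequence in forward order and once in reverse order, and the two resulting outputs for each step are merged by one of the modes listed above.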

Car Route Data Preprocessing
The driver trajectory dataset for neural network training and validation contains routes that have been made in the city of Saint Petersburg, Russia. The dataset contains 1085 tourist routes, recorded from January to March of 2021. For each route, the tourist car trajectory was simplified by using the PostGIS function ST_Simplify (https://postgis.net/docs/ST_Simplify.html (accessed on 6 June 2021)).
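PostGIS documents ST_Simplify as an implementation of the Douglas-Peucker algorithm. To show what this simplification step does to a route, the sketch below gives a minimal pure-Python version of that algorithm. It is only an illustration: it treats coordinates as planar, measures the distance to the segment between the end points, and uses the hypothetical names `simplify` and `_point_segment_distance`. The tolerance used by the authors is not given in the text.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def _point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Planar distance from point p to the segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points: List[Point], tolerance: float) -> List[Point]:
    """Douglas-Peucker: drop points closer than `tolerance` to the chord."""
    if len(points) < 3:
        return list(points)
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        dist = _point_segment_distance(points[i], points[0], points[-1])
        if dist > max_dist:
            max_dist, index = dist, i
    if max_dist <= tolerance:
        # Every intermediate point is within tolerance: keep only the ends.
        return [points[0], points[-1]]
    # Otherwise keep the farthest point and simplify both halves.
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right
```

For example, `simplify([(0, 0), (1, 0.01), (2, 0), (3, 2), (4, 0)], 0.1)` drops only the nearly collinear point `(1, 0.01)`.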
The proposed bidirectional LSTM neural network requires a transformation of the geodetic coordinates. Projecting spherical coordinates into 2D space introduces an inaccuracy that can lead to a large prediction error. An alternative approach to geodetic coordinate preprocessing, used in this work, is to measure the geographical distance and the forward azimuth angle between two coordinates by using the inverse method of the Vincenty formula [26].
The Vincenty formula works with the ellipsoidal model of the Earth and measures the azimuth and distance between points more accurately than the Haversine formula, which uses the spherical model of the Earth. An example of preprocessed coordinates is given in Table 3. The first column represents the position number in the car route. The second and third columns represent the car longitude and latitude, which are encoded in WGS84. The fourth column represents the forward azimuth in degrees between the coordinates in the current and next rows and takes values in the range [−179°; 180°]. The fifth column represents the distance between the coordinates in the current and next rows and is measured in meters. For the last route position, azimuth and distance are not calculated.
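The preprocessing step described above can be sketched with a compact implementation of the Vincenty inverse method on the WGS84 ellipsoid. This is an illustrative sketch, not the authors' code: the function name `vincenty_inverse`, the iteration limit and the convergence tolerance are assumptions. It returns the forward azimuth in degrees and the distance in meters from the first point to the second.

```python
import math
from typing import Tuple

WGS84_A = 6378137.0                 # semi-major axis, meters
WGS84_F = 1 / 298.257223563         # flattening
WGS84_B = (1 - WGS84_F) * WGS84_A   # semi-minor axis, meters


def vincenty_inverse(lat1: float, lon1: float, lat2: float, lon2: float,
                     max_iter: int = 200, tol: float = 1e-12) -> Tuple[float, float]:
    """Return (forward azimuth in degrees, distance in meters)."""
    if lat1 == lat2 and lon1 == lon2:
        return 0.0, 0.0
    # Reduced latitudes on the auxiliary sphere.
    u1 = math.atan((1 - WGS84_F) * math.tan(math.radians(lat1)))
    u2 = math.atan((1 - WGS84_F) * math.tan(math.radians(lat2)))
    sin_u1, cos_u1 = math.sin(u1), math.cos(u1)
    sin_u2, cos_u2 = math.sin(u2), math.cos(u2)
    big_l = math.radians(lon2 - lon1)
    lam = big_l
    for _ in range(max_iter):
        sin_lam, cos_lam = math.sin(lam), math.cos(lam)
        sin_sigma = math.hypot(cos_u2 * sin_lam,
                               cos_u1 * sin_u2 - sin_u1 * cos_u2 * cos_lam)
        if sin_sigma == 0.0:
            return 0.0, 0.0  # coincident points
        cos_sigma = sin_u1 * sin_u2 + cos_u1 * cos_u2 * cos_lam
        sigma = math.atan2(sin_sigma, cos_sigma)
        sin_alpha = cos_u1 * cos_u2 * sin_lam / sin_sigma
        cos_sq_alpha = 1.0 - sin_alpha ** 2
        # cos(2 * sigma_m); zero for lines along the equator.
        cos_2sm = (cos_sigma - 2 * sin_u1 * sin_u2 / cos_sq_alpha
                   if cos_sq_alpha else 0.0)
        c = WGS84_F / 16 * cos_sq_alpha * (4 + WGS84_F * (4 - 3 * cos_sq_alpha))
        lam_new = big_l + (1 - c) * WGS84_F * sin_alpha * (
            sigma + c * sin_sigma * (cos_2sm + c * cos_sigma * (-1 + 2 * cos_2sm ** 2)))
        converged = abs(lam_new - lam) < tol
        lam = lam_new
        if converged:
            break
    else:
        raise ValueError("Vincenty iteration did not converge")
    u_sq = cos_sq_alpha * (WGS84_A ** 2 - WGS84_B ** 2) / WGS84_B ** 2
    big_a = 1 + u_sq / 16384 * (4096 + u_sq * (-768 + u_sq * (320 - 175 * u_sq)))
    big_b = u_sq / 1024 * (256 + u_sq * (-128 + u_sq * (74 - 47 * u_sq)))
    delta_sigma = big_b * sin_sigma * (cos_2sm + big_b / 4 * (
        cos_sigma * (-1 + 2 * cos_2sm ** 2)
        - big_b / 6 * cos_2sm * (-3 + 4 * sin_sigma ** 2) * (-3 + 4 * cos_2sm ** 2)))
    distance = WGS84_B * big_a * (sigma - delta_sigma)
    azimuth = math.degrees(math.atan2(
        cos_u2 * sin_lam, cos_u1 * sin_u2 - sin_u1 * cos_u2 * cos_lam))
    return azimuth, distance
```

Applying the function to each pair of consecutive route points would yield the azimuth and distance columns of a table such as Table 3. Because `atan2` is used, azimuths to the west are negative, which matches a signed range around zero.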

Neural Network Model Configuration
The Python programming language with the TensorFlow and Keras platforms was used for the bidirectional LSTM model construction. The Pandas and GeoPandas packages provide dataset manipulation tools. The Sklearn and NumPy packages transform the data into a form convenient for neural network training. The Shapely package is used for drawing trajectories on interactive maps.
For correct training of the bidirectional LSTM neural model, the input data must be normalized. Different input features may have different scales and distributions, and large input values can cause the model to learn large weight values. A model with large weight values behaves unstably on prediction tasks, which leads to a large generalization error.
The forward azimuth and distance between coordinates are re-scaled by using min-max normalization [27]:

$$x_{scaled} = \frac{x - x_{min}}{x_{max} - x_{min}} \quad (1)$$

In Equation (1), $x_{scaled}$ is the value of $x$ from the old scale mapped onto the new scale [0; 1]; $x_{min}$ represents the minimum value of the input sequence and $x_{max}$ refers to the maximum value.
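A minimal sketch of min-max normalization and its inverse is given below. The function names `min_max_scale` and `min_max_unscale` are illustrative; the source states that Sklearn is among the packages used, but this sketch does not claim to reproduce the authors' preprocessing code. Mapping a constant sequence to zeros is a choice made here to avoid division by zero.

```python
from typing import List, Sequence


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Map values linearly onto [0, 1] using their own minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant sequence: avoid division by zero (assumption of this sketch).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def min_max_unscale(scaled: float, lo: float, hi: float) -> float:
    """Invert the scaling for one value, given the original minimum and maximum."""
    return scaled * (hi - lo) + lo
```

The inverse is needed to turn a predicted, scaled azimuth or distance back into degrees or meters. In practice the minimum and maximum would normally be taken from the training data and reused for validation and test data.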
The categorical weekday value is transformed by using the one-hot encoding technique, which replaces the categorical value with a new binary vector whose length equals the total number of categories.

The neural network model is described by the input layer, several inner layers and the output layer, which contains the predicted trajectory (Figure 6). The input layer requires the transformed part of the tourist trajectory, which contains the next 3, 5 or 9 rows from Table 3. It does not make sense to use a trajectory with information for 1 or 2 points for prediction due to the small amount of input information. Using information for more than 9 points makes it difficult to use the model in real applications due to the need to wait for the data to accumulate. The models with inputs defined by 4 and 6 trajectory points show results close to those of the models with 5 and 9 points, respectively. Each row has been supplemented with the trip weekday. The selected trajectory rows are flattened into a 1D array of length $traj_{size} \times w_l \times input_{size}$, where $traj_{size}$ is the number of points in the input sequence, $w_l$ is the length of the one-hot-encoded weekday and $input_{size}$ represents the trajectory characteristics at the given moment in time (azimuth and distance between two points). The inner neural network model layers consist of 3 bidirectional layers with 128 LSTM cells in each layer and a dense layer with 64 neurons. The output layer has length $input_{size}$ and contains the predicted azimuth and distance to the next point.

Figure 6. Approach of the ANN architecture.
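The input preparation described above can be sketched as follows: the weekday is one-hot encoded, and the route is cut into windows of consecutive (azimuth, distance) rows, each paired with the row that follows it as the prediction target. The names `one_hot_weekday` and `sliding_windows`, and the convention that Monday is 0, are assumptions of this sketch. The final flattening of a window together with the weekday vector is not reproduced here.

```python
from typing import List, Sequence, Tuple

Step = Tuple[float, float]  # (azimuth, distance) for one route position


def one_hot_weekday(weekday: int, n_categories: int = 7) -> List[int]:
    """Encode a weekday index (assumed 0 = Monday) as a binary vector."""
    if not 0 <= weekday < n_categories:
        raise ValueError("weekday index out of range")
    vector = [0] * n_categories
    vector[weekday] = 1
    return vector


def sliding_windows(steps: Sequence[Step],
                    traj_size: int) -> List[Tuple[List[Step], Step]]:
    """Pair each run of `traj_size` consecutive rows with the following row."""
    pairs = []
    for start in range(len(steps) - traj_size):
        window = list(steps[start:start + traj_size])
        target = steps[start + traj_size]
        pairs.append((window, target))
    return pairs
```

With `traj_size` set to 3, 5 or 9, a route of n rows yields n minus `traj_size` training pairs under this scheme.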

Trajectory Prediction Results
The training part of the tourist car dataset contains 220,296 entries, the validation part contains 73,432 entries and the test part consists of 53,822 entries. Each neural network model with a different input size has been trained for 100 epochs with a batch size of 64.
The mean squared error (MSE) between the expected and predicted azimuths and distances between points (Equation (2)) and the average displacement error (ADE) (Equation (3)) were calculated for each model with different input lengths.

$$MSE = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2 \quad (2)$$

$$ADE = \frac{1}{l}\sum_{i=1}^{l}\sqrt{\left(x_i - \hat{x}_i\right)^2 + \left(y_i - \hat{y}_i\right)^2} \quad (3)$$

For Equation (2), $n$ is the sequence length, $Y_i$ denotes the values of the variable being predicted and $\hat{Y}_i$ denotes the predicted values. For Equation (3), $l$ is the trajectory length, $(x_i, y_i)$ is the ground truth point of the trajectory on the i-th step and $(\hat{x}_i, \hat{y}_i)$ is the predicted point of the trajectory on the i-th step.
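The two metrics can be computed with a few lines of Python. The sketch below follows the standard definitions of MSE and ADE given by the symbol descriptions above; the function names `mse` and `ade` are illustrative, and points are assumed to be given in a common planar coordinate system.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def mse(expected: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error between two equally long value sequences."""
    if len(expected) != len(predicted) or not expected:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((y - y_hat) ** 2
               for y, y_hat in zip(expected, predicted)) / len(expected)


def ade(truth: Sequence[Point], predicted: Sequence[Point]) -> float:
    """Average Euclidean distance between matching trajectory points."""
    if len(truth) != len(predicted) or not truth:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.hypot(x - x_hat, y - y_hat)
               for (x, y), (x_hat, y_hat) in zip(truth, predicted)) / len(truth)
```

MSE is computed on the model outputs (azimuth and distance), whereas ADE is computed on reconstructed positions, so the two values are not directly comparable.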
The results of the models' comparison are presented in Table 4. The experiments show that the model using five-point prediction is the most accurate. Because GPS measurements are inaccurate and trajectory outliers were not filtered, there is a discrepancy between the small prediction error and the total error in the distance between the predicted and real points. The car trajectory prediction result is shown in Figure 7. The trajectory represents the tourist travel around St. Petersburg, Russia. The blue line shows the original tourist trajectory in the city and the red line shows the trajectory predicted by the bidirectional LSTM model. The comparison of the trajectories shows that on straight sections of the route the original and predicted trajectories practically coincide. However, the proposed model starts to make mistakes when it tries to predict turns. This can be explained by the high noise in the initial data, which were extracted from the electronic device GPS.
The MSE is comparable to that of existing ANN-based solutions for predicting motion trajectories, but the ADE is greater. This can be explained by the different scopes of the approaches. Existing solutions usually predict movement patterns within a camera frame of fixed dimensions. The authors' approach tries to predict the tourist car trajectory within the tourist region at the scale of the global map. Due to the scale of the region, even when using such precise coordinate transformation methods as the Vincenty formula, a small prediction error can lead to a large final trajectory error. Another factor leading to a large deviation in the predicted trajectory is the general noise level of the GPS data and the lack of smoothing.
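The way a small per-step error grows into a large trajectory error can be illustrated with a simple dead-reckoning sketch. It rebuilds positions from (azimuth, distance) steps on a flat plane and compares a straight route with the same route predicted under a constant azimuth bias. The planar approximation, the constant bias and the names `reconstruct` and `final_error` are assumptions made for illustration only; they do not model the authors' data.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def reconstruct(start: Point, steps: Sequence[Tuple[float, float]]) -> List[Point]:
    """Rebuild planar positions from (azimuth in degrees, distance in meters).

    Azimuth is measured clockwise from north; x points east, y points north.
    """
    x, y = start
    path = [(x, y)]
    for azimuth_deg, distance_m in steps:
        rad = math.radians(azimuth_deg)
        x += distance_m * math.sin(rad)
        y += distance_m * math.cos(rad)
        path.append((x, y))
    return path


def final_error(n_steps: int, step_m: float, azimuth_bias_deg: float) -> float:
    """End-point distance between a straight route and its biased prediction."""
    truth = reconstruct((0.0, 0.0), [(0.0, step_m)] * n_steps)
    predicted = reconstruct((0.0, 0.0), [(azimuth_bias_deg, step_m)] * n_steps)
    (tx, ty), (px, py) = truth[-1], predicted[-1]
    return math.hypot(px - tx, py - ty)
```

In this toy setting a constant azimuth bias of 2° over 100 steps of 50 m moves the end point by roughly 175 m, and the drift grows in proportion to the number of steps.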

Discussion
The proposed trajectory prediction model successfully allows prediction of tourist trajectories, which was stated as the goal of this paper. The collected driver trip trajectory dataset with GPS data and further coordinate transformation by using the inverse method of the Vincenty formula allows training of the bidirectional LSTM-model with acceptable results for the behaviour analysis. The research has confirmed that ANN-based solutions can be used for map-based trajectory predictions. We showed that MSE and ADE errors are quite small and the model can be used in real systems. The proposed model can be used for accounting and analysis of the most frequently visited routes based on historical data. The gathered results can be used for controlling possible tourist flow within the city. Another possible model application is an analysis of indirectly visited tourist POIs that are close to the predicted trajectory. The possible visited POI can be used for the recommendation system tuning and the following tourist clustering. The proposed method does not depend on the tourist region and relies on the joint use of coordinates collected from the GPS of an electronic device and the inverse method of the Vincenty formula. Based on these facts, researchers can use the proposed bidirectional LSTM model for further research.
However, the presented approach has the following limitations and weaknesses. Firstly, the tourist trajectory car dataset has to contain relatively close routes, within one city for example. If this condition is not met, the min-max scaler usage can transform the close route points incorrectly, which can potentially lead to the wrong trajectory predictions.
Secondly, route point outliers have to be filtered out of the trajectory because they introduce incorrect data dependencies into the neural network, which can lower the prediction accuracy. The absence of filtering and smoothing of these trajectories leads to a greater prediction error. Increasing the number of predicted trajectory points, as well as displaying several possible trajectories with probabilistic estimates, can also help improve model prediction results [28].
The following actions can be taken to improve the overall applicability of the model predictions. The current model considers only the trip weekday as a context parameter. However, route characteristics such as speed, acceleration and other sensor-based metrics can be used as additional context parameters. Another possible context expansion is the road type (highway, country road, square, etc.). The proposed context parameters can drastically improve the model prediction capability.
Another potential improvement is neural network hyper-parameter tuning by using genetic algorithms. Because the proposed neural model is limited to training within one city, the tourist support system needs to train a separate neural network for each supported city. Due to the different city topologies and the heterogeneity of training data, manual adjustment of the bidirectional LSTM hyper-parameters can be time-consuming. Genetic algorithms can facilitate the task of selecting optimal parameters.

Conclusions
In this paper we presented an approach to tourist trajectory prediction based on bidirectional LSTM neural networks. We showed the reference model of the approach as well as the neural network architecture. We evaluated our approach based on data generated by the Drive Safely system, which tracks drivers' movements and detects dangerous situations in vehicle cabins based on different sensors. The evaluation shows that the presented approach allows prediction of the tourist trajectories with acceptable accuracy for further trajectory analysis by experts. The existing trajectory prediction models focus on the forecasting process within a small area such as a camera frame. The proposed solution considers trajectories at the level of the tourist region and provides the possibility of modifying the input parameters. In addition to using the trajectory, the proposed tourist support system can provide additional context and historical data based on the digital pattern of life concepts, such as road type, various movement characteristics during the trip, the frequency of critical events such as drowsiness/distraction, etc. The experiments used a training set of 220,296 entries. In the future, the authors want to add context support such as road type, speed and/or acceleration as the bidirectional LSTM-model input, and extend the model output length with support for the probabilistic derivation of possible tourist trajectories.