Indoor Location Prediction Method for Shopping Malls Based on Location Sequence Similarity

Fast and accurate indoor location prediction plays an important part in indoor location services. This work proposes an indoor location prediction framework named Indoor-WhereNext. First, a novel algorithm, “indoor spatiotemporal density-based spatial clustering of applications with noise” (Indoor-STDBSCAN), is proposed to detect the stay points in an indoor trajectory and convert them into a location sequence. Then, a spatial-semantic similarity (SSS) method for measuring the similarity between location sequences is defined. SSS comprehensively considers the spatial and semantic similarities between location sequences. Finally, a clustering algorithm is used to obtain similarity user groups based on SSS. These groups are used to train different prediction models to achieve improved results. Extensive experiments were conducted using real indoor Wi-Fi positioning datasets collected in a shopping mall. The results show that the Indoor-WhereNext model markedly outperforms the three existing baseline methods in terms of prediction accuracy and precision.


Introduction
In recent years, with the rapid development of e-commerce, traditional "brick-and-mortar" industries have been severely affected [1,2]. These industries urgently need to develop ways to help merchants establish relations with customers and provide them with a personalized offline shopping experience in order to improve their marketing ability [3]. With the continuous development of indoor positioning technology and the popularization of mobile terminal equipment, indoor mobile user trajectory data have shown explosive growth [4][5][6]. Indoor trajectory data are an important basis for indoor location-based services and provide new opportunities for the development of "brick-and-mortar" industries [7][8][9][10].
Location prediction technology can infer the next location of a user according to the historical trajectory. It can thus provide flexible services for users, which has been a research concern in this field. Research shows that user behavior is predictable [11]. To date, location prediction technology has been widely used in trajectory reconstruction [12,13], location recommendation [14][15][16], intelligent transportation [17,18], and provision of security services [19]. Indoor location services, for example, can predict the next location of a user and push information about shops of interest to the user. This not only provides the user with a personalized shopping experience but also aids the merchant in earning profits [14][15][16].
Location prediction methods can be classified according to forecasting needs into the following two categories [20]: (1) those that predict the location that the user will visit next [10][21][22][23][24][25] and (2) those that predict the location of the user in the next time interval [26]. The main difference is that the former transforms an individual trajectory into a location sequence, while the latter treats it as a time interval sequence with relevant positions. In this study, only the first type of location prediction was considered. Existing research mainly uses data mining algorithms, such as association rules [27,28], the hidden Markov model (HMM) [29], or recurrent neural networks (RNNs) [30,31], to mine patterns between location sequences and then make location predictions. In contrast with the existing research, the approach in this work focuses mainly on location prediction for indoor spaces, serving the location services of the offline industry, such as large shopping centers. Two issues arise in this setting. First, unlike an outdoor trajectory, an indoor trajectory typically has three-dimensional features (it includes the floor), which makes it difficult for existing stay point recognition algorithms to convert an indoor trajectory into a location sequence. Second, the user trajectory implies the user preference [32]. When there is a large number of users, it is easy to find similarity groups, and it is easier to mine location patterns within a similarity group [29].
Therefore, this paper proposes an indoor location prediction framework, called Indoor-WhereNext. The work makes a number of significant contributions, which are summarized as follows.
(1) A novel spatial-semantic similarity (SSS) method is defined. It combines spatial and semantic information to calculate the similarity between location sequences and find similarity groups of indoor users. (2) Long short-term memory (LSTM) is used to model each group of users to improve the accuracy of indoor location prediction. (3) The performance of the Indoor-WhereNext is evaluated using real indoor trajectories. The results demonstrate the advantages of our approach compared with baselines.
The rest of this work is organized as follows. In Section 2, the current literature focusing on models for location prediction from trajectories is reviewed. In Section 3, a new methodological framework for indoor location prediction is proposed. The performance of the method proposed in the current work is compared with that of methods proposed in previous research using real indoor Wi-Fi positioning data; these results are presented in Section 4. In Section 5, the work is summarized, and suggestions are made for possible future studies.

Related Work
Existing location prediction models that predict where users will visit can be roughly divided into two types: Individual-based and group-based prediction models.
Individual-based prediction models consider the movement behavior of each individual to be independent and, thus, use only the movement history of the user to predict her or his next location. The core issue of individual-based models is that they are mainly used to mine the periodic behavior of individual users. For instance, Lee et al. [22] proposed a spatiotemporal-periodic (STP) pattern to capture the periodic behavior of the individual. It used an association rule algorithm to mine periodic patterns in STP. Vu et al. [33,34] presented a novel framework, Jyotish, to find the periodic movement of people based on Wi-Fi/Bluetooth positioning data. Bayesian classifiers and support vector machines were used to give the most likely next location. Do et al. [35] redefined the location prediction problem from a new perspective and proposed a probabilistic kernel method for learning the dependence between user location and multivariate context variables from sparse data. Wu et al. [36] proposed a spatial-temporal-semantic neural network algorithm (STS-LSTM) for location prediction. Zhang et al. [17] combined the respective superiorities of support vector regression and deep learning to present a novel data embedding and ensemble learning method. Yang et al. [37] defined a novel Markov chain via Markov transition matrix multiplication and proposed the DestPD model. However, there are certain deficiencies in the individual-based models. First, these models require long-term movement trajectories of individual users, and these are difficult to obtain in practical applications. Second, individual-based models require an independent model to be built for each individual, which is also unrealistic in practical applications.
Group-based prediction models consider movement behavior to "follow the crowd" to some degree and, thus, use the movement history of other users to predict a user's next location. These models are mainly used to mine similar behaviors of groups of users. For example, Morzy [28] proposed an improved Apriori algorithm that uses association rules to predict the next location of a group of users. Ang et al. [38] utilized a Markov chain to convert location sequences into conversion probabilities for location prediction. Qiang et al. [30] proposed spatiotemporal RNN (ST-RNN) based on RNN [31] to model the location of groups of users. Ying et al. [23] proposed a geographic-temporal-semantic-based location prediction framework to predict the next location of a group of users. Unlike single-object models, group-based models can reveal the movement of a group of users in some scenarios [39]. In addition, group-based models do not need long-term movement trajectories of individual users. However, the aforementioned group-based models have a shortcoming: they build one model for all users, ignoring the existence of similarity subgroups. Therefore, some models use the movement trajectories of only those users who are in some way related to the target user. Zhang et al. [40] found that there was a strong correlation between the calling patterns and co-cell patterns of users. Based on this finding, they proposed the NextCell model, which aims to enhance location prediction by harnessing the social interplay revealed in cellular call records. Wen et al. [41] proposed a fallback social-temporal-hierarchic Markov model (FSTHM), which introduced modified cross-sample entropy to quantify the similarities between an individual and his or her friends to enhance the predictive performance. Li et al. [42] used a linear regression model, which was constructed with a subset of users related to the predicted user, to predict the next location.
In this study, a novel indoor location prediction framework, Indoor-WhereNext, was developed. In the proposed framework, previously collected historical location sequences are first grouped according to their characteristics (i.e., according to the similarity of historical location sequences). Afterward, different user groups are used to train different prediction models to achieve improved results.

Definition 1 (trajectory): A trajectory $Tr = \{p_1, p_2, \ldots, p_n\}$ is an ordered sequence of points $p_i = (uid, t_i, x_i, y_i, f_i)$, where $uid$ is a unique user identifier; $t_i$ is the time at which $p_i$ was collected; and $x_i$, $y_i$, $f_i$ correspond to the longitude, latitude, and floor, respectively, of the user at time $t_i$.
Definition 2 (stay point): In general, a stay point $sp = (uid, x, y, f, t_{arr}, t_{dep})$ stands for a geographic region where a user stayed over a certain time interval, where $uid$ is a unique user identifier; $x$, $y$, $f$ correspond to the longitude, latitude, and floor, respectively, of the user's stay; and $t_{arr}$, $t_{dep}$ represent the user's arrival and departure times, respectively, with respect to the geographic region. Figure 1a shows an example of a user's stay point.

In this section, the Indoor-WhereNext framework for indoor location prediction is constructed. The overall architecture is shown in Figure 2. The framework is based on the bottom-up design principle and is divided into two modules: SSS-based location modeling and SSS-based location prediction. In the SSS-based location modeling phase, user trajectories are converted to location sequences via the Indoor-STDBSCAN algorithm. The user location sequences are clustered based on the SSS to obtain similarity user groups, and a model is trained for each group. Each group also generates an exemplar to represent itself. In the SSS-based location prediction phase, the similarity between a new location sequence and each group is calculated via the exemplars, and then the model of the most similar group is used to predict the next possible location.

Stay Point Detection
When a user stays at a particular location, there is a greater probability that the user will view the location service information [43]. Therefore, stay point detection is a key step in location sequence conversion. When the user stays in a certain place for a certain length of time, the mobile terminal records more trajectory points in a limited area. This results in clustering of trajectory points. Therefore, a clustering algorithm is applied to detect the stay points. However, in contrast with an outdoor trajectory, an indoor trajectory typically has three-dimensional characteristics, which makes existing outdoor detection algorithms difficult to apply to the indoor trajectory. Therefore, a novel indoor trajectory stay point detection algorithm, Indoor-STDBSCAN, is proposed.
The Indoor-STDBSCAN algorithm is an improved version of the spatiotemporal density-based spatial clustering of applications with noise (ST-DBSCAN) algorithm [44,45].
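The idea of density-based stay point detection on three-dimensional indoor trajectories can be illustrated with a minimal sketch. It is not the authors' Indoor-STDBSCAN algorithm, whose details are not reproduced here; it is a plain DBSCAN in which two points are neighbours only if they lie on the same floor, within a radius `eps`, and within a time window `delta_t`. The class names `TrajPoint` and `StayPoint`, the function names, and the use of the cluster centroid as the stay location are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class TrajPoint:          # Definition 1: one sampled trajectory point
    uid: str
    t: float              # timestamp in seconds
    x: float
    y: float
    floor: int


@dataclass(frozen=True)
class StayPoint:          # Definition 2: region where the user stayed
    uid: str
    x: float
    y: float
    floor: int
    t_arrive: float
    t_leave: float


def neighbours(points, i, eps, delta_t):
    """Indices of points on the same floor, within eps metres and delta_t seconds of point i."""
    p = points[i]
    return [j for j, q in enumerate(points)
            if q.floor == p.floor
            and abs(q.t - p.t) <= delta_t
            and hypot(q.x - p.x, q.y - p.y) <= eps]


def detect_stay_points(points, eps, delta_t, min_pts):
    """DBSCAN with a floor-aware spatiotemporal neighbourhood (illustrative assumption)."""
    unvisited, noise = None, -1
    labels = [unvisited] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not unvisited:
            continue
        seeds = neighbours(points, i, eps, delta_t)
        if len(seeds) < min_pts:
            labels[i] = noise
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster          # border point: joins but is not expanded
            if labels[j] is not unvisited:
                continue
            labels[j] = cluster
            reachable = neighbours(points, j, eps, delta_t)
            if len(reachable) >= min_pts:    # core point: keep expanding the cluster
                queue.extend(reachable)
        cluster += 1

    stays = []
    for c in range(cluster):
        members = [p for p, lab in zip(points, labels) if lab == c]
        stays.append(StayPoint(
            uid=members[0].uid,
            x=sum(p.x for p in members) / len(members),
            y=sum(p.y for p in members) / len(members),
            floor=members[0].floor,
            t_arrive=min(p.t for p in members),
            t_leave=max(p.t for p in members),
        ))
    return sorted(stays, key=lambda s: s.t_arrive)
```

Because the neighbourhood test requires equal floors, two dense groups of points with the same planar coordinates on different floors yield two separate stay points, which is the behaviour a purely two-dimensional detector cannot provide.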

Location Sequence Conversion
The stay point obtained by the Indoor-STDBSCAN algorithm contains only spatial information and no semantic information, so it is necessary to assign semantics to the stay point. For this, a matching method is defined. There are two possible spatial relationships between a stay point and a shop: inside and outside the shop. Figure 3 shows an example of a user's stay point sequence.

Location Sequence Similarity Calculation Method
The core of the Indoor-WhereNext framework is to cluster the location sequences better. Users whose location sequences fall within the same group have high similarity and vice versa. Hence, the location patterns of users with location sequences falling within the same group are easier to mine. The Indoor-WhereNext framework achieves improved prediction accuracy by modeling similar users. The user location sequence implies the spatial information and semantic information of the shop. The spatial information of the shop describes the spatial location of the shop inside the shopping mall, which restricts the user's range of movement in the indoor space. The semantic information of the shop describes the semantic characteristics of the shop, which to some extent reflect the shopping habits of the user. Spatial information and semantic information are comprehensively considered to define a novel SSS method to measure the similarity between location sequences for cluster formation. The SSS method is divided into two parts: Spatial similarity and semantic similarity.
Spatial similarity measures the similarity of the spatial information implied in the location sequences and describes how similar the movement trajectories of two users are in geographic space. When users stay in the same shop, they show a certain degree of spatial similarity. The more shops two location sequences have in common, the higher the spatial similarity. Therefore, the longest common subsequence (LCSS) [46] is used to calculate the spatial similarity between location sequences. The spatial similarity between user $u_i$ and user $u_j$ is calculated as defined in Formulas (1) and (2),
where $Sh^i = \{sh^i_1, \ldots, sh^i_m\}$ and $Sh^j = \{sh^j_1, \ldots, sh^j_n\}$ represent the location sequences of users $u_i$ and $u_j$, respectively; $m$ and $n$ represent the numbers of shops visited by the users; $\max(a, b)$ is a function for obtaining the maximum of the values $a$ and $b$; $S_{spa}$ is the spatial similarity matrix; and $s^{spa}_{ij}$ represents the spatial similarity between users $u_i$ and $u_j$.
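The LCSS computation that underlies the spatial similarity can be sketched with the standard dynamic-programming recurrence. The sketch below assumes that shops are represented by hashable identifiers and that the raw LCSS length is used as the matrix entry; any normalisation applied in Formulas (1) and (2) is not reproduced here, and the function names are hypothetical.

```python
def lcss_length(seq_a, seq_b):
    """Length of the longest common subsequence of two shop sequences."""
    m, n = len(seq_a), len(seq_b)
    # table[i][j] holds the LCSS length of seq_a[:i] and seq_b[:j]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if seq_a[i - 1] == seq_b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]


def spatial_similarity_matrix(sequences):
    """Pairwise LCSS lengths for a list of location sequences (raw, unnormalised)."""
    n = len(sequences)
    return [[lcss_length(sequences[i], sequences[j]) for j in range(n)]
            for i in range(n)]
```

For example, the sequences `["A", "B", "C", "D"]` and `["B", "D", "E"]` share the subsequence `B, D`, so their LCSS length is 2.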
Semantic similarity measures the similarity of the semantic information implicit in the location sequences and describes the degree of similarity between two users in interests and behaviors. In this paper, semantic information is not a categorical attribute of the shops, because we believe that such attribute information is artificially specified and subjective. The semantic information to which we refer is an implicit message that is expressed through user behavior. Generally, users hop more frequently between shops of the same type (purposeful consumption), which reflects the semantic similarity between those shops. In view of this, the location sequences of all users are used to construct a weighted network $G = (V, E, W)$, where $V$ represents the shop set, $E$ represents the set of transfers between shops, and $W$ represents the numbers of transfers between shops. As the number of location sequences increases, the weight between shops can reflect the similarity between them; that is, the higher the similarity, the higher the weight. Based on these characteristics, the node2vec [47] method is used to vectorize the shops. As shown in Figure 4, the larger the weight between two shops, the smaller the distance between the shops' corresponding vectors. After vectorization by node2vec, each shop uniquely corresponds to a vector, and the semantic similarity between location sequences can be calculated from the corresponding vector sequences. In this work, the dynamic time warping (DTW) [48][49][50] algorithm is used to calculate the semantic similarity between location sequences. The semantic similarity between user $u_i$ and user $u_j$ is calculated as defined in Formulas (3) and (4),
where $Sh^i = \{sh^i_1, \ldots, sh^i_m\}$ and $Sh^j = \{sh^j_1, \ldots, sh^j_n\}$ represent the location sequences of users $u_i$ and $u_j$, respectively; $m$ and $n$ represent the numbers of shops visited by the users; $d(sh^i_a, sh^j_b)$ is the Euclidean distance between the vectors corresponding to shops $sh^i_a$ and $sh^j_b$; $\min(a, b, c)$ is a function for obtaining the minimum of the values $a$, $b$, and $c$; $S_{sem}$ is the semantic similarity matrix; and $s^{sem}_{ij}$ represents the semantic similarity between users $u_i$ and $u_j$. After the semantic and spatial similarities between sequences have been calculated, the final location sequence similarity is obtained by combining the two parts, as defined in Formula (5),
where $S_{spa}$, $S_{sem}$, and $S_{SSS}$ represent the spatial similarity matrix, semantic similarity matrix, and spatial-semantic similarity matrix, respectively; $\min(S)$ and $\max(S)$ are functions for obtaining the minimum and maximum values, respectively, in matrix $S$; and $w$ is a weight coefficient that represents the contribution of spatial similarity to the location sequence similarity. The default value of $w$ is 0.5; that is, the contributions of semantic similarity and spatial similarity to the location sequence similarity are equal.
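The semantic part and the final combination can be sketched as follows. The sketch assumes that shop vectors (for example, from node2vec) are already available as numeric tuples, uses the standard DTW recurrence with Euclidean point distances, and combines the two matrices as a weighted sum of min-max-normalised values. Converting the normalised DTW distance to a similarity as one minus the distance is an assumption of this sketch rather than something stated in the text, and all function names are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def dtw_distance(vecs_a, vecs_b):
    """DTW cost between two sequences of shop vectors (lower means more similar)."""
    m, n = len(vecs_a), len(vecs_b)
    inf = float("inf")
    cost = [[inf] * (n + 1) for _ in range(m + 1)]
    cost[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d = dist(vecs_a[i - 1], vecs_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[m][n]


def min_max(matrix):
    """Rescale all entries of a matrix to [0, 1] using its global min and max."""
    flat = [x for row in matrix for x in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:                       # constant matrix: avoid division by zero
        return [[0.0 for _ in row] for row in matrix]
    return [[(x - lo) / (hi - lo) for x in row] for row in matrix]


def combine_sss(spatial, semantic_dist, w=0.5):
    """Weighted sum of normalised spatial similarity and semantic similarity.

    Assumption: semantic similarity = 1 - normalised DTW distance.
    """
    spa = min_max(spatial)
    sem = [[1.0 - x for x in row] for row in min_max(semantic_dist)]
    return [[w * s + (1.0 - w) * t for s, t in zip(row_s, row_t)]
            for row_s, row_t in zip(spa, sem)]
```

With `w = 1.0` the result reduces to the normalised spatial similarity, and with `w = 0.0` to the semantic part alone, which matches the two boundary cases examined when the weight coefficient is calibrated.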

SSS-Based Location Modeling
After the SSS method has been defined, two requirements must be considered when dividing users into groups. First, the number of groups cannot be known in advance. Second, each group needs to have a representative user, whose main role is to help new users determine which model to use. Based on these two points, the affinity propagation (AP) [51] algorithm is used to cluster the location sequences of all users. After clustering, LSTM [52] is used to train the prediction model for the users in each group. The training process of the Indoor-WhereNext framework is shown in Algorithm 2.
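The role of the representative user (exemplar) as the entry point for a new user can be sketched as follows. The sketch assumes that clustering has already produced one exemplar sequence per group and that each group's trained model can be called to return a ranked list of candidate shops. The names `assign_to_exemplar`, `predict_next`, and `overlap`, as well as the toy similarity, are hypothetical stand-ins; in the framework the similarity would be the SSS and the models would be the per-group LSTMs.

```python
def assign_to_exemplar(sequence, exemplars, similarity):
    """Return the id of the group whose exemplar is most similar to the sequence."""
    return max(exemplars, key=lambda group: similarity(sequence, exemplars[group]))


def predict_next(sequence, exemplars, models, similarity, k=5):
    """Route the sequence to its group's model and return (group id, top-k shops)."""
    group = assign_to_exemplar(sequence, exemplars, similarity)
    ranked = models[group](sequence)   # stand-in for a trained per-group model
    return group, ranked[:k]


def overlap(seq_a, seq_b):
    """Toy similarity for illustration only: number of shared shops."""
    return len(set(seq_a) & set(seq_b))
```

A new sequence `["X", "Y"]` compared against exemplars `["A", "B", "C"]` and `["X", "Y", "Z"]` is routed to the second group, and only that group's model is queried.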

SSS-Based Location Prediction
After modeling, there is a one-to-one correspondence between models and exemplars; that is, each model corresponds to the exemplar of the group on which it was trained. Given a new user trajectory, the goal is to determine where the user is likely to visit next. First, the group most likely to be associated with the sequence of visits under consideration is found, and then the corresponding LSTM model is used to predict the most likely next location. The prediction process of the Indoor-WhereNext framework is shown in Algorithm 3. Here, the similarity between the new location sequence and each exemplar is used to determine to which group the new user belongs. In essence, exemplars are specific location sequences, so the similarity between the new location sequence and each exemplar can be calculated directly. Then, the model whose exemplar has the highest similarity is chosen for location prediction.

As shown in Table 1, the data fields included the unique identifier of the user, the record upload time, the user's X,Y coordinates, and the unique identifier of the floor. As shown in Table 2

Data Preprocessing
The indoor users' original trajectory data were collected via Wi-Fi positioning. Due to the instability of the mobile terminal signal and manual shutdown of the Wi-Fi signal, abnormal, erroneous, and invalid data were easily generated. The statistical characteristics of the users' original trajectories are shown in Figure 5. After data preprocessing, a total of 345,824 user trajectories were obtained.
(1) The sampling interval for trajectory points was mostly concentrated between 1 and 5 s, accounting for approximately 82.5% of intervals, but there were still abnormal data with large sampling intervals and with sampling intervals of 0 s. For example, trajectory points with sampling intervals of 0 s accounted for approximately 7.3%. (2) The number of trajectory points contained in a trajectory was between 1 and 7 in most cases, accounting for more than 97% of trajectories. In other words, a large number of trajectories contained only a few trajectory points and could not be used to train the model. In our work, trajectories with fewer than 50 trajectory points were deleted. (3) The time span of the trajectory points recorded in the shopping mall was 24 h; that is, records were generated even during nonbusiness hours for the shopping mall, and these records were invalid.
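The three observations above suggest a simple cleaning pass, sketched below under stated assumptions. Each trajectory is assumed to be a list of `(timestamp, x, y, floor)` tuples with `datetime` timestamps. The threshold of 50 points comes from the text, whereas the business hours of 10:00 to 22:00 are a hypothetical placeholder, because the mall's actual opening hours are not given. The function names are likewise illustrative.

```python
from datetime import datetime, time

OPEN_AT, CLOSE_AT = time(10, 0), time(22, 0)   # hypothetical business hours
MIN_POINTS = 50                                # threshold stated in the text


def clean_trajectory(points, open_at=OPEN_AT, close_at=CLOSE_AT):
    """Drop points outside business hours and points with a 0 s sampling interval."""
    cleaned = []
    for point in sorted(points, key=lambda p: p[0]):
        stamp = point[0]
        if not (open_at <= stamp.time() <= close_at):
            continue                            # recorded during nonbusiness hours
        if cleaned and stamp == cleaned[-1][0]:
            continue                            # same timestamp as the previous kept point
        cleaned.append(point)
    return cleaned


def preprocess(trajectories, min_points=MIN_POINTS):
    """Clean every trajectory and keep only those with enough remaining points."""
    result = {}
    for key, points in trajectories.items():
        cleaned = clean_trajectory(points)
        if len(cleaned) >= min_points:
            result[key] = cleaned
    return result
```

Filtering points before counting them means that a trajectory padded with duplicate or out-of-hours records is judged by its valid points only.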

Evaluation Metrics
In this work, Accuracy@k and Precision@k (top k locations) were used as quantitative indicators to evaluate the model. Accuracy@k evaluates whether the top-k predicted locations contain the real location. Precision@k uses macro-averaging to evaluate the performance of models from the perspective of multi-class classification, which is how the indoor location prediction problem is framed. Accuracy@k and Precision@k are defined in Equations (6) and (7),
where $N$ represents the total number of locations, that is, the number of shops; $TP_i$ represents the number of samples in which the model correctly predicts that a user will visit location $sh_i$; and $FP_i$ represents the number of samples in which the model incorrectly predicts that a user will visit location $sh_i$.
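A minimal sketch of the two indicators is given below. Accuracy@k follows directly from its description. For the macro-averaged Precision@k, the treatment of top-k lists is an assumption of this sketch: a sample counts as a true positive for its real location if that location appears among the top k predictions, and otherwise as a false positive for the top-ranked predicted location. The average is taken over the locations that occur in the evaluated samples rather than over all shops. The exact conventions of Equations (6) and (7) may differ, and the function names are hypothetical.

```python
from collections import defaultdict


def accuracy_at_k(ranked_predictions, truths, k):
    """Fraction of samples whose real location is among the top-k predictions."""
    hits = sum(1 for ranked, real in zip(ranked_predictions, truths)
               if real in ranked[:k])
    return hits / len(truths)


def macro_precision_at_k(ranked_predictions, truths, k):
    """Macro-averaged precision under the top-k convention assumed above."""
    tp = defaultdict(int)
    fp = defaultdict(int)
    for ranked, real in zip(ranked_predictions, truths):
        if real in ranked[:k]:
            tp[real] += 1            # counted as a correct prediction of `real`
        else:
            fp[ranked[0]] += 1       # counted as a wrong prediction of the top-1 shop
    locations = set(truths) | set(fp)
    per_location = [tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
                    for c in locations]
    return sum(per_location) / len(locations)
```

Because every location contributes equally to the macro average, locations with few test samples can pull Precision@k below Accuracy@k when the test set is unbalanced.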

Variable Estimation
The values of the hyperparameters have a considerable impact on the predictive performance of the model. When the values are not suitable, the model exhibits poor prediction performance. In this section, we calibrate the hyperparameters of the framework and analyze their impact on the prediction performance. The main hyperparameters of the Indoor-WhereNext framework are the radius $Eps$, the time window $\Delta T$, the minimum number of points $MinPts$, and the weight coefficient $w$. To determine the optimal hyperparameters of the framework, the control variable method was used to obtain the combination of parameter values with the best prediction accuracy. In the parameter estimation stage, $Eps$, $\Delta T$, and $MinPts$ in the Indoor-STDBSCAN algorithm were determined first. Then, using these values, the weight coefficient $w$ was adjusted to test the influence of semantic and spatial similarities on prediction accuracy.

Calibrating the Parameters of Indoor-STDBSCAN
For the Indoor-STDBSCAN algorithm, we mainly tested how the time window $\Delta T$, that is, the required stay time, influences the prediction accuracy. In the parameter calibration process, the weight coefficient $w$ was fixed to 0.5, and the space radius $Eps$ was fixed to 5 m with reference to the average distance between indoor shops. The best value of the time window $\Delta T$ was searched in {1 min, 3 min, 5 min, …, 13 min}. The minimum number of points $MinPts$ was set to a fixed value determined by the average sampling interval of the data and the time window $\Delta T$. The effect of the time window on the prediction accuracy Accuracy@k is shown in Figure 6. For $k \in \{1,3,5,7,9\}$, Accuracy@k increased initially and then became stable as the time window increased.
Once the time window reached 5 min, the prediction accuracy of the framework did not change much. However, as the time window increased, the number of location sequences tended to decrease; that is, the amount of training data decreased. To ensure both the prediction accuracy and a sufficient amount of training data, the time window was set to 7 min. After the Indoor-STDBSCAN parameters were calibrated, we further filtered out the trajectories with too few stay points. A total of 45,315 trajectories were finally used for the experiment.

Calibrating the Weight Coefficient
The weight coefficient $w$ is used to test the influence of spatial similarity and semantic similarity on prediction accuracy. First, the hyperparameters of the Indoor-STDBSCAN algorithm are fixed. Then, the optimal value of $w$ is searched in {0, 0.1, 0.2, …, 1}. When $w$ is set to 0 or 1, only one of the two similarities is considered. The influence of the weight coefficient $w$ on Accuracy@k is shown in Figure 7. For $k \in \{1,3,5,7,9\}$, Accuracy@k first increased and then decreased as $w$ increased. When $0.3 \le w \le 0.6$, the Accuracy@k of the framework was relatively high. When $w = 0.4$, Accuracy@5 reached 67.6%, an improvement of 17.6 and 24.1 percentage points, respectively, over that with $w = 0$ and $w = 1$. This indicates that both semantic and spatial similarity contributed to the accuracy of the model.

Performance of Indoor-WhereNext
After calibration of the framework parameters, the change in the prediction accuracy of the Indoor-WhereNext framework with the number of iterations was analyzed. The results are shown in Figure 8.
(1) For the training dataset, the prediction accuracy showed a continuous upward trend with the increase in the number of iterations. (2) For the test dataset, the prediction accuracy increased initially, then remained constant and finally decreased as the number of iterations increased. The framework tended to overfit as the number of iterations increased, improving the prediction accuracy of the model in the training dataset while worsening the prediction accuracy in the test dataset.

(3) Comparing Accuracy@k on the test dataset for $k \in \{1,3,5\}$, the prediction accuracy of the model improved greatly as $k$ increased; at Accuracy@5, the prediction accuracy was 67.6%, which is 32.5% and 22.1% higher than Accuracy@1 and Accuracy@3, respectively. However, as $k$ continued to increase, the prediction accuracy of the model increased slowly. Compared with Accuracy@5, Accuracy@7 and Accuracy@9 only increased by 0.9% and 1.5%, respectively, because the next destination of a user in the mall is often one of a set of candidate shops rather than a single specific shop, and the choice within that set has a certain randomness.

Comparison with Baselines
To assess the effectiveness of the proposed Indoor-WhereNext framework, it was compared with three baseline prediction models: the hidden Markov model (original-HMM), the improved hidden Markov model (improved-HMM), and the LSTM model (original-LSTM). Of these, original-HMM and original-LSTM use the shop sequences directly to build a model to predict the next location. Improved-HMM replaces LSTM in the Indoor-WhereNext framework with HMM and builds models based on the SSS to predict the next location. The prediction accuracy of HMM is related to the number of states. In the comparison experiment, the number of states in HMM was set to 10, 20, 30, and 40. Figure 9 shows the prediction accuracy of the four models. Because the original-HMM and original-LSTM models treat location prediction purely as a time series modeling problem, they ignore the influence of the similarity between location sequences on the location prediction. Therefore, their predictive performance was worse than that of improved-HMM and the proposed Indoor-WhereNext framework. In particular, when the number of states in HMM was 10, the Accuracy@5 of the Indoor-WhereNext framework was 31.2% higher than that of original-HMM and 23.8% higher than that of original-LSTM. Improved-HMM accounts for the similarities between location sequences and builds a model for similar users. However, even when the number of states in HMM was 40, the Accuracy@1, Accuracy@3, and Accuracy@5 values of the Indoor-WhereNext framework were still 3.2%, 2.5%, and 13.8% higher, respectively, than those of improved-HMM. The reason is that the Indoor-WhereNext framework uses the LSTM model to model the location sequences, which makes it easier to capture the movement patterns in a long location sequence. In general, the Indoor-WhereNext framework greatly improved indoor location prediction by enhancing the Accuracy@1 by between 3.2% and 15.1%, the Accuracy@3 by between 2.4% and 18.3%, and the Accuracy@5 by between 13.8% and 31.9%.
Figure 10 compares the prediction precision of the four models. As in the case of accuracy, the precision of the framework is improved by 7-27.3%, 17.8-20.9%, and 6.9-14.7% compared with the baseline experiments. In particular, when k = 5, the prediction precision of the model was 61.6%, which is 6 percentage points lower than the accuracy of the framework. This reduction can be attributed to the fact that the precision indicator regards location prediction as a multi-class classification problem, and the test samples in each class are unbalanced, resulting in a slight reduction in the precision. Overall, the Indoor-WhereNext framework significantly outperforms the three existing baseline methods in terms of prediction precision.

Conclusions and Future Work
The Indoor-WhereNext framework was proposed for indoor location prediction. First, considering the three-dimensional characteristics and the relative error of indoor trajectories, the Indoor-STDBSCAN algorithm was proposed in order to identify the stay points of the indoor user and convert the user trajectory into a location sequence, thereby overcoming the problem that it is difficult to identify indoor stay points using the existing methods. Then, considering the spatial and semantic similarities of location sequences, the SSS method was defined to obtain the similarity matrix between location sequences. Finally, the AP algorithm was used to obtain similarity user groups based on the similarity matrix, and the groups were used to train different prediction models to improve the accuracy of location prediction.
In the experimental section, a two-week period of real indoor trajectories was used to verify the efficiency of the proposed framework. First, the control variable method was used to obtain the combination of parameter values with the best prediction accuracy. When the optimal parameters were used, Accuracy@5 reached 67.6%. Then, a comparison with three existing baseline methods was conducted. Compared with original-HMM, original-LSTM, and improved-HMM, the proposed framework delivered improved accuracy and precision, with Accuracy@5 increasing by 31.2%, 23.8%, and 13.8%, and Precision@5 increasing by 27.3%, 20.9%, and 14.7%, respectively. This demonstrates the efficiency of the Indoor-WhereNext framework.
The following aspects can potentially be investigated further in future work: (1) further validation of the proposed framework with more types of data such as hospital indoor trajectories and airport indoor trajectories, (2) comprehensive comparison with other location prediction models such as ST-RNN and Markov chain, (3) comparison of the models with more comprehensive evaluation indicators, and (4) integration of more factors into Indoor-WhereNext to achieve a more robust model that further improves the accuracy of indoor location prediction.
Author Contributions: Peixiao Wang contributed to data preprocessing, the experiment, and the writing of the manuscript; Sheng Wu gave advice on the experimental discussion and materials; Hengcai Zhang formulated the general research idea and contributed to writing the manuscript; Feng Lu contributed to the manuscript revision.