Mobile User Location Inference Attacks Fusing with Multiple Background Knowledge in Location-Based Social Networks

Abstract: Location-based social networks have been widely used. However, due to the lack of effective and safe data management, privacy disclosures commonly occur. Thus, academia and industry need to focus more on location privacy protection. This paper proposes a novel location attack method that uses multiple kinds of background knowledge to infer the hidden locations of mobile users. To estimate the probability that a user visited a hidden location, two hidden location attack models are proposed, i.e., a Bayesian hidden location inference model and a multi-factor fusion based hidden location inference model. Multiple background factors, including check-in sequences, temporal information, user social networks, personalized service preferences, and point of interest (POI) popularity, are considered in the two models. Moreover, a hidden location inference algorithm is provided. Finally, a series of experiments is conducted on two real check-in datasets to evaluate the accuracy of the models and verify the validity of the proposed algorithm. The experimental results show that fusing multiple kinds of background knowledge improves location inference precision.


Introduction
With the rapid development of mobile intelligent terminals and communication equipment, Location-Based Social Networks (LBSN) have become popular, putting information and friends within easy reach. Users can use LBSN to share their locations (usually referred to as a "check-in") with their friends, leave comments, and find coupons [1,2]. Many of these services have already been adopted by millions of users, and the number keeps growing steadily. However, due to the lack of effective and safe data management, users' privacy is at risk of leakage. Many research efforts have been devoted to investigating location privacy protection [3–6].
The simplest method to protect a user's location privacy in LBSN is to make sure that users do not check in at places they regard as sensitive (e.g., churches, hospitals, or bars) after visiting them. However, by linking multiple kinds of background information, adversaries can still infer the hidden locations that a user has visited without checking in. Reference [7] refers to this kind of attack as a hidden location inference attack.
Here, we use an example to illustrate hidden location inference attacks. As shown in Figure 1, a user $u$ moves to the location $l_{j+1}$ after checking in at the location $l_j$ on the road network. On the path from $l_j$ to $l_{j+1}$, $u$ visited a bar $l_m$ but did not check in at $l_m$ because of privacy concerns. However, if the attacker knows that $u$ is a nightclub enthusiast, the attacker can infer that $u$ visited the bar $l_m$ with a high probability once $u$ checks in at the location $l_{j+1}$ [7]. The user $u$ expects to avoid checking in at any sensitive location, but cannot know whether his/her future check-in behavior will lead to the disclosure of his/her hidden locations. There are two challenges in resolving this problem. One is how to find out whether there is potential hidden location leakage with a suitable check-in precision while retaining a good user experience; the other is how to measure diverse background knowledge. Reference [7] proposed a privacy alert mechanism to protect against hidden location inference attacks. However, it employs only geographical and social information and ignores much other background knowledge, which limits the inference accuracy.
Mathematics 2020, 8, 262; doi:10.3390/math8020262 www.mdpi.com/journal/mathematics
After analyzing the LBSN service, we observed that attackers can collect background knowledge from two areas: the users' behavior characteristics (i.e., historical check-in sequences, personalized point of interest (POI) preferences, and social networks) and the features of the physical world (i.e., geographical locations and POI popularity). We elaborate on each kind of background knowledge in turn:
(1) The historical check-in sequence. The check-in sequence consists of the user's check-in locations sorted by timestamp, so it reflects the user's behavior. For example, suppose the historical check-in records contain 10 check-in sequences that include both $l_j$ and $l_{j+1}$, and the user checked in at $l_m$ between $l_j$ and $l_{j+1}$ in 3 of them. We then estimate that the user will visit $l_m$ on the way from $l_j$ to $l_{j+1}$ with probability 0.3 (= 3/10) in the future.
(2) Personalized POI preferences. In LBSN, each POI is labeled with several service categories. Generally, users' preferences for the service categories differ. For example, nightclub enthusiasts prefer to visit different bars, while travel enthusiasts like to visit different tourist attractions. Therefore, the personalized preference for a service category can also be used to infer the probability of the user visiting the hidden location $l_m$.
(3) Social networks. Generally, the behaviors of friends are similar. Users often wander around various streets with their friends and go to a good restaurant or a shopping mall, and a user is more likely to go to a place recommended by his/her friends. Suppose that a friend $u^*$ of user $u$ always checks in at $l_m$. Then, although $u$ does not check in at $l_m$ while moving from $l_j$ to $l_{j+1}$, the attackers can still infer the likelihood that $u$ visited $l_m$ from the user similarity between $u$ and $u^*$.
(4) Geographical location. In general, the geographical proximity of POIs has a significant impact on the user's check-in behavior. The probability of visiting $l_{i+1}$ after checking in at $l_i$ depends on the distance between the two POIs. For example, users usually go to a nearby mall or movie theater for convenience. In addition, the reachability of a position can be used to prune a sensitive hidden location. Specifically, if the user's travel time between the two locations $l_j$ and $l_{j+1}$ is less than the minimum time required to travel between them via $l_m$, then the user $u$ certainly cannot have visited $l_m$. That is, the location $l_m$ is not reachable.
(5) POI popularity. If a POI is popular, it is more attractive to a user, so the probability of users visiting this POI is high. Therefore, we can also use the popularity of POIs to infer the visiting probability.
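The frequency estimate in item (1) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `transit_frequency`, the POI identifiers, and the rule of matching the first occurrence of each location in a sequence are assumptions made for the example, not details given in the paper.

```python
from typing import List


def transit_frequency(sequences: List[List[str]], l_j: str, l_next: str, l_m: str) -> float:
    """Fraction of historical l_j -> l_next trips with a check-in at l_m in between."""
    trips = 0   # sequences that contain l_j followed later by l_next
    via_m = 0   # those trips that also contain l_m between the two
    for seq in sequences:
        if l_j not in seq:
            continue
        start = seq.index(l_j)
        if l_next not in seq[start + 1:]:
            continue
        end = seq.index(l_next, start + 1)
        trips += 1
        if l_m in seq[start + 1:end]:
            via_m += 1
    return via_m / trips if trips else 0.0


# Toy history mirroring the example: 10 trips from l_j to l_{j+1}, 3 of them via l_m.
history = [["l_j", "l_m", "l_j+1"]] * 3 + [["l_j", "l_j+1"]] * 7
p_visit = transit_frequency(history, "l_j", "l_j+1", "l_m")
print(p_visit)
```

With the toy history above, the estimate reproduces the 3/10 ratio used in the example.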
Employing the above background knowledge, we propose two hidden location inference attack models, namely WBI (weighted Bayesian hidden location inference model) and HLPI (hidden location prediction inference model through multi-factor fusion), to infer the hidden locations of a user. WBI infers the visiting probability of the hidden locations with a weighted Bayesian model built on the user's historical check-in sequences; the user's social network, the personalized POI preference, and the POI popularity are used as weights. HLPI is a hidden location inference model with multi-information fusion. The prior probability of the hidden location is computed from the geographic location, the social network, and the POI popularity, and the posterior probability is then calculated according to the location reachability. Based on the two inference models, a hidden location inference attack algorithm is further given. According to the user's current check-in location $l_{j+1}$ and the previous check-in location $l_j$, the algorithm infers the probability that the user visited each hidden location. The most probable leaked hidden locations are pushed to the user. We use an advance alert mechanism to remind the user that his/her privacy is in danger, and the user can then make a decision according to his/her privacy preferences.

Literature Review
In the early stage, to protect location privacy, Gruteser and Grunwald [8] proposed a location k-anonymity model developed from the well-known k-anonymity concept; that is, a location is made indistinguishable by using the location information of at least k − 1 other users. Since then, a great deal of research effort [9–13] has been devoted to investigating location privacy protection. The existing location privacy protection mechanisms can be classified into a priori protection and a posterior screening [14]. In an a priori protection mechanism, the actual location is replaced with an obfuscated location before the location is released. The mainstream ideas for location obfuscation include generalization, cryptography, generating dummies, and adding noise. In contrast, in an a posterior screening mechanism, the user's location is protected from the perspective of the service result, assuming that a service request is answered. Our work focuses on a location attack method that infers the hidden locations of mobile users, and employs a warning mechanism to protect privacy. Thus, our work falls into the category of a posterior screening.
Privacy is an important issue when users consider using LBSN. Thus, location privacy in LBSN keeps attracting attention from both academia and industry. References [15,16] focus on helping users manage privacy by letting a user set location sharing privacy preferences. Reference [17] introduced a machine learning approach to control the sharing policy. On the attack model side, absence privacy attacks, nearby friends attacks, and dynamic location inference attacks have been studied widely.
Reference [18] is the first work that studied an absence privacy protection model in LBSN. Absence privacy is a special kind of location privacy, concerning the case in which an attacker can learn that a user is not at a position during a period of time. Reference [18] proposed the algorithm WYSE (Watch Your Social stEp) based on spatial and temporal generalization. However, since this method needs to postpone releasing the generalized location, the quality of service (QoS) decreases. To address the absence privacy problem, Reference [19] proposed a POI-based absence privacy protection algorithm.
Nearby friends services, one of the important types of LBSN services, can also lead to the disclosure of user locations [14]. References [14,20] proposed location privacy protection methods against these nearby friends attacks. In addition, two location privacy protection algorithms based on symmetric encryption are proposed in [20]. Reference [21] proposed a method to infer the location of users in LBSN, which employs the friends' location and attribute information. A dynamic Bayesian network model is used to calculate the user's visiting probabilities and is trained on a real check-in dataset.
Dynamic location inference attacks include hidden location inference attacks, target location inference attacks, and continuous location attacks. The work most closely related to ours is Reference [7], where a hidden location inference attack model is proposed. Reference [7] puts forward four inference algorithms, which are based on simple Bayesian inference, collaborative filtering, and a Markov inference model, respectively. Compared with Reference [7], the method proposed in this paper fuses more diverse background knowledge, including the location check-in sequence, temporal factors, user social networks, personalized service preference, and the popularity of POIs. Reference [22] proposed a customizable and continuous privacy-preserving check-in data publishing framework that obfuscates user check-in data; its protection mechanism differs from ours. Reference [23] proposed a novel destination prediction attack and a corresponding location privacy protection method; its research target differs from ours. Hidden location inference aims to protect POIs visited in the past rather than to predict risk in the near future.

Definition 1 (Check-in sequence). A user's check-in sequence $CS$ is a sequence of POIs sorted by timestamps.
That is, $CS = \{u_i, (l_1, t_1), \ldots, (l_j, t_j), \ldots, (l_n, t_n)\}$, where $u_i$ is the user's identity, $l_j = (x_j, y_j)$ is the user's check-in location given as a pair of longitude and latitude, and $t_j$ is the check-in timestamp.

Definition 2 (Hidden Location).
Given a user's two check-in locations $l_j$ and $l_{j+1}$ at $t_j$ and $t_{j+1}$ respectively, the hidden location $l_m$ is a POI that was visited but not checked in at time $t_m$ ($t_j < t_m < t_{j+1}$) on the path from $l_j$ to $l_{j+1}$.

Definition 3 (Check-in Matrix).
Given the historical check-in records of LBSN, we get a check-in matrix $W_{|U| \times |L|}$, where $U$ and $L$ denote the user set and the POI set, respectively. An entry $w_{u,l}$ in $W_{|U| \times |L|}$ is the number of times the user $u$ has checked in at location $l \in L$.

Definition 4 (Social Matrix).
A user's social matrix $F_{|U| \times |U|}$ can be obtained from LBSN. If $u$ and $u'$ are friends, the entry $f_{u,u'}$ is 1; otherwise $f_{u,u'}$ is 0.
In practice, the social matrix is sparse, since most of the elements in $F_{|U| \times |U|}$ are zero. The more common friends and common check-in locations two users share, the more similar they are, and the more likely they are to visit the same place [24]. Therefore, the user similarity is defined from the perspective of the users' check-in behavior and their social networks.

Definition 5 (User Similarity Matrix).
Given the check-in matrix $W_{|U| \times |L|}$ and the social matrix $F_{|U| \times |U|}$, an entry $s_{u,u'}$ of the user similarity matrix $S_{|U| \times |U|}$ is defined by formula (1), where $\theta$ is a system parameter in $[0, 1]$, $F_u$ represents the friend set of user $u$, and $L_u$ represents the check-in locations of user $u$. Obviously, $0 \le s_{u,u'} \le 1$. The larger the entry $s_{u,u'}$ is, the more similar the two users are.
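Formula (1) itself is not reproduced in this text, so the following Python sketch only illustrates one form that is consistent with the description: a $\theta$-weighted combination of the Jaccard overlap of the friend sets $F_u$ and the Jaccard overlap of the check-in location sets $L_u$. This exact form is an assumption, as are the helper names `jaccard` and `user_similarity`; the value is bounded in $[0, 1]$, as Definition 5 requires.

```python
from typing import Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap |a ∩ b| / |a ∪ b|, defined as 0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def user_similarity(friends_u: Set[str], friends_v: Set[str],
                    locs_u: Set[str], locs_v: Set[str],
                    theta: float = 0.5) -> float:
    """Assumed form: theta * friend overlap + (1 - theta) * check-in overlap."""
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    return theta * jaccard(friends_u, friends_v) + (1.0 - theta) * jaccard(locs_u, locs_v)


# Toy data: two of four distinct friends in common, one of three distinct POIs in common.
s = user_similarity({"a", "b", "c"}, {"b", "c", "d"}, {"l1", "l2"}, {"l2", "l3"}, theta=0.5)
print(round(s, 4))
```

Setting `theta` closer to 1 makes the social graph dominate, while values closer to 0 emphasize shared check-in locations.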

Definition 6 (Service Preference Matrix).
Given check-in records and POI categories, the entry $c_{u,c}$ in the user's service preference matrix $C_{|U| \times |C|}$ is the category frequency, i.e., how often the user $u$ visits a POI of category $c \in C$ [25]. Specifically, $c_{u,c}$ equals the number of check-ins by $u$ at POIs of category $c$ divided by the total number of check-ins by $u$. It is worth noting that a POI can be labeled with multiple categories.
Example 1. Figure 1 shows three users' check-in records, and each POI is associated with several service categories. The user $u_1$ checks in at the locations $\{l_1, l_2, l_3\}$. According to the service categories of the POIs, $l_1$ is labeled with $\{a, b, c\}$, $l_2$ is labeled with $\{a, c, d\}$, and $l_3$ is labeled with $\{b, c\}$. Thus, by Definition 6, the categorical preference of user $u_1$ is $\{2/3, 2/3, 1, 1/3\}$ for the service categories $\{a, b, c, d\}$, respectively.
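The arithmetic of Example 1 can be checked with a short Python sketch. The function name `category_preference` and the dictionary layout are illustrative choices, not part of the paper; exact fractions are used so the result can be compared directly with the values above.

```python
from collections import Counter
from fractions import Fraction
from typing import Dict, List, Set

# Category labels of the three POIs, as given in Example 1.
poi_categories: Dict[str, Set[str]] = {
    "l1": {"a", "b", "c"},
    "l2": {"a", "c", "d"},
    "l3": {"b", "c"},
}
checkins_u1: List[str] = ["l1", "l2", "l3"]


def category_preference(checkins: List[str],
                        categories: Dict[str, Set[str]]) -> Dict[str, Fraction]:
    """c_{u,c}: check-ins at POIs labeled c, divided by the user's total check-ins."""
    counts: Counter = Counter()
    for poi in checkins:
        counts.update(categories[poi])  # a POI contributes once to each of its labels
    total = len(checkins)
    return {c: Fraction(n, total) for c, n in counts.items()}


pref_u1 = category_preference(checkins_u1, poi_categories)
for c in sorted(pref_u1):
    print(c, pref_u1[c])
```

Because a POI may carry several labels, the preferences of one user need not sum to 1.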

Definition 7 (Popularity Matrix).
The popularity matrix is denoted as $P_{|C| \times |L|}$. The element $p_{c,l}$ in $P_{|C| \times |L|}$ represents the popularity of the POI $l$ from the view of category $c$. Specifically, $p_{c,l}$ is the number of check-ins at POI $l$ divided by the total number of check-ins at all POIs with the label $c$ [25].
Example 2. Continuing with the above example in Figure 1, $l_1$ and $l_2$ are both labeled with the category $a$. From Figure 1, POIs of category $a$ have been checked in four times: $l_1$ twice and $l_2$ twice. Thus, the popularity of $l_1$ with the service category $a$ is 1/2. In the same way, the popularity of $l_1$ with the other service categories (i.e., $b$, $c$, and $d$) can be calculated. Thus, the popularity of $l_1$ with the service categories $\{a, b, c, d\}$ is $\{1/2, 2/5, 2/7, 0\}$, respectively.
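Example 2 can be reproduced with the sketch below. The per-POI check-in totals (2, 2, and 3 for $l_1$, $l_2$, and $l_3$) are not stated directly in the text; they are inferred here from the fractions 1/2, 2/5, and 2/7 and should be read as an assumption, as should the function name `popularity`.

```python
from fractions import Fraction
from typing import Dict, Set

poi_categories: Dict[str, Set[str]] = {
    "l1": {"a", "b", "c"},
    "l2": {"a", "c", "d"},
    "l3": {"b", "c"},
}
# Assumed total check-ins per POI over all users (inferred from Example 2).
poi_checkins: Dict[str, int] = {"l1": 2, "l2": 2, "l3": 3}


def popularity(poi: str, category: str,
               checkins: Dict[str, int],
               categories: Dict[str, Set[str]]) -> Fraction:
    """p_{c,l}: check-ins at POI l over check-ins at all POIs labeled c."""
    if category not in categories[poi]:
        return Fraction(0)  # the POI does not carry this label
    total = sum(n for p, n in checkins.items() if category in categories[p])
    return Fraction(checkins[poi], total) if total else Fraction(0)


pop_l1 = {c: popularity("l1", c, poi_checkins, poi_categories) for c in "abcd"}
print(pop_l1)
```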

WBI: Weighted Bayesian Hidden Location Inference Model
The simple Bayesian model can be used to derive the visiting probability of a hidden location from the user's historical check-in records. Given a hidden location $l_m$ and the upper bound $\Delta t$ of the time difference between two check-in locations, the probability of the user visiting the hidden location $l_m$ can be calculated by formula (2) [7], in which $\Delta s$ is the check-in time difference between the two locations $l_j$ and $l_{j+1}$. We observe that many factors can affect the user's visiting behavior, including personalized service preferences, POI popularity, and social networks. Intuitively, a popular POI is more attractive to a user, so the probability of users visiting a popular location tends to be high. In addition, Reference [7] verified that friends have an important influence on each other's behaviors. Therefore, we propose a weighted Bayesian hidden location inference model, WBI, in which these factors are used as weights in the simple Bayesian model. As a result, the user similarity, the personalized service preference, and the POI popularity are merged to infer the hidden location visiting probability, as given by formula (3). In formula (3), $u'$ and $u$ are friends, $s_{u,u'}$ is the user similarity between $u'$ and $u$ calculated by Definition 5, $c_{u,c}$ is the service preference of the user $u$ for the category $c$ calculated by Definition 6, and $p_{c,l}$ is the POI popularity calculated by Definition 7.

HLPI: Hidden Location Inference Model Based on Multi-Factor Fusion
The probability of a user visiting a hidden location is influenced by various kinds of background knowledge. Finding a proper model to fuse this background knowledge is a challenging task. Inspired by existing work on location recommendation [25,26], we propose a hidden location inference model, HLPI. The basic idea is as follows. We first calculate the prior probability of visiting the hidden location $l_m$ from the geo-location association, the social association of the user, and the popularity and category of the POI. Then, the user's posterior probability of visiting the hidden location is computed from the prior probability and the reachability between the two locations. The posterior probability indicates the likelihood that the user visited the hidden location.
In general, the geo-location proximity between POIs has a significant impact on users' check-in behavior [25]. Two POIs that are close to each other are more likely to be visited consecutively than two that are far apart. For example, a user usually goes to a nearby movie theater after shopping in a mall. Therefore, the visiting probability $P(x_{u,l})$ for the hidden locations can be calculated by analyzing the geographical association between the checked-in POIs and the hidden locations.
In real life, the user's social relationships also have an impact on the user's behavior. For example, a user is likely to visit a POI (such as a restaurant or a shopping mall) that is recommended by his/her friends. That is, the more checked-in POIs two friends have in common, the more likely they are to visit the same location. We use the user's social relationships as another factor to infer the probability $P(y_{u,l})$ of a user visiting a hidden location $l$.
In addition, personalized preferences and the popularity of POI categories also influence the user's check-in behavior. For example, nightclub enthusiasts often hang out in different bars, and travel enthusiasts are more likely to travel to different tourist attractions. Meanwhile, more popular POIs are more attractive. Therefore, the personalized preference for the service category and the popularity of POI categories are also used to infer the visiting probability $P(z_{u,l})$ of a hidden location $l$.
Reference [25] proves that the above three factors are independent of each other and identically distributed. Therefore, we integrate the factors by multiplying the three probabilities. The prior probability of the user visiting the hidden location $l_m$ is calculated as formula (4):
$$P^{prior}_{u,l} = P(x_{u,l}) \cdot P(y_{u,l}) \cdot P(z_{u,l}) \tag{4}$$
When a user checks in at positions $l_j$ and $l_{j+1}$ consecutively, the posterior probability of the user visiting the hidden location $l_m$ is calculated by formula (5). Combining formula (4) and formula (5), the posterior probability of visiting a hidden location is given in formula (6):
$$P_{u,l} = \sqrt[3]{P^{prior}_{u,l}} \cdot P(\Delta s \le \Delta t) = \sqrt[3]{P(x_{u,l}) \cdot P(y_{u,l}) \cdot P(z_{u,l})} \cdot P(\Delta s \le \Delta t) \tag{6}$$
Note that the cube root of the prior probability is used in order to balance the effect of the prior probability and the temporal probability.
Specifically, $P(x_{u,l})$ is estimated using the adaptive kernel estimation method. The ideas for computing $P(y_{u,l})$ and $P(z_{u,l})$ are as follows. We first obtain probability density functions from the distributions of the check-in frequency under the influence of the user association and the category popularity, respectively. Then, the probability density functions are used to calculate the cumulative distributions so as to obtain the corresponding probability estimates [25].
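As a minimal numeric sketch of formula (6), the function below combines three already-estimated factor probabilities with the temporal term $P(\Delta s \le \Delta t)$. How each factor is estimated (kernel estimation, cumulative distributions) is outside this sketch; the function name `hlpi_posterior` and the input values are illustrative assumptions.

```python
def hlpi_posterior(p_geo: float, p_social: float, p_category: float,
                   p_reachable: float) -> float:
    """Formula (6): cube root of the prior (formula (4)) times P(delta_s <= delta_t)."""
    for p in (p_geo, p_social, p_category, p_reachable):
        if not 0.0 <= p <= 1.0:
            raise ValueError("all inputs must be probabilities in [0, 1]")
    prior = p_geo * p_social * p_category      # formula (4)
    return prior ** (1.0 / 3.0) * p_reachable  # formula (6)


# Made-up factor values for one candidate hidden location.
score = hlpi_posterior(p_geo=0.5, p_social=0.4, p_category=0.2, p_reachable=0.9)
print(round(score, 4))

# A location that cannot be reached in the observed time gets score 0.
print(hlpi_posterior(0.5, 0.4, 0.2, p_reachable=0.0))
```

The cube root is the geometric mean of the three factors, which keeps the prior on a scale comparable with the single temporal term instead of letting a product of three small numbers dominate the result.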

Hidden Location Inference Attack Algorithm
In order to protect against hidden location leakage, we precede the check-in operation with a warning message, which alerts the user when his/her privacy is at risk. We employ the client, authorization server, and LBSN server system architecture [7]. The specific system procedure is as follows:
(1) The set $SS_u$ of categories that the user $u$ regards as sensitive is saved in the authorization server, which is trusted. When the user $u$ wants to use the check-in service, $u$ sends the check-in request with the pre-check-in location $l_{j+1}$ at time $t_{j+1}$ to the authorization server.
(2) When the authorization server receives a user's check-in request, Algorithm 1 is executed. The hidden locations between the user's previous check-in location $l_j$ and the pre-check-in location $l_{j+1}$ are computed, and the inferred hidden POIs are sorted by their computed visiting probabilities. The authorization server sends a privacy warning message to $u$ when the categories of the hidden POIs fall into the sensitive category set $SS_u$. The most probable POIs whose category is sensitive are pushed to $u$ in the warning message, which asks whether the user still wants to check in at POI $l_{j+1}$ at time $t_{j+1}$.
(3) The user makes the choice. If the user still wants to check in at location $l_{j+1}$, the authorization server forwards the check-in request to the LBSN server. Otherwise, the authorization server drops the check-in request, meaning that the check-in service is sacrificed while the user's privacy is protected.
Algorithm 1 shows the hidden location inference algorithm. First, given the user's pre-check-in location $l_{j+1}$ and the previous check-in location $l_j$, the shortest path and the popular path $\{SP_2\}$ between the two locations are computed on the road network. We employ the methods in Reference [7] to get the set of hidden locations $\{L_m\}$ from $\{SP_2\}$. If the WBI attack model is applied, formula (3) is used to calculate the visiting probability for each position $l_m$ in $\{L_m\}$. If the HLPI attack model is used, formula (6) is used to compute the visiting probability. The matrices used in formula (3) and formula (6) are initialized by aggregating the users' historical check-in data. Finally, a set of pairs $\{\langle l_m, p_m \rangle\}$ of hidden locations and visiting probabilities is returned.
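The scoring and alert steps described above can be sketched as follows. This is not the paper's Algorithm 1: the path computation and candidate extraction from Reference [7] are replaced by a precomputed candidate list, the scoring model is passed in as a plain function, and all names (`infer_hidden_locations`, `top_k`, the toy POIs and scores) are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

Scored = List[Tuple[str, float]]


def infer_hidden_locations(candidates: Iterable[str],
                           score: Callable[[str], float],
                           poi_categories: Dict[str, Set[str]],
                           sensitive: Set[str],
                           top_k: int = 3) -> Tuple[Scored, Scored]:
    """Score candidate hidden POIs and pick the sensitive ones for the warning.

    `score` stands in for formula (3) (WBI) or formula (6) (HLPI).
    Returns all <l_m, p_m> pairs sorted by probability, plus the top_k
    pairs whose POI category intersects the user's sensitive set SS_u.
    """
    ranked = sorted(((l_m, score(l_m)) for l_m in candidates),
                    key=lambda pair: pair[1], reverse=True)
    alerts = [(l_m, p) for l_m, p in ranked
              if poi_categories.get(l_m, set()) & sensitive]
    return ranked, alerts[:top_k]


# Toy candidates on the path between l_j and l_{j+1}, with made-up scores.
toy_categories = {"bar_1": {"bar"}, "mall_1": {"shopping"}, "clinic_1": {"hospital"}}
toy_scores = {"bar_1": 0.42, "mall_1": 0.61, "clinic_1": 0.08}

ranked, alerts = infer_hidden_locations(
    candidates=toy_scores, score=toy_scores.get,
    poi_categories=toy_categories, sensitive={"bar", "hospital"}, top_k=1)
print(ranked)
print(alerts)
```

Passing the scoring model as a function keeps the alert logic identical for WBI and HLPI, which mirrors how the text describes a single algorithm with two interchangeable models.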

Setting
We use two real datasets to verify the efficiency and the effectiveness of the two proposed models in our experiments. One is the check-in data from Foursquare [27], used with the road network of New York. The other is the check-in data from Yelp [25], used with the Phoenix road network. Table 1 lists the statistics of the two datasets. We select 5000 users randomly. Since real hidden locations are invisible in check-in datasets, we make an assumption similar to that in Reference [7], namely that a user visits a POI if and only if he/she checks in at the POI. A hidden location dataset is generated as follows: given a user $u_k$, a hidden location set is generated by randomly marking off $l$ ($5 \le l \le 25$) POIs at which $u_k$ has checked in. After the POIs are marked off, the time interval between two consecutive check-in POIs cannot be larger than five times the average time interval. The hidden location attack models and algorithms proposed in this paper are all implemented in MATLAB and run on a Windows 7 machine with a 2.4 GHz processor and 4 GB of memory. The proposed models WBI and HLPI are compared with WFI and CFI from Reference [7]. WFI is a Bayesian inference method that takes only the friend similarity of users into account. CFI is a user-based collaborative filtering inference model that infers the hidden locations of a user from the users' check-in similarity.

Accuracy
We evaluate the accuracy of the proposed inference models from two aspects, i.e., precision and recall [28–32]. Precision refers to the percentage of the hidden locations returned by a model that are marked-off POIs, and recall refers to the percentage of all marked-off POIs that are correctly identified by the model:
$$\text{Precision} = \frac{\text{number of true hidden locations returned from the model}}{\text{total number of locations returned from the model}} \tag{7}$$
$$\text{Recall} = \frac{\text{number of true hidden locations returned from the model}}{\text{total number of marked-off locations}} \tag{8}$$
Figure 2 shows the accuracy of HLPI, WBI, WFI, and CFI with different numbers of hidden locations. As we can see from Figure 2, HLPI has the highest precision and recall among the four models on both the Foursquare and the Yelp dataset. That is because HLPI fuses more background knowledge than the other three models, including the user check-in records, the geographical location association, the user similarity, and the popularity of POIs. WBI comes second. The accuracy of WBI is higher than that of WFI, since WBI improves on WFI by taking the user category preferences and the POI popularity into consideration. On both check-in datasets, the precision and the recall of CFI are the lowest because the check-in matrices are sparse.
Figure 3 shows how the accuracy of WBI changes with different background knowledge under different numbers of hidden locations. Specifically, the model is denoted as Su when only the friend similarity is considered in formula (3), as Cu when only the user service category preference is considered, and as Pc when only the POI category popularity is considered. From Figure 3, the accuracy of the four methods increases with the number of hidden locations. When the three kinds of background knowledge are fused together, i.e., in the full WBI model, the accuracy is the highest. In addition, the accuracy of Cu is higher than that of Pc and Su in Figure 3, on both the Foursquare dataset and the Yelp dataset. This indicates that the user's personal service category preferences have the greatest impact on the probability of a user visiting a hidden location. When the number of hidden locations is small, WBI is disturbed by users' uncommon check-in behavior, so the precision and recall are low. As the number of hidden locations increases, more true hidden locations are returned, and both precision and recall increase.
Similarly, Figure 4 shows how the accuracy of the HLPI model changes as the number of hidden locations increases when different background knowledge is considered. Geo uses only the geo-location association in formula (6), a second variant takes only the user social association into account, and Ca considers only the popularity of POI categories. As shown in Figure 4, the accuracy of the four methods increases with the number of hidden locations. Moreover, the precision and recall of the HLPI, which fuses user social network and POI popularity together, are the best. Comparing the experimental results on the two datasets, we find that on Foursquare the influence of the user social network on the accuracy is more obvious than that of the geo-location association and the popularity of POIs, whereas the opposite is the case on Yelp. This implies that a user's check-in behavior is affected by various aspects, including the user social network, the geographic location, and the POI category popularity. In order to ensure the prediction accuracy of hidden location inference, various kinds of background knowledge should be considered when calculating the visiting probability of a user.
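Formulas (7) and (8) amount to two set ratios over the locations a model returns and the marked-off POIs. The sketch below uses toy identifiers, and the name `precision_recall` is an illustrative choice.

```python
from typing import Set, Tuple


def precision_recall(returned: Set[str], marked_off: Set[str]) -> Tuple[float, float]:
    """Precision (7) and recall (8) of the returned hidden locations."""
    hits = len(returned & marked_off)  # true hidden locations that were returned
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(marked_off) if marked_off else 0.0
    return precision, recall


# Toy case: 4 locations returned, 3 POIs marked off, 2 in common.
precision, recall = precision_recall({"l1", "l2", "l3", "l4"}, {"l2", "l4", "l5"})
print(precision, round(recall, 4))
```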

Effectiveness
This section evaluates the efficiency of Algorithm 1 under different numbers of hidden locations. The number of hidden locations increases from 5 to 25. Algorithm 1 is denoted as HLIA when HLPI is used and as WBIA when WBI is used. Figure 5 shows that the average processing time of the two algorithms increases with the number of hidden locations. On both datasets, the average processing time of HLIA is higher than that of WBIA. This is because HLIA needs to calculate the visiting probability from the geographical location, the social network of the user, the popularity of the POI, and the POI category separately and then fuse the four aspects together, whereas WBIA only needs to evaluate formula (3).

Conclusions
Location-based social networks have been widely used, and as a result privacy leakage and protection are attracting more and more attention from researchers. Leakage of hidden locations is especially dangerous to mobile users, since the users deliberately intend to hide these locations. This paper focuses on hidden location inference attacks against mobile users when the attackers obtain various types of background knowledge. Considering location check-in records, reachability between two check-in locations, social networks, personalized service category preference, and POI popularity, we propose two hidden location inference models and a hidden location inference attack algorithm. Finally, the accuracy of the models and the efficiency of the algorithm are evaluated using two real check-in datasets. The experimental results show that the prediction accuracy of the HLPI model is better than that of WBI, while the efficiency of HLPI is acceptable. In our current warning mechanism, users have to give up services when their privacy requirement is violated. Our future work will focus on developing a new protection method using a cryptography technique (e.g., geo-indistinguishability) in a new system architecture with stronger privacy protection and high service utility.