1. Introduction
The Social Internet of Things (SIoT) is an emerging paradigm of the Internet of Things (IoT) in which heterogeneous IoT devices can communicate with each other, collaborate on behalf of their owners, establish relationships based on common interests, and autonomously perform service trading. SIoT is expected to enhance the features of existing distributed systems, such as service discovery and composition [1,2,3], information management [4,5,6,7], and service trustworthiness management [8,9,10]. Although SIoT has begun to be adopted in some domains, such as smart vehicles [11,12,13,14], smart homes [15], smart factories [16], and integrated transportation [17], current SIoT systems encounter numerous challenges that affect their usability and reliability in existing SIoT domains [18,19].
In general, IoT applications are developed to solve specific problems and usually do not share and use data from other IoT services to generate recommendations. Thus, the coordination among IoT services is inefficient because efforts to obtain similar datasets overlap [20]. SIoT systems can improve coordination among IoT services because these systems comprise an object profile based on the IoT data and accessibility of each IoT device or component. SIoT networks enable objects to establish social relationships autonomously and thereby gain object popularity through coordination. These objects perform data exchange by joining different SIoT networks. SIoT networks can provide recommendation services to different IoT applications by referring to the data accessibility in each object profile. The content of the object profile gradually improves over time according to its owner’s experiences with and feedback on each previous recommendation. This demonstrates the importance of making credible and quality recommendations as a means of acquiring the object equivalent of social capital and attaining object popularity.
Personalization and recommendation are two key prerequisites in SIoT systems that enable delivery of a promising service [21]. Both prerequisites are essential to producing an SIoT solution that matches the preferences of the user and yields a high level of satisfaction. IoT applications in the community should establish trust to ensure reliable interactions between relevant stakeholders and to reduce exposure to malicious entities. A crucial problem in personalized recommendations is the generation of alternative solutions when the service provider fails to provide the requested service. Furthermore, the alternative solution must be calibrated to fulfill the preferences of the user. This problem has a broad scope and poses considerable challenges to existing recommender systems. The difference between generic and personalized recommendation is shown in Figure 1: personalized recommendation requires input from a user profile to infer output that is relevant to a user. A user profile contains relevant personal data (e.g., user behavior, location history, and transactions) collected from different sources (e.g., user activities, IoT devices, and service interactions), and allows recommendation by referring to user preferences.
A recommender system is mainly based on information discovery and information filtering. According to Bobadilla et al. [22], a recommendation is influenced by the data collection method (including the data preprocessing and ranking methods), data filtering algorithm (e.g., content-based, collaborative, and hybrid algorithms), selected data model (e.g., memory-based and model-based methods), techniques employed for reasoning (e.g., probabilistic approaches and neural networks), data sparsity, and system performance management. According to our previous study [23], location-based smart information systems can use the mobile trajectories of users to recommend several points of interest according to user preferences and conditions. The trajectory data can be obtained from relevant sensors embedded in mobile phones, wearable gadgets, and smart environments (e.g., buildings and check-in points). Many existing solutions detailed in the literature focus on modifying the service layer. These solutions involve providing recommendations to users based on static information, such as preloaded service details (e.g., location and type) in an area and the current user position. Often, only limited choices are generated, and these do not comport with actual human needs.
The major contribution of this article is a personalized recommendation system suitable for service discovery in a smart community, specifically SIoT networks. In particular, the novelty of this study lies in the following aspects:
- (a) a trajectory analysis framework that applies user location histories, specifically the trajectories of users with similar behavior and movement patterns,
- (b) the adoption of the knowledge–desire–intention (KDI) model [23] to collect user data explicitly (e.g., ratings for items) and implicitly (e.g., location history and number of orders) to profile users, and
- (c) a hybrid reasoning approach to leverage the available trajectory-based and contextualized data in performing personalized recommendations.
We adopt the link analysis (LA) method proposed by Zheng et al. [24,25] to capture the location correlation and thereby achieve more effective and accurate item-based collaborative filtering (CF) [26], which can generate both generic and personalized recommendations. However, our framework differs from that of Zheng et al. in three aspects. First, we adopt KDI hierarchical belief modeling [27] for user profiling. The Slope One algorithm [24] is applied with a simple linear regression model to solve the recommendation problem. Second, we use a user feedback mechanism for fine-tuning items’ weight vectors after each session of recommendation generation to avoid the issue of local optimization. Third, the proposed framework is based on domain-independent user trajectory analysis, which is suitable for all types of IoT applications. Such a framework is appropriate for SIoT environments with various domains of intelligent systems that can interact closely.
We examine the proposed personalized recommendation framework in two stages. In a previous study, we investigated the performance of the recommender engine in terms of its ability to handle a smart campus dataset from UniCAT [23]. We also previously conducted a study in which we experimentally investigated the characteristics of various filtering algorithms for recommending a place to visit. In the experiment, the trajectory records of 100 active users over 1 year were used for evaluation. The UniCAT dataset contains many new student profiles and only a limited number of user trajectory records, representing a cold start scenario. Under the same settings, our hybrid approach outperformed the baseline and CF methods, achieving higher precision and recall. This result indicates that the accuracy of the CF method can be improved using a more sophisticated knowledge base to support the personalization process. Higher accuracy in turn improves satisfaction with the overall result. In the second stage, we enlarge the scale of the recommender engine through several experiments and benchmarking processes to support different datasets. The four selected datasets, namely GeoLife [28], Weeplaces [29], Brightkite [30], and Gowalla [29], suitably represent the application of intelligent services in a smart community. The experimental precision and recall results indicate that the proposed hybrid technique can achieve an F-score up to approximately 28% higher than that of conventional approaches. In general, the proposed personalized recommendation method outperforms other methods.
The rest of this paper is organized as follows:
Section 2 introduces the background of the study that involves SIoT architecture and a use case scenario, and various recommendation methodologies.
Section 3 illustrates the overall implementation and challenges of personalized user trajectory analysis in a smart community.
Section 4 describes the proposed personalized recommendation framework for smart communities, including its relevant components.
Section 5 describes the implementation of the proposed SIoT system as well as comparison and measurement criteria.
Section 6 presents the experimental results and a comparison of the proposed method with several benchmarking approaches.
Section 7 outlines the conclusions and future research directions.
3. User Trajectory Analysis
Figure 4 illustrates the overall implementation of personalized user trajectory analysis within a smart community. User movement trajectories are collected through indoor and outdoor positioning data. A common Global Positioning System (GPS)–enabled device is used to obtain location information from a satellite network. Assisted GPS is used for obtaining location information when the network device is in a location where the penetration of satellite signals is limited. Information for indoor positioning is obtained from Bluetooth beacons installed on the walls or ceilings of buildings and points of interest (POIs) [80]. Several Estimote iBeacons (available from: https://developer.estimote.com/ibeacon/) are deployed at the main entrance of buildings (Figure 5) to capture the indoor trajectories of users. We adopt Wi-Fi-based trajectory alignment and calibration [81] to improve the accuracy of indoor positioning. Data from the location logs are delivered to Firebase (a mobile and web platform) for further processing.
Figure 4 also indicates that the internal and external IoT devices and services can communicate with each other through web service calls (e.g., REST and SOAP) as well as the UniCAT smart community app. At the physical layer, the fundamental IoT modules (e.g., the information sharing, e-commerce, location navigation, transportation, and social networking modules) embedded with sensing, actuating, processing, and networking capabilities can offer different types of services that can be used by users and things to accomplish everyday activities, as displayed in Figure 6. Modern societies are heterogeneous, dynamic, and complex. People engage in interactions and establish unique social relationships with each other in communities developed according to several factors (e.g., common objectives, interests, needs, and influence). Social networking users interact and collaborate with each other to solve complex problems. Applications providing interactive and collaborative features are called social networking services (SNSs). The concept of social networking can also be applied to IoT ecosystems. The social features of the IoT paradigm have given rise to a new concept of social networking with smart things and services, which is referred to as SIoT. The current high worldwide penetration of IoT applications has significantly increased the interaction between users and things. Thus, relationships are established not only between users but also between smart things and services.
An abstraction layer called the subcommunity network layer, which exists between the physical and global community layers, allows users and things from different communities to establish user–user, user–object, and object–object relationships based on several factors [9,21], such as common interests, common goals, friendships, and common owners. Service discovery plays an important role in SIoT environments because users and things require an efficient recommendation system to reduce the system load.
The recommender engine generates recommendations for users through the integration of various inputs from internal and external SIoT services. For instance, a dining place is recommended on the basis of trajectory analysis that takes into consideration a restaurant ranking provided by a social networking service (e.g., Foursquare). User personal preferences should be considered for the filtering of recommendations. As per the framework displayed in Figure 4, we adopt NoSQL to capture most of the system data, such as the location logs, user transactions, object interactions, and user preferences. The data are stored on cloud computing platforms (e.g., Google Cloud Platform and Amazon Web Services) for further reference. Section 4 presents the details of the recommender engine.
Issues and Challenges
The application of personalized recommendation in a smart community has considerable potential for future SIoT applications. Intelligent service discovery can produce new solutions to meet the growing and varying requirements of users. However, the following challenges remain to be addressed:
- (1) Data collection: An efficient data-capturing model is required to represent different levels of diversity in user beliefs and the social relationships of users.
- (2) Inference engine: A generic yet tailored approach (from generic to personalized) is required to offer users various customizable outcomes for recommendation.
- (3) Dynamicity and scalability: A recommendation system that can be applied to different problem domains is required. The system should scale out or scale in when no direct evidence supports an outcome.
The following section describes how the aforementioned challenges are addressed with the proposed personalized recommendation framework based on the KDI model.
4. Personalized Recommendation for Smart Communities
Figure 7 displays the overall architecture of the proposed personalized recommendation framework, which is an extension of our research work in user trajectory analysis [23]. We adopt the knowledge–desire–intention (KDI) model [23] to collect user data explicitly (e.g., ratings for items) and implicitly (e.g., location history and number of orders) to profile users. The collected user profiles contain data that are filtered and sorted using the KDI model according to the assigned weight and confidence level. The KDI model yields a hierarchical representation of user data that allows different IoT applications to be integrated with it. In contrast to many conventional recommendation systems, the proposed model allows different types of commonly available smart community user data, such as data related to food preferences, daily activities, purchasing records, and other factors, to be stored inside a user profile. We propose a novel hybrid approach that involves filtering location-based recommendations tailored to users’ preferences. We apply link analysis (LA) [24,52,53], which is an enhanced version of collaborative filtering (CF) [26], on user trajectories to generate the top-N recommendations for various smart community services (e.g., information sharing and transportation). A filtering process is then performed according to the user profiles obtained using the KDI model. This approach combines the advantages of LA, which allows offline preprocessing of time-consuming and costly tasks prior to reasoning, and user profiling through the KDI model, which allows the generation of real-time personalized recommendations, to support SIoT environments that involve numerous dynamic user–object social relationships. The uniqueness of the proposed framework lies in the fact that it combines filtering and reasoning mechanisms to generate personalized recommendations in an SIoT environment.
Our proposed framework comprises the KDI modeling module [27], location history modeling module, knowledge mining module, knowledge base, inference engine, and most importantly, the personalized recommendation module. As mentioned earlier, we adopt the LA method proposed by Zheng et al. [24] for the location history modeling and knowledge mining modules. The KDI modeling, location history modeling, and knowledge mining modules are executed offline in the proposed framework and preprocess user location logs and preferences. These offline components impose a higher computational load than the reasoning module, which is operated online. The personalized framework does not require the offline modules to be executed for every recommendation. The collected logs and preferences (both old and new) are reprocessed only after a certain amount of time (Tp). Let Tp be a dynamic threshold value that is determined by the percentage of new data added into the user profile and trajectory. For instance, when the accumulated new data (logs or preferences) are more than 2% of the total records, the weightage of the user beliefs and POIs must be updated to better reflect the user’s knowledge. Different metrics can be used to determine Tp, depending on environmental factors. If updating is performed too frequently, the updates will not reflect the changes in human behavior because such changes usually require time to emerge.
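The reprocessing trigger described above can be illustrated with a minimal Python sketch. The function name `needs_reprocessing` and its parameters are hypothetical; only the 2% example value comes from the text, and a real deployment may determine Tp from other metrics.

```python
def needs_reprocessing(new_records: int, total_records: int,
                       threshold: float = 0.02) -> bool:
    """Return True when newly accumulated data exceed the given share of all records.

    The default of 0.02 mirrors the 2% example in the text; the threshold is
    meant to be tuned per environment (an assumption of this sketch).
    """
    if total_records <= 0:
        return False  # nothing recorded yet, so nothing to reprocess
    return new_records / total_records > threshold


# Example: 30 new logs out of 1000 records exceed the 2% share.
print(needs_reprocessing(30, 1000))
```

Keeping the check separate from the offline modules lets the online inference engine stay unaffected until the threshold is crossed.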
4.1. Data Capture
The proposed system captures user or object data, such as location history, transaction, and preference data, which are converted into user or object profiles for use in KDI modeling and location logs for use in location history modeling. As displayed in Algorithm 1, the user data are first converted into location points (LPs) and location logs. For each object, the user or object profiles include the time spent at the LPs; the point frequency, which is the number of visits made to an LP; the point recency, which is the most recent time the user visited the LP; and the LP velocity, which is the displacement per unit time. The aforementioned data serve as the input of the KDI and location history modeling modules for further analysis.
Algorithm 1. DataCapturing
Input: User transactions and preferences, userPref
Output: Generated user profiles and location logs
LP = locationPointDetection(userPref);
userLogs = locationLogsGeneration(userPref);
Foreach user do
    TS = durationCalculation(user, LP);
    PF = pointFrequency(user, LP);
    PR = pointRecency(user, LP);
    LV = locationPointVelocity(user, LP);
    userProfiles.add(TS, PF, PR, LV);
Return userProfiles, userLogs;
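As a rough illustration of the per-user features in Algorithm 1, the following Python sketch aggregates time spent, point frequency, and point recency from a list of visits. The `Visit` record and `build_profiles` function are hypothetical names introduced here, the input is assumed to be already segmented into visits to known LPs, and the LP velocity is omitted because it additionally requires coordinates.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Visit:
    user: str
    lp: str        # location point identifier (assumed already detected)
    arrive: float  # arrival time, seconds since epoch
    leave: float   # departure time, seconds since epoch


def build_profiles(visits):
    """Aggregate time spent (TS), frequency (PF), and recency (PR) per user and LP."""
    profiles = defaultdict(dict)
    for v in visits:
        entry = profiles[v.user].setdefault(
            v.lp, {"time_spent": 0.0, "frequency": 0, "recency": float("-inf")}
        )
        entry["time_spent"] += v.leave - v.arrive         # TS: total duration at the LP
        entry["frequency"] += 1                           # PF: number of visits
        entry["recency"] = max(entry["recency"], v.leave)  # PR: latest visit time
    return {user: dict(lps) for user, lps in profiles.items()}


visits = [
    Visit("u1", "cafe", 0, 600),
    Visit("u1", "cafe", 1000, 1300),
    Visit("u1", "library", 2000, 2600),
]
print(build_profiles(visits)["u1"]["cafe"])
```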
4.2. KDI Model
Bloedorn et al. [78] suggested the use of a hierarchical model rather than a flat set model for user profiling because a hierarchical model enables the recommendation system to be more generic in capturing a variety of data. The hierarchy levels can be fixed or dynamic according to user preferences. A simple user profile can be constructed from a reference taxonomy, and a complex profile can be constructed through a reference ontology. This suggestion is in line with the proposed user profiling approach. In this research, user profiles are the user or object information recorded in the smart campus application. The profiles include the user or object preferences, location histories, and personal information provided by the users. The user or object preferences are fed into a belief system, which is based on the belief–desire–intention (BDI) model. Other profiling methods, such as weighted keywords, semantic networks, weighted concepts, or association rules, can also be used in the proposed framework [74]. We select the BDI model because this model is a type of computational model that resembles human reasoning [82]. The KDI model, which is an advancement of the BDI model, advocates and emphasizes that human beliefs are the fundamental elements on which human decisions are made [23]. In our experimental design, user preferences are set as beliefs that constitute a tuple with three attributes (item, weight, and level) and their corresponding values. Item refers to the smallest unit of data (e.g., color, place, or age) in a user or object profile, and weight represents the importance of the data unit through the calculation of parameters such as frequency, recency, and fixity. We adopt hierarchical belief modeling [27] to represent progressive levels of belief. This strategy is different from that used in conventional content-based methods. Three levels, namely temporary belief (raw data), analyzed belief (information), and permanent belief (knowledge), are assigned to each data unit by referring to its confidence vector that accumulates over time. The personalized recommendation framework utilizes and compares the relevancy of objects according to the given beliefs when making any decision. Feedback from every action taken is collected explicitly from the user to update the weight of each belief in every recommendation attempt.
In Figure 8, #J-CR, #K-FR, and #L-SP represent a user’s beliefs in the #J, #K, and #L domains, respectively. At the level of temporary belief, raw data are obtained from user records collected through interaction with various smart community apps. The frequency (f) and recency (R) of each belief are captured to determine their relative importance index. Only beliefs that achieve a certain level of importance (beyond a threshold value) are selected for further analysis. At the level of analyzed belief, a weight is assigned to each propositional belief, which is represented by the belief fixity (Fb). Let Fb indicate the confidence level of each user’s belief on the proposition. Moreover, an indicator of the reliability of a belief-forming process, RMb, is calculated. This parameter indicates the degree to which a set of beliefs is formed from a reliable or truth-conducive belief-forming process. The parameters Fb and RMb are used for escalating the relevant beliefs to the next level. At the level of permanent belief, the output from the previous stage is used to obtain the Gettier-centered justification (Jb) for each belief. Belief justification is the process of validating that a belief is connected to truthfulness and not to luck or coincidence. The threshold level of knowledge (K) is calculated to determine the ‘preferred’ and ‘nonpreferred’ beliefs. For instance, any belief with Jb greater than or equal to K is selected and ranked in the knowledge base. A similar action is also performed for the LPs in the user trajectories. Then, the knowledge threshold (K) becomes the reference point for all belief justifications in our knowledge base.
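The final selection step, in which beliefs whose justification Jb reaches the knowledge threshold K are kept and ranked, can be sketched in Python as follows. The `Belief` class and `preferred_beliefs` function are hypothetical, and the justification values are assumed to be precomputed because the text does not give the formulas for Fb, RMb, Jb, or K.

```python
from dataclasses import dataclass


@dataclass
class Belief:
    item: str             # smallest unit of profile data, e.g. a place
    justification: float  # Jb, assumed to be computed by earlier stages


def preferred_beliefs(beliefs, k):
    """Keep beliefs with Jb >= K and rank them from most to least justified."""
    selected = [b for b in beliefs if b.justification >= k]
    return sorted(selected, key=lambda b: b.justification, reverse=True)


beliefs = [Belief("noodles", 0.4), Belief("coffee", 0.9), Belief("library", 0.7)]
for b in preferred_beliefs(beliefs, k=0.6):
    print(b.item, b.justification)
```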
All beliefs are dispositional because they may be based on assumptions, fallacies, or impulses (all characterized by chance or uncertainty) and hence surrounded by doubts [27]. A model or system should not be completely reliant on beliefs surrounded by doubts. Therefore, any decision or output produced by beliefs cannot be completely relied on [83]. The KDI model aims to address this important drawback of a typical BDI model by considering knowledge as a more suitable element of reliable human decision-making than belief. The refinement or processing stages that the belief system undergoes in the KDI model are summarized and embedded in the proposed personalized recommendation framework.
4.3. Location History Modeling
In our personalized user trajectory recommender system, location history modeling involves deriving user trajectories from user location histories, as displayed in Algorithm 2. Location logs obtained from user location histories contain collections of GPS points. These points are connected sequentially according to their time series, and the GPS data are split into trajectories if the time interval between consecutive points exceeds a certain threshold (∆T). A tree-based hierarchical graph (TBHG) is used for modeling multiple users’ location histories [24]. A TBHG integrates two structures, namely a tree-based hierarchy and a graph, on each level. The tree-based hierarchy (H) is a collection of stay-point-based clusters (C). The tree indicates the parent–children relations at different levels, and the graph indicates the peer relations among nodes at the same level.
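The trajectory-splitting rule described above (start a new trajectory whenever the gap between consecutive GPS points exceeds ∆T) can be sketched in Python as follows. The function name `split_trajectories` and the `(timestamp, lat, lon)` tuple layout are assumptions made for illustration.

```python
def split_trajectories(points, delta_t):
    """Split time-ordered GPS points into trajectories at gaps longer than delta_t.

    points: iterable of (timestamp_seconds, lat, lon) tuples.
    delta_t: maximum allowed gap, in seconds, inside one trajectory.
    """
    trajectories, current = [], []
    for p in sorted(points, key=lambda p: p[0]):
        # A gap larger than delta_t closes the current trajectory.
        if current and p[0] - current[-1][0] > delta_t:
            trajectories.append(current)
            current = []
        current.append(p)
    if current:
        trajectories.append(current)
    return trajectories


log = [(0, 39.90, 116.40), (10, 39.90, 116.41), (500, 39.95, 116.45), (505, 39.95, 116.46)]
print(len(split_trajectories(log, delta_t=60)))
```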
Algorithm 2. LocationHistoryModeling
Input: Collection of users’ GPS logs, userLogs
Output: Tree-Based Hierarchical Graph (TBHG)
Foreach user do
    trajectory = LogParsing(userLogs);
    S = StayPointDetection(trajectory);
    LocH = PersonalLocHis(S); //individual user
    SP.add(S); //collection of stay points
H = HierarchicalClustering(SP);
Foreach level do //build a graph on each level
    Foreach user do
        g = graphBuilding(g, LocH);
    G.add(g);
TBHG = (H, G);
Return TBHG;
The trajectories are then converted into stay points. Stay points are geographic regions where a user has stayed over a certain time interval within a distance threshold. The dataset includes the stay points detected from users’ trajectories. By using a density-based clustering algorithm, the dataset is hierarchically clustered into some geospatial regions. Similar stay points from various users are assigned to the same clusters at different levels. Directed edges connect the tree-based hierarchy with users’ trajectories and clusters at the same level. If consecutive stay points on one path are individually contained in two clusters, a link is created between the two clusters in a chronological direction according to the time series of the two stay points. These clusters represent POIs.
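Stay-point detection, as defined above (a region where a user remains within a distance threshold for at least a certain time), can be sketched in Python as follows. This is a simplified variant written for illustration rather than the exact procedure used in the cited work; the function names, the tuple layout, and the threshold parameters are assumptions.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def detect_stay_points(traj, dist_thresh_m, time_thresh_s):
    """traj: time-ordered list of (timestamp_seconds, lat, lon).

    Returns a list of (mean_lat, mean_lon, arrive_time, leave_time).
    """
    stay_points, i, n = [], 0, len(traj)
    while i < n:
        j = i + 1
        # Extend the window while points stay within dist_thresh_m of the anchor point i.
        while j < n and haversine_m(traj[i][1], traj[i][2],
                                    traj[j][1], traj[j][2]) <= dist_thresh_m:
            j += 1
        if traj[j - 1][0] - traj[i][0] >= time_thresh_s:
            group = traj[i:j]
            lat = sum(p[1] for p in group) / len(group)
            lon = sum(p[2] for p in group) / len(group)
            stay_points.append((lat, lon, traj[i][0], traj[j - 1][0]))
            i = j  # continue after the detected stay
        else:
            i += 1  # no stay anchored here; try the next point
    return stay_points


traj = [(0, 39.9, 116.4), (300, 39.9, 116.4), (600, 39.9, 116.4),
        (900, 39.9, 116.4), (1000, 40.0, 116.5), (1010, 40.1, 116.6)]
print(detect_stay_points(traj, dist_thresh_m=200, time_thresh_s=600))
```

The detected stay points would then feed the density-based hierarchical clustering step, which is not shown here.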
4.4. Knowledge Modeling
For knowledge modeling, a HITS-based inference model is used to infer users’ travel experiences (hub score) and location interests (authority score) in a region, as depicted in Algorithm 3 (adapted from [24]). HITS is a search-query-dependent ranking algorithm that is often used for web information retrieval. In the knowledge model, a user’s visit to a POI (cluster) is considered a directed link from the user to the location.
Algorithm 3. LocationHistoryInference
Input: TBHG = (H, G) and users’ location histories, LocH
Output: Users’ hub scores, S, and locations’ authority scores, A
S = A = ∅;
For i = 1; i < |L|; i++ //on each level
    For j = 1; j < |C|; j++ //on each cluster on the level
        For k = i + 1; k ≤ |L|; k++ //on each sub-level
            C = LocationCollecting(k, c, H);
            M = MatrixBuilding(C, LocH);
            (x, y) = HITS-inference(M);
            S = (x);
            A = (y);
Return (S, A);
A user is a hub if they have visited many locations, and a location is an authority if it is frequently accessed by many users. By using a power iteration method, final scores are generated for each user and location. A user has multiple hub scores for different regions. Moreover, a location has multiple authority scores specified by its ascendant clusters at different levels because each cluster of the TBHG specifies an implied region for its descendant clusters. The calculations for the hub and authority scores are performed offline to ensure the efficiency of the recommendation system.
Adjacency matrices (M) are constructed between users and locations according to the user access to the locations, which belong to the same ascendant cluster. A mutual reinforcement relationship exists between user travel experience (A) and location interest (S) in Equations (3) and (4). The subscripts i and j denote the ith level parameter of the jth cluster in the TBHG, Sl represents the lth location interest, and Ak represents the kth user travel experience. The score for each location sequence within a given region is calculated according to the travel experiences of users traversing the sequence and the locations of interest in the sequence. Because multiple paths begin from a location, the location interest is shared among all these paths. The location interest in different paths is influenced by the probability of users taking these paths. The results from knowledge modeling are location sequences that contain high scores. The resulting domain knowledge consists of interesting POIs and opinions of domain experts.
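The mutual reinforcement between users and locations can be illustrated with a generic HITS power iteration in Python. This is the textbook form of HITS applied to a single user-location visit matrix, not a reproduction of Equations (3) and (4); the function names, the dictionary-based input, and the fixed iteration count are assumptions of the sketch, and the per-level, per-cluster indexing of the TBHG is omitted.

```python
import math


def _normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all are zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {key: v / norm for key, v in scores.items()}


def hits(visits, iterations=50):
    """visits: dict mapping user -> dict mapping location -> visit count.

    Returns (hub, authority): user travel-experience scores and
    location-interest scores.
    """
    users = sorted(visits)
    locations = sorted({loc for counts in visits.values() for loc in counts})
    hub = {u: 1.0 for u in users}
    authority = {loc: 1.0 for loc in locations}
    for _ in range(iterations):
        # A location gains authority from the experienced users who visit it.
        authority = _normalize(
            {loc: sum(visits[u].get(loc, 0) * hub[u] for u in users) for loc in locations}
        )
        # A user gains hub score from visiting locations of high interest.
        hub = _normalize(
            {u: sum(n * authority[loc] for loc, n in visits[u].items()) for u in users}
        )
    return hub, authority


visits = {"u1": {"A": 1, "B": 1, "C": 1}, "u2": {"A": 1}, "u3": {"A": 1, "B": 1}}
hub, authority = hits(visits)
print(sorted(authority, key=authority.get, reverse=True))
```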
4.5. Recommendation Module
An inference engine is used in the recommendation module to make recommendations. The KDI model used in our personalized recommendation framework records the dataset in binary form to represent users’ likes and dislikes with respect to various propositional beliefs. This strategy allows the system to determine the frequency or number of belief occurrences. The time of visit is captured to determine the recency of users’ last visit to the object of belief. By using the theory of degrees of beliefs, belief fixities (Bf) (firmness or tenacity of beliefs) and vulnerabilities to doubts (VtD) are used in reaching the next stage of analyzed beliefs. The resulting knowledge value recorded in the knowledge base as well as the authority score is used by the inference engine to generate the recommendation, as displayed in Algorithm 4. Because the proposed framework aims to provide a generic reference model for implementation, users can also consider other bioinspired [69,70,71] and probabilistic [72,73] methods to replace the inference engine. We must balance the complexity and applicability of the selected model in generating real-time recommendations based on different domain requirements.
Algorithm 4. KnowledgeInference
Input: User-selected region and knowledge base
Output: Sorted collection of POIs
For i = 1; i < |L|; i++ //on each level
    For j = 1; j < |C|; j++ //on each cluster on the level
        If region.contain(C[j])
            A.add(C[j].authority);
            SP.add(C[j].poi);
Foreach SP do
    k = KnowledgeKDI(SP);
    K.add(k);
POI = SIoT-inference(SP, A, K);
Return POI;
According to the pseudocode (Algorithm 4), when a geospatial region is specified by a user, the inference engine determines the corresponding level of hierarchy in the TBHG and then retrieves the POIs (clusters) in the specified region. The authority scores of the clusters and the corresponding knowledge values obtained from the KDI model are retrieved from the knowledge base and used for ranking the POIs [23]. Users can submit their satisfaction with each recommendation as feedback to the inference engine and knowledge base. The weight vectors for the relevant beliefs can be further fine-tuned according to users’ requests. This function ensures that the problem of personalized recommendation within a fixed boundary does not occur in the proposed framework.
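The online ranking step can be illustrated with the following Python sketch, which filters POIs by the requested region and orders them by a combination of authority score and KDI knowledge value. The text does not specify how SIoT-inference combines the two inputs, so the weighted sum with parameter `alpha` is purely an assumption, as are the `POI` class, the `recommend` function, and the string-based region match.

```python
from dataclasses import dataclass


@dataclass
class POI:
    name: str
    region: str       # simplified stand-in for a geospatial containment test
    authority: float  # location interest from the HITS-based inference
    knowledge: float  # knowledge value from the KDI knowledge base


def recommend(pois, region, alpha=0.5, top_n=3):
    """Rank POIs in the region by alpha * authority + (1 - alpha) * knowledge.

    The linear combination is a hypothetical scoring rule used for illustration.
    """
    candidates = [p for p in pois if p.region == region]

    def score(p):
        return alpha * p.authority + (1 - alpha) * p.knowledge

    return sorted(candidates, key=score, reverse=True)[:top_n]


pois = [
    POI("cafe", "campus", 0.9, 0.2),
    POI("library", "campus", 0.5, 0.8),
    POI("gym", "campus", 0.1, 0.1),
    POI("mall", "city", 1.0, 1.0),
]
print([p.name for p in recommend(pois, "campus", top_n=2)])
```

Setting `alpha` closer to 1 yields a more generic, popularity-driven list, whereas values closer to 0 weight the user's own profile more heavily.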
The complexity of the recommendation algorithm is analyzed to investigate the possibility of its real-time implementation. Because the KDI modeling module is the core module of the proposed framework, the following analysis is performed. According to the pseudocode in Algorithm 4, POIs are searched for travel data according to their ascendant clusters (j) at different levels (i). Subsequently, the inference engine filters irrelevant POIs by referring to the beliefs and weights in the KDI knowledge base. Assume that C and L are the maximum numbers of clusters and levels, respectively. According to the aforementioned explanation, the complexity of (i, j) is O(m) and that of POI filtering is O(n). Therefore, the complexity of the overall knowledge inference algorithm for a personalized recommendation based on the KDI approach is O(m + n). This analysis indicates that complexity does not become a negative factor that affects the real-time implementation of the recommendation algorithm in an SIoT environment. The highest complexity of the proposed framework occurs during location history modeling. The complexity of the algorithm is O(x²y²) during the construction of a TBHG with x levels for y users. However, TBHG construction is a data preprocessing stage that is only revisited by the framework after sufficient new input is obtained in the knowledge base by referring to a threshold value (Tp). The time taken for every recommendation requested by a user is determined through the online operation of the inference engine only. Thus, the complexity of real-time recommendation is linear (O(m + n)) in the proposed framework.
5. Implementation and Measurement
To evaluate the effectiveness of the proposed recommendation framework, the GeoLife [28], Weeplaces [29], Brightkite [30], and Gowalla [29] public datasets (as shown in Table 2) are used to determine the precision and recall ratios. The GeoLife dataset includes tracking data for 182 users in Beijing over 3 years. The dataset comprises 17,621 recorded trajectories. Each trajectory log is a sequence of time-stamped points that contains latitude, longitude, and altitude information.
Weeplaces is a website that aims to visualize users’ check-in activities in location-based social networks (LBSNs). The Weeplaces dataset is generated using data crawled from Foursquare. The dataset contains 971,309 POIs generated by 15,799 users. Brightkite is an LBSN that enabled users to check in to places and see who else had visited the location. Brightkite was acquired by Limbo, and its operations were subsequently discontinued. The Brightkite dataset was collected through Brightkite’s public application programming interfaces (APIs) and consists of data from 4,491,143 check-ins by 58,228 users. Gowalla is an LBSN that was acquired by Facebook in December 2012 [30]. The Gowalla dataset includes user profiles, location profiles, and check-in history collected prior to 1 June 2011 through the Gowalla public APIs. The dataset contains 2,844,076 POIs generated by 319,063 users.
To calculate the precision and recall ratios, each dataset must be divided into a training set and a testing set. In this study, the data for the final 8 months are included in the testing set and the remaining data are used for training. The training set is used to learn user preferences and construct the recommendation model. The system is then evaluated by examining whether, on the basis of the training data, it can suggest the sites visited by a user within the querying region.
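A time-based split of this kind can be sketched in Python as follows. The record layout `(user_id, poi_id, timestamp)`, the function names, and the choice to measure the 8 months back from the latest timestamp in the data are assumptions made for illustration; the study does not specify these details.

```python
from datetime import datetime
from typing import List, Tuple

# Assumed record layout: (user_id, poi_id, timestamp).
CheckIn = Tuple[str, str, datetime]


def months_before(moment: datetime, months: int) -> datetime:
    """Return the date `months` earlier; the day is clamped to 28 to stay valid."""
    total = moment.year * 12 + (moment.month - 1) - months
    year, month_index = divmod(total, 12)
    return moment.replace(year=year, month=month_index + 1,
                          day=min(moment.day, 28))


def temporal_split(records: List[CheckIn],
                   test_months: int = 8) -> Tuple[List[CheckIn], List[CheckIn]]:
    """Put the final `test_months` months in the testing set, the rest in training."""
    if not records:
        return [], []
    cutoff = months_before(max(ts for _, _, ts in records), test_months)
    train = [rec for rec in records if rec[2] < cutoff]
    test = [rec for rec in records if rec[2] >= cutoff]
    return train, test


# Illustrative records only.
records = [
    ("u1", "a", datetime(2010, 1, 15)),
    ("u1", "b", datetime(2010, 9, 30)),
    ("u2", "c", datetime(2010, 10, 1)),
    ("u2", "d", datetime(2011, 6, 1)),
]
train, test = temporal_split(records)
print(len(train), len(test))
```

Splitting by time rather than at random keeps every testing record later than every training record, so the model is never evaluated on visits that precede the data it learned from.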
Precision, given by Equation (5), is the fraction of all recommended items that are relevant, and recall, given by Equation (6), is the fraction of all relevant items that are recommended. Precision (p) and recall (r) are computed from the numbers of true positives (tp), false positives (fp), and false negatives (fn). The parameter F1 is the harmonic mean of precision and recall; both false positives and false negatives are therefore considered in calculating F1 [Equation (7)] [84]. The positive real factor β is selected such that recall is considered β times as important as precision. The parameter Fβ measures the effectiveness of retrieval with respect to a user who attaches β times as much importance to recall as to precision.
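In their standard form, which is assumed here to correspond to Equations (5) to (7) together with the general Fβ measure, these metrics can be written as follows:

```latex
\begin{align}
p &= \frac{tp}{tp + fp}, \\
r &= \frac{tp}{tp + fn}, \\
F_1 &= \frac{2\,p\,r}{p + r}, \\
F_\beta &= \frac{(1 + \beta^2)\,p\,r}{\beta^2\,p + r}.
\end{align}
```

Setting β = 1 in the last expression recovers F1, so F1 is the special case in which precision and recall are weighted equally.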
Baselines and Methods
Ranking by frequency (RF) method: The more frequently people access a location, the more interesting this location might be. The visiting frequency of a location is the ratio between the number of users visiting the location and the time span of their visits (i.e., from the first day at least one user accesses the location to the last day at least one user accesses the location).
LA: LA is a generic recommendation approach in which a location is more likely to be recommended if a higher number of experienced users (expert users) have visited the location.
Hybrid approach: This approach integrates LA with a KDI model to provide personalized recommendations based on user preferences.
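The visiting frequency used by the RF baseline can be sketched in Python as follows. The sketch makes three assumptions that the description above leaves open: users are counted as distinct users, the time span is measured in days, and the span is inclusive, so a location visited on a single day has a span of 1 day. The function names and the record layout `(user_id, poi_id, day)` are likewise illustrative.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Set, Tuple

Visit = Tuple[str, str, date]  # assumed layout: (user_id, poi_id, day)


def visiting_frequency(visits: Iterable[Visit]) -> Dict[str, float]:
    """Distinct visitors of each location divided by its visit span in days."""
    users: Dict[str, Set[str]] = defaultdict(set)
    first: Dict[str, date] = {}
    last: Dict[str, date] = {}
    for user, poi, day in visits:
        users[poi].add(user)
        first[poi] = min(first.get(poi, day), day)
        last[poi] = max(last.get(poi, day), day)
    # Inclusive span (assumption) avoids division by zero for one-day locations.
    return {poi: len(users[poi]) / ((last[poi] - first[poi]).days + 1)
            for poi in users}


def rank_by_frequency(visits: Iterable[Visit]) -> List[str]:
    """Order locations from most to least frequently visited (ties by name)."""
    freq = visiting_frequency(visits)
    return sorted(freq, key=lambda poi: (-freq[poi], poi))


# Illustrative visits only.
visits = [
    ("u1", "cafe", date(2011, 1, 1)),
    ("u2", "cafe", date(2011, 1, 2)),
    ("u1", "park", date(2011, 1, 1)),
    ("u1", "park", date(2011, 1, 10)),
]
print(rank_by_frequency(visits))
```

In this example, the cafe has two visitors over a 2-day span and the park has one visitor over a 10-day span, so the cafe is ranked first.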
6. Experimental Results
To examine user satisfaction with the proposed trajectory analysis method, we use four public datasets, namely the GeoLife, Weeplaces, Gowalla, and Brightkite datasets, for benchmarking. Precision–recall analysis is performed to measure and compare the performance of the proposed recommendation method, the LA method, and the RF method. The RF method is used as the baseline (i.e., it acts as a reference for the other two approaches). LA is the conventional approach applied in most recommendation scenarios. The proposed hybrid approach integrates the advantages of LA and also employs a user belief system, namely the KDI model, to support the personalization process. Thus, the hybrid approach is expected to outperform the other approaches.
6.1. Comparison of Precision and Recall Measurements in Individual Datasets
Figure 9 displays the results obtained using the GeoLife dataset. The performance of the hybrid and LA methods for the GeoLife dataset is inferior to that for the other datasets. This result is obtained because the GeoLife dataset, unlike the other three datasets, does not include a ready list of POIs; instead, POIs are generated from user stay points obtained from user trajectory logs. The GeoLife dataset mainly includes location data for Beijing over 3 years and comprises 17,621 trajectories generated by 182 users. Nevertheless, the proposed hybrid approach outperforms the LA and RF methods when POIs are generated from user location logs.
The Weeplaces dataset comprises data crawled from the Foursquare platform, a location data platform known for its city guide application. The dataset contains 971,309 POIs for 7,658,368 check-ins generated by 15,799 users, with most check-ins being concentrated in a specific region. The Weeplaces dataset is larger than the GeoLife dataset. Moreover, because the Weeplaces dataset includes suitable POI data, higher precision and recall values are obtained for the three methods with the Weeplaces dataset than with the GeoLife dataset. As depicted in Figure 10, the precision values of the hybrid, LA, and RF methods are 0.43, 0.40, and 0.24, respectively, and the recall values of these methods are 0.62, 0.58, and 0.15, respectively. Thus, the hybrid method outperforms the other two methods.
Brightkite and Gowalla are large LBSNs. The user and location profiles collected in the Brightkite and Gowalla datasets comprise well-maintained and appropriate descriptive information. These datasets are collected through public APIs for locations throughout the world. The Brightkite dataset consists of 4,491,143 check-ins generated by 58,228 users, and the Gowalla dataset contains 36,001,959 check-ins generated by 319,063 users. Thus, the Gowalla dataset is the largest of the four datasets used. Similar to the Weeplaces dataset, the Brightkite and Gowalla datasets also provide POIs, which helps to increase the precision and recall rates for the three adopted methods. The proposed hybrid approach marginally outperformed the conventional LA method for these two datasets, as displayed in Figure 11 and Figure 12.
For the Gowalla dataset, precision values of 0.658, 0.632, and 0.240 are obtained with the hybrid, LA, and RF methods, respectively, and for the Brightkite dataset, precision values of 0.657, 0.531, and 0.448 are obtained with the three methods, respectively. For the Gowalla dataset, recall values of 0.512, 0.448, and 0.150 are obtained with the hybrid, LA, and RF methods, respectively, and for the Brightkite dataset, recall values of 0.619, 0.761, and 0.314 are obtained with the three methods, respectively.
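As a worked example, the precision and recall values reported above can be combined into F-measures with a short Python sketch. The `f_beta` helper and the dictionary layout are illustrative, and the resulting scores are derived only from the rounded values quoted in the text, so they need not match the figures exactly.

```python
from typing import Dict, Tuple


def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-measure in which recall counts beta times as much as precision."""
    if precision < 0 or recall < 0 or beta <= 0:
        raise ValueError("precision and recall must be >= 0 and beta > 0")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0  # convention when both precision and recall are zero
    return (1 + beta ** 2) * precision * recall / denominator


# (precision, recall) pairs quoted in the text for two of the datasets.
reported: Dict[Tuple[str, str], Tuple[float, float]] = {
    ("Gowalla", "hybrid"): (0.658, 0.512),
    ("Gowalla", "LA"): (0.632, 0.448),
    ("Gowalla", "RF"): (0.240, 0.150),
    ("Brightkite", "hybrid"): (0.657, 0.619),
    ("Brightkite", "LA"): (0.531, 0.761),
    ("Brightkite", "RF"): (0.448, 0.314),
}

f1_scores = {key: round(f_beta(p, r), 3) for key, (p, r) in reported.items()}
for (dataset, method), score in f1_scores.items():
    print(f"{dataset:10s} {method:6s} F1 = {score:.3f}")
```

With these inputs, the hybrid method obtains an F1 of approximately 0.576 for Gowalla and 0.637 for Brightkite, compared with approximately 0.524 and 0.626 for LA. For Brightkite, the LA method has the higher recall, yet the hybrid method retains a slightly higher F1 because of its higher precision.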
6.2. Measuring F1 for GeoLife, Gowalla, Weeplaces, and Brightkite Datasets
The results in Figure 13 accord with the proposed hypothesis that a user belief system can benefit the recommendation. As displayed in Figure 13, the average F1 values of the proposed hybrid approach are up to 27.95% and 3.98% higher than those of the RF and LA methods, respectively, for the four adopted datasets. An improvement of 3.98% may not seem impressive; however, the overall performance of the methods should be analyzed from various perspectives to determine the significance of the improvement.
First, the proposed method outperforms the other two approaches on all four datasets, each of which represents a different domain in the actual environment; thus, the proposed approach is adaptable to different data conditions. Second, a “personalized” recommendation method that captures user or object preferences over time should gradually improve the recommendation accuracy of the system, because the inference engine can utilize more information over time to make precise recommendations. One possible issue that may arise over time is the over-tuning of the inference engine with overloaded information. However, a feedback mechanism and monitoring of user satisfaction can help prevent this problem.
Significantly lower precision and recall scores are obtained for the RF method than for the other two approaches because, unlike those approaches, the RF method disregards individual user experiences. The proposed hybrid approach marginally outperforms the LA method owing to the integration of the KDI model, which considers user behaviors in addition to experienced users’ trajectories.
7. Conclusions
In this article, a personalized recommendation framework suitable for SIoT is proposed. A use case scenario in a smart community is used to describe SIoT deployment. The overall architecture of user trajectory analysis, which includes the indoor and outdoor positioning modules, internal and external IoT devices and services, the data-capturing module, the recommender engine, and the multilayer SIoT community model, is described in detail. For further analysis, the proposed personalized recommendation framework for smart communities is decomposed into several modules, namely the KDI modeling module, location history modeling module, knowledge mining module, knowledge base, inference engine, and personalized recommendation module. The proposed hybrid recommendation algorithm is implemented in a smart community with several SIoT applications and services. User trajectories over several years are collected through different community services, such as e-commerce and location navigation services. The collected data and the selected benchmarking datasets are used for performance analysis. Three recommendation approaches, namely the RF method (baseline approach), LA method (conventional approach), and hybrid method (proposed approach), are examined in this study. According to the precision and recall results, the proposed personalized recommendation method achieved an average of up to 28% higher satisfaction for users compared with the other two approaches. Thus, the proposed method provides more accurate user recommendations than the other approaches do.
The proposed personalized recommendation algorithm is based on user profile and belief systems that might not exist in some recommender engines with simple designs. In a future study, we plan to focus on trustworthiness management to ensure reliable interactions between users and things and to reduce exposure to malicious objects. Several additional topics, such as the extension of smart community coverage by providing recommendations for locations with different cultures, time slots, and environmental settings, can be examined in future studies. Deep learning can also be used to model SIoT behaviors for delivering suitable recommendations in service discovery and composition.