Tour-Route-Recommendation Algorithm Based on the Improved AGNES Spatial Clustering and Space-Time Deduction Model

This study designed a tour-route-planning and recommendation algorithm that was based on an improved AGNES spatial clustering and space-time deduction model. First, the improved AGNES tourist attraction spatial clustering algorithm was created. Based on the features and spatial attributes, city tourist attraction clusters were formed, in which the tourist attractions with a high degree of correlation among attributes were gathered into the same cluster. It formed the precondition for searching tourist attractions that would match tourist interests. Using tourist attraction clusters, this study also developed a tourist attraction reachability model that was based on tourist-interest data and geospatial relationships to confirm each tourist attraction’s degree of correlation to tourist interests. A dynamic space-time deduction algorithm that was based on travel time and cost allowances was designed in which the transportation mode, time, and costs were set as the key factors. To verify the proposed algorithm, two control algorithms were chosen and tested against the proposed algorithm. Our results showed that the proposed algorithm had better results for tour-route planning under different transportation modes as compared to the controls. The proposed algorithm not only considered time and cost allowances, but it also considered the shortest traveling distance between tourist attractions. Therefore, the tourist attractions and tour routes that were suggested not only met tourist interests, but they also conformed to the constraint conditions and lowered the overall total costs.


Introduction
Tourists are the core of tourism activity. A key issue of smart tourism is how to improve tourist satisfaction and provide the best experience. A complete tourism activity cycle includes pre-travel, traveling, and post-travel activities. The pre-travel experience includes itinerary, planning, and tour-route searching, etc. The travel process itself includes visiting tourist attractions and the travel between locations, etc. The post-travel activity includes the evaluation and feedback on the tourism experience as a whole. In the whole tourism activity cycle, the pre-travel experience is the most important factor to influence tourists' satisfaction and, therefore, their subjective evaluations regarding the quality of their experiences. Tourists will have spent a certain amount of time and cost on their experiences. Therefore, devising and suggesting tour routes according to tourists' needs and desires while realizing the minimum time and cost as well as the maximum benefit is key to optimal tour planning.
In a tour, the tourism objects are tourist attractions. Searching for the tourist attractions that accurately match the tourists' needs is the critical step in planning and recommending tour routes. Tourists' needs differ widely, and the tourist attractions distributed in a city also have different feature attributes and spatial attributes, so each tourist attraction differs considerably in its capacity to meet tourists' needs. The diversity of tourists' needs and tourist attractions' attributes makes the searching process complex. Rapidly confirming the tourist attraction groups of interest according to the tourists' needs can improve the searching algorithm's accuracy and efficiency, so an effective method for generating tourist attraction groups is the key step in finding the right tourist attractions. In data mining, a clustering algorithm can group spatial points. It gathers the spatial points that have similar attributes into the same group and reduces a large-scale data mining task to a smaller-scale one within each group, which can improve the algorithm's efficiency. This paper uses the clustering algorithm to generate city tourist attraction clusters in accordance with tourist attraction attributes, which provides the algorithmic basis for searching for accurate tourist attractions. There are many kinds of clustering algorithms. This paper uses AGNES as the basic algorithm to set up the clustering model. The agglomerative nesting (AGNES) algorithm is a hierarchical clustering method that operates from bottom to top. It sets the elements as the bottom layer in the spatial distribution and gathers them from bottom to top according to a defined criterion. The AGNES algorithm is a single-link method in which a cluster can be represented by any of its elements. Therefore, the degree of correlation of two clusters is determined by the two elements, one from each cluster, with the highest degree of correlation.
The clustering process begins at the discrete distributed bottom layer and gathers each dot within the clusters and ends with the preset number of clusters. A traditional AGNES algorithm is operated with the same spatial distance as the criterion. The reasons for this paper to use AGNES are as follows. First, AGNES is simple and it is easy to implement. AGNES is a naive clustering algorithm, which has a concise principle and process. Its starting and ending conditions are definite and the selecting of the starting seed point is simple. In the clustering process, it only needs to judge the dispersion between the seed point and the non-seed point. Compared with other clustering algorithms, it is more accessible and easier to implement. Second, AGNES has relatively low spatial complexity and time complexity. It has a faster operating rate and consumes less computer memory. Third, AGNES is very suitable for the clustering on small scale dataset. In this paper, the research objects are city tourist attractions; it forms a typical small scale dataset, thus the AGNES is feasible. Fourth, AGNES is more flexible and can realize the multiple layer clustering structures on different granularity by setting different parameters. It has no strict requirements on the samples inputting sequence and can realize the synchronous clustering from different dots to reduce the convergence time.
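The bottom-up gathering described above can be illustrated with a minimal sketch of the traditional single-link AGNES procedure, in which the spatial distance is the only criterion. The function name `agnes_single_link` and the sample coordinates are hypothetical and serve only as an illustration; this is not the improved algorithm developed later in the paper.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def agnes_single_link(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain (single link)."""
    # Bottom layer: every point starts as its own cluster (lists of indices).
    clusters = [[i] for i in range(len(points))]

    def link(a, b):
        # Single link: distance of the closest pair across the two clusters.
        return min(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Pick the pair of clusters with the smallest single-link distance.
        a, b = min(
            ((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))),
            key=lambda pair: link(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a].extend(clusters[b])  # absorb cluster b into cluster a
        del clusters[b]                  # b > a, so index a stays valid
    return [sorted(c) for c in clusters]


# Illustrative coordinates only; the preset number of clusters is 3.
pts = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
result = agnes_single_link(pts, 3)
```

The sketch stops at the preset number of clusters, which mirrors the ending condition stated above. It recomputes all pairwise links in every round, which is acceptable only for small datasets such as the attractions of a single city.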
In tourism clustering research, [1] provided a general introduction on the clustering method, including the AGNES clustering algorithm. In [2], the researchers applied the hierarchical cluster analysis to a set of Indonesian tourism sites in and around Malang City, Malang Regency, and Batu City using the AGNES algorithm to optimize a search engine that could assist tourists when choosing tourist attractions under certain constraints. In another study [3], the AGNES algorithm was applied to the data from the online platform, Airbnb. The collaborative economy of tourism hosts based on their geographic distribution was studied. The city of Guanajuato, Mexico, was selected as the subject city for convenience purposes, and the main touristic attractions were used as parameters to conduct the analysis. According to [4], an ontology-based clustering method was used to analyze the qualitative factors from a semantic perspective to define tourist segments and understand why tourists travelled to a particular destination in the Catalonia region of Spain, and the researchers reported better results using this method as compared to classic clustering algorithm methods. In the literature, the proposed ontology-based clustering method was derived from an extension of the AGNES clustering algorithm. Researchers in [5] designed an original approach to characterize the daily behaviors of tourists by analyzing the sequences of places that were visited by tourists per day, in which the geolocation information of tourists on photo-sharing websites was used as the data, from which the AGNES clustering algorithm formed clusters and carried out the experiment. The study in [6] proposed a point-of-interest (POI) recommendation method to plan tourism routes. Different clustering methods have been developed in the design and implementation of tourist route-information recommendation systems based on user POI indices, including AGNES clustering. 
In [7], the AGNES clustering algorithm was used to identify residents' dependence on public transport. It provided a potential method for choosing the transportation mode for tourists. The researchers in [8] applied semantic clustering to extract tourist preferences. It compared the semantics of tourist preferences with tourist attraction attributes and provided tourist attraction suggestions. The researchers in [9] used the partitioning clustering method to find the nearest tourism destination according to the extracted geotagged photograph-location data. The researchers in [10] studied the cluster-mapping procedures for tourism regions based on the fuzzy-clustering method. This method proved to increase the identification accuracy of the tourism clusters. The researchers in [11] developed a Bali tourism information system by using web-scrapping and clustering methods. The clustering algorithm was used to process the word-text data and output word clusters, and then performed clustering on the website. The researchers in [12] proposed a tourist-preference clustering method that was based on tourist facial and background information that were extracted from photographs. The clustering method was used to generate tourist classifications. The researchers in [13] used spatial clustering methods to mine tourist destinations and preferences, in which the regions of tourist attractions for each tourism category were derived by the clustering algorithm. The researchers in [14] used a density-based spatial clustering algorithm to study tourist behavior, and by extracting the tourist behaviors, the tourism hot-spots were extracted as they related to tourist behavior. In [15], the clustering algorithm was used to generate tourist-attraction clusters via network and geographic information system (GIS) analyses, and three tourist-attraction clusters were extracted.
For tour-route algorithms, the researchers in [16] proposed a tour-route-recommendation method using the multiple-criteria tensor model fusing time-space information. The researchers in [17] combined factors of time and space and used the tourist-attraction photographs that were posted on a website by previous tourists to set up a tour-route-recommendation model. The researchers in [18] applied a heuristic method for tour-route recommendation based on urban traffic monitoring. The researchers in [19] employed social-network analysis combined with deep-learning theory to develop a tour-route-recommendation model. The researchers in [20] created a tour-route-recommendation model that was based on Smart Agent technology. In [21], an individualized tour-route-recommendation model that was based on POI functionality and accessibility was proposed, and it treated tourist physiological and physical conditions as the important reference criteria. The researchers in [22] suggested an individualized tour-route-recommendation model that was based on social networks' geographical context cognition, and it used social relationships and trust networks among tourists as the important indices. The researchers in [23] developed a tour-route-recommendation model that was based on improved collaborative filtering technology. The researchers in [24] designed a tour-route-recommendation algorithm that was based on dynamic clustering to counter the challenge of data scarcity. In [25], a tour-route-recommendation algorithm was designed that was based on deep-interest label mining and association rule clustering. The researchers in [26] also designed a tour-route-recommendation model that was based on a collaborative filtering algorithm. The researchers in [27] suggested a tour-route-recommendation model that was based on geotagging and temporal divisions, whose core principle included user and group ratings as well as time and distance.
The researchers in [28] proposed a tour-route-recommendation method that was based on tourist time-space behavioral constraints, and it used temporal and spatial constraints as the important factors. The researchers in [29] proposed a tour-route-recommendation method that was based on a combined recommendation algorithm including hybrid-interest modelling and a heuristic tour-route-planning algorithm. In [30], an energy-aware clustering method was used for mobile applications, showing that efficient routing, resource allocation, and energy management can be achieved by clustering mobile devices into local groups. Ref. [31] collected tourists' traveling data from websites and analyzed the tourists' behavior; based on the tourists' mobile data from the websites as well as the mined POIs, it set up a tour-route-recommendation algorithm. Ref. [32] studied the importance of mobile devices and location-based services. Based on big data, such as tourism data, location prediction could be realized, which could be used in studying tourists' mobility and tendencies in traveling behavior.
According to the literature review, tourism clustering research has predominantly focused on tourist attractions and tourist clustering. As seen in [1][2][3][4][5][6][7], clustering algorithms have been used in tourism research for POI extraction, data mining, algorithm modeling, transportation behavior, etc. The other clustering methods in [8][9][10][11][12][13][14][15] indicated that spatial and attribute data of tourist attractions were the main targets that were used to generate proper tourism categories, extract tourist preferences, and recommend appropriate tourist destinations. Refs. [30][31][32] tended to study the big data that were obtained from social networks on mobile devices and websites. Such big data could be used as the basic data for clustering tourist attractions and tourists, or for studying tourists' mobility and traveling behavior. The studies concerning tourist-attraction data extraction and tour-route algorithms in [16][17][18][19][20][21][22][23][24][25][26][27][28][29] focused on three specific aspects. First, they examined the recommendation algorithm itself, including data scarcity and "cold start" issues. Data scarcity means that, in a database, the most valuable and useful data are missing, or the majority of the data are zero. "Cold start" means that, in a recommendation system, newly registered users and newly added products lack historical data, so recommendations involving them can hardly be made. Second, they developed improved algorithms that were based on traditional recommendation methods such as the collaborative filtering algorithm, where historical data that were extracted from users and groups with similar interests to the current user are identified to customize the recommendations for the current user. Third, they mined historical tourist-interest data to recommend tour routes for the current tourists.
Common methods that are used for this process include tourist label, photo, and evaluation data mining. Overall, the existing methods focused on improvements in algorithm performance, historical data, and the improvement on solving the problems such as "cold start" and data scarcity but overlooked tourist needs, attraction attributes, real-world geospatial environment, and tour-route searching, so they have typically yielded fuzzy results that lacked sufficient accuracy.
As indicated above, the challenges in tour-route planning remain. First, the research on tourist needs and tourist-attraction attributes is insufficient, especially in terms of real-world concerns, such as time and cost. Second, since tourist-attraction clustering provides preconditions for matching tourists' interests, there is no effective and reasonable mechanism for urban tourist attraction clustering, and the clustering criterion is merely the spatial distance, neglecting the inner attributes such as tourist attraction classification, popularity, optimal traveling time, and traveling fee. For the traditional AGNES clustering algorithm, a specific tourist's individualized needs and interests were not fully considered. Third, the research on the space-time deduction on the traveling process is insufficient, in which the space-time deduction means a tourist's traveling activity in a whole tour route will be constrained by time, space, and cost, and it is a dynamic deduction process on the traveling cost. The more tourist attractions to be visited, the more time, traveling distance, traveling fee, etc., will be produced. The time and cost play key roles on recommending tour routes. Fourth, under the conditions of fixed time and cost budget, the transportation mode determines the selected tourist attraction quantity and the planned tour route. The existing methods seldom study the mixing transportation modes with tour route planning.
Therefore, this study designed and tested a tour-route-recommendation algorithm that was based on an improved AGNES spatial clustering and space-time deduction model, focusing on precise interest-matching, urban tourist-attraction spatial clustering, space-time deduction of the traveling process, and precise tour route searching based on the transportation mode. Compared with the previous studies, the proposed algorithm has several differences and novelties. First, the AGNES method is not used merely and directly as a clustering tool. In this study, AGNES itself was set as the research target and content, whereby the improved AGNES algorithm was developed. It is the precondition for modeling the tour route algorithm. Second, in the process of developing the improved AGNES clustering algorithm, the tourist attractions' feature attributes were set as the critical parameters in forming the clustering criterion function, conforming to the tourist activity in matching tourist interests, while previous studies had only considered the spatial attributes. Third, different from the research line in which the location-based social network was exploited to understand human mobility and people's behavior by mining check-in patterns, this research was based on the city tourist attractions' attributes and one tourist's specific interests. The former studies were performed on tourism big data, and they tended to mine the tourists' moving behaviors and find potentially interesting tour routes. The proposed method is a one-to-one mode in which tourist interests were studied and set as the specific preconditions to extract certain tourist attractions, and then the path-searching algorithm was used to find the optimal tour route. Thus, the two approaches differ in their algorithm mechanisms.
Fourth, the studies on the tourist attraction and tour route recommendation are based on the fuzzy recommendation, while the proposed algorithm is under the consideration and constraint of the real-world city tourism environment, road conditions, and transportation modes, thus it could find out the global optimal routes that match the tourists' interests within the limited time and space complexity. Figure 1 shows the research work and the structure of the paper.

The Improved AGNES Tourist Attraction Spatial Clustering Model
The features and spatial attributes of urban tourist attractions can vary widely. The feature attributes are the characteristics of one tourist attraction that differ from another one, such as tourist attraction classification, popularity, optimal traveling time, traveling fee, etc. The classification labels represent the characteristics or features of one tourist attraction, they determine the tourist attraction's category, and they are typically mined from feature mapping data. The popularity is the average attraction capacity of one tourist attraction, which is determined from the online "big data" sources; for example, "Ctrip", "Fliggy", and "Qunar", among others, provide popularity data for tourist attractions in China. The optimal traveling time and cost stand for the basic time and cost that are needed by the tourists to visit one tourist attraction. Each tourist attraction has various feature attributes that are associated to quantified values. The spatial attributes consider the geospatial location and positioning of a tourist attraction, including the discrete features and the indirectly correlated features. The discrete features should be considered independently for all tourist attractions [33]. Indirectly, the correlated features represent that each tourist attraction is connected with another one by urban roads and tourists can move between two tourist attractions freely. Tourist attraction attributes determine that tourist attractions have a close or distant relationship with each other, bringing different capacities for satisfying tourist interests. The precondition of selecting the tourist attractions to be visited is to confirm the classification that meet the tourist needs and interests. Therefore, the urban tourist attractions should be clustered primarily.

The Foundation of Tourist Attraction Attribute Label Matrix Model
The preconditions for the clustering algorithm are to confirm the tourist attraction attributes and to develop the association model that measures the degree of correlation among the attractions. The degree of correlation among their attributes is determined by their features as well as by their spatial factors. Thus, the clustering model should combine the feature attribute factors and the spatial attribute factors [34].
The arbitrary typical tourist attraction in a tourism city is the tourist attraction element s (i) , and it belongs to one certain tourist attraction classification. All the elements of s (i) form an entire research range, and it is the tourist attraction research domain S. The domain S contains different types of tourist attractions, and it can be divided into several classifications.
The inner characteristics of one tourist attraction are the feature attributes, and they are noted as t 1(i1) . The feature attributes influence tourist choices on the interest tendency and intelligent system's search results of tourist attraction clusters and specific tourist attractions. The factor i 1 is the footnote of the feature attribute. Meanwhile, the touristattraction geolocation is the spatial attribute factor t 2(i2) , and i 2 is the factor's footnote. The tourist attraction attributes include m number of feature attributes t 1(i1) and n number of spatial attributes t 2(i2) , i 1 ∈ (0, m] ⊂ Z + , i 2 ∈ (0, n] ⊂ Z + . Each factor t 1(i1) or t 2(i2) is one feature attribute label and spatial attribute label of s (i) , and collectively, the tourist attraction attribute label.
The tourist attraction attribute label matrix T is formed by m number of t 1(i1) and n number of t 2(i2) . It determines the tourist attraction's features and spatial attributes and influences the tourists' interest tendency. The matrix T meets the following conditions: The matrix row is the vector t 1(i1) or t 2(i2) . The matrix column is the element of the vector t 1(i1) or t 2(i2) . The matrix contains m + n number of rows and α number of columns. The rows from 1 to m relate to the vector t 1(i1) ∼ i 1 ∈ (0, m] ⊂ Z + , and the rows from m + 1 to m + n relate to the vector t 2(i2) ∼ i 2 ∈ (0, n] ⊂ Z + . One tourist attraction relates to one matrix element distribution. Equation (1) is the general formula of the matrix T and its element distribution.

$$
T = \begin{bmatrix}
t_{1(1,1)} & t_{1(1,2)} & \cdots & t_{1(1,\alpha)} \\
\vdots & \vdots & & \vdots \\
t_{1(m,1)} & t_{1(m,2)} & \cdots & t_{1(m,\alpha)} \\
t_{2(1,1)} & t_{2(1,2)} & \cdots & t_{2(1,\alpha)} \\
\vdots & \vdots & & \vdots \\
t_{2(n,1)} & t_{2(n,2)} & \cdots & t_{2(n,\alpha)}
\end{bmatrix} \tag{1}
$$
The feature attributes and spatial attributes are quantified. The feature attributes include the tourist attraction classification t 1(1) ; the popularity degree t 1(2) , noted as h o , h o ∈ (0, 1) ⊂ R + , representing the users' average evaluation scores on the website; the optimal travel time t 1(3) , noted as t b , unit: hour; and the traveling fee (cost) t 1(4) , noted as c o , unit: CNY, ¥ yuan. The spatial attributes mainly relate to the longitude t 2(1) ∼ l and latitude t 2(2) ∼ B of the tourist attraction, that is, the coordinates (t 2(1) , t 2(2) ) ∼ (l, B) [35]. Each attribute factor includes a specific data value range, which forms the tourist attraction feature attribute label vector t 1(i1) and spatial attribute label vector t 2(i2) . The classification factor is determined by the tourist attraction's inner attributes; it is the critical index for distinguishing different tourist attractions and an important reference for a smart system when selecting a tourist attraction cluster and specific tourist attractions. The popularity degree represents the average preference of tourists for a tourist attraction s (i) . The optimal travel time represents the most suitable time for tourists to visit a tourist attraction s (i) . The traveling fee represents the minimum cost for tourists to visit a tourist attraction s (i) , such as the fee for the entrance ticket. Each label vector so formed includes the specific index t 1(i1,j1) or t 2(i2,j2) . Quantify the index t 1(i1,j1) or t 2(i2,j2) as follows, in which the classification factor is also quantified into a specific value.
When all the feature attribute label vectors t 1(i1) for all the elements s (i) in domain S are confirmed, the correction parameter for each vector t 1(i1) is then defined to normalize all the values.
The impact of each feature attribute label vector on the calculated degree of correlation between tourist attractions should be of the same order of magnitude. Thus, the normalizing parameter δ 1(i1) of each feature attribute label vector is generated, and all the labels are normalized to the range (0, 1]. Each normalizing parameter δ 1(i1) is confirmed according to the range of the vector t 1(i1) . The parameter δ 1(i1) is used to normalize each vector t 1(i1) in the matrix T to obtain a new normalized matrix T δ . As compared to the matrix T, the elements in the matrix T δ are all normalized except for the vector t 2(i2) . Equation (2) is the general formula for the matrix T δ .
Based on the tourist attraction attribute label matrix T and the normalized matrix T δ , the tourist attraction research domain S clustering algorithm is created.
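As a hedged illustration of how T and T δ relate, the sketch below stores one quantified value per attribute for each of three invented attractions (a simplifying assumption about the column layout) and scales each feature row into (0, 1]. The specific choice δ = 1 / max(row) is also an assumption, since the text only states that δ 1(i1) follows from the range of t 1(i1) . All values and names are illustrative.

```python
# Hypothetical sample data: one column per attraction, values are illustrative.
FEATURE_ROWS = ["classification", "popularity", "optimal_time_h", "fee_cny"]
SPATIAL_ROWS = ["longitude", "latitude"]

T = [
    [1.0, 2.0, 4.0],            # t_1(1): classification (assumed numeric coding)
    [0.9, 0.6, 0.3],            # t_1(2): popularity h_o in (0, 1)
    [2.0, 4.0, 1.0],            # t_1(3): optimal travel time t_b, hours
    [50.0, 100.0, 25.0],        # t_1(4): traveling fee c_o, CNY
    [104.06, 104.10, 104.02],   # t_2(1): longitude l
    [30.66, 30.60, 30.70],      # t_2(2): latitude B
]


def normalize(matrix, n_feature_rows):
    """Scale each feature row into (0, 1]; copy spatial rows unchanged."""
    out = []
    for r, row in enumerate(matrix):
        if r < n_feature_rows:
            peak = max(row)  # assumed normalizing parameter: delta = 1 / peak
            out.append([v / peak for v in row])
        else:
            out.append(list(row))  # spatial rows t_2 are not normalized
    return out


T_delta = normalize(T, len(FEATURE_ROWS))
```

With this choice, the largest entry of every feature row becomes exactly 1 and all positive entries fall in (0, 1], while the longitude and latitude rows keep their original values, as stated for T δ .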

The Tourist Attraction Domain Clustering Algorithm Based on the Improved AGNES Algorithm
The aim of the tourist attraction domain clustering was to obtain clusters in which the tourist attractions in the same cluster have a high degree of correlation among the attributes while those in different clusters have a low degree of correlation among the attributes, and finally to guide the smart system into precisely matching tourist interests. The clustering process was an automatic process driven by data, and the clustering criteria could differ according to the clustering targets. When a spatial dot is only a location point in a coordinate system, a traditional clustering algorithm will take the spatial distance as the single criterion. Tourist attractions have spatial attributes and feature attributes, and thus the criteria for tourist-attraction clustering should combine both factors.
The k number of elements s (i) in the domain S are clustered by the clustering algorithm, k ∈ N. The tourist attractions s (i) that have a high degree of correlation among the attributes are placed in the same cluster S (i) , while the tourist attractions s (i) and ¬ s (i) that have a low degree of correlation among the attributes are placed in different clusters S (i) and ¬ S (i) . The cluster's element is noted as s (i,j) , where i is the footnote of the cluster S (i) and j is the footnote of the element in the cluster S (i) . In all, it is supposed that the clustering algorithm forms p number of clusters, p ∈ N and p << k. Assume that the clusters meet the following conditions. The elements s (i,j) in the same cluster S (i) have a high degree of correlation among the attributes, and elements s (i,j) in different clusters S (i) and ¬ S (i) have a low degree of correlation among the attributes. An arbitrary cluster ∀S (i) contains at least one element s (i) . An arbitrary element ∀s (i) in the domain S belongs to only one cluster S (i) . A cluster S (i) and any other cluster ¬ S (i) have no intersection, but in the aspect of spatial analysis, the clusters may have a buffer overlap in the city space. The union of all the clusters S (i) is the domain S, and i ∈ (0, p] ⊂ N. In the domain S, there are at least two clusters, that is, p ≥ 2. Whether the tourist attraction element s (i) should be absorbed into the cluster S (i) is determined by the objective function ξ (s (i1) ,s (i2) ) between s (i) and other tourist attractions. The function is determined by several clustering factors, including the feature attribute factors t 1(i1) and the spatial attribute factors t 2(i2) . For two independent tourist attractions s (i1) and s (i2) , the degree of correlation includes their geospatial relationship and their feature attribute correlation, and thus their neighborhood relationship is determined jointly by the two factors.
Therefore, the matrix T and matrix T δ both contain the classification t 1(1) , popularity degree t 1(2) , optimal travel time t 1(3) , and traveling cost t 1(4) , as well as the longitude and latitude (t 2(1) , t 2(2) ) ∼ (l, B). The improved Minkowski distance is applied as the objective function, and the clustering criteria should consider features and spatial attributes simultaneously. The pseudo-code of the process to create the function ξ (s (i1) ,s (i2) ) (Algorithm 1) is shown as follows.

Step 4: Confirm the Minkowski distance d(x, y) as the objective function ξ (s (i1) ,s (i2) ) .
The Minkowski distance between the two samples x and y is shown in Equation (3). The Minkowski distance is used to define the objective function ξ (s (i1) ,s (i2) ) , shown as Equations (4) and (5). According to the function ξ (s (i1) ,s (i2) ) , the norm value of the function is used to judge whether the tourist attractions s (i1) and s (i2) belong to the same cluster. Therefore, the function ξ (s (i1) ,s (i2) ) value is set as the clustering criterion.
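For reference, the standard Minkowski distance of order q between two vectors x and y is d(x, y) = (Σ |x i − y i | ^q)^(1/q), which gives the Manhattan distance for q = 1 and the Euclidean distance for q = 2. The sketch below implements this standard form and a hypothetical criterion `objective` that simply concatenates normalized feature values with coordinates. It is only an assumption about how features and spatial attributes could be combined, not the improved form of Equations (4) and (5), and the symbol q is used here to avoid confusion with the cluster count p.

```python
def minkowski(x, y, q=2):
    """Standard Minkowski distance of order q between equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)


def objective(s1, s2, q=2):
    """Hypothetical criterion: one distance over features plus coordinates."""
    # Each attraction is (normalized feature values, (longitude, latitude)).
    x = list(s1[0]) + list(s1[1])
    y = list(s2[0]) + list(s2[1])
    return minkowski(x, y, q)


# Illustrative attractions: features already scaled to (0, 1], raw coordinates.
a = ([0.25, 1.0, 0.5, 0.5], (104.06, 30.66))
b = ([0.50, 0.6, 1.0, 1.0], (104.10, 30.60))
value = objective(a, b)
```

A smaller value indicates a higher degree of correlation between the two attractions. Because coordinates in degrees and normalized features live on different scales, a practical criterion would need some weighting between the two groups, which this sketch deliberately leaves out.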
In the process of generating clusters, the k number of elements s (i) are dynamically stored into one matrix K ∧ (p × maxk (i) ) in the cluster code sequence by the clustering algorithm. Each row in the matrix dynamically stores the related cluster's elements. When the clustering algorithm ends, all the tourist attraction elements are stored in the matrix K(p × maxk (i) ) according to the cluster code i and the cluster's element code j. The matrix row number is p and the column number is maxk (i) , in which k number of elements are used to store tourist attractions, while the other p × maxk (i) − k number of elements are stored as 0. The row rank satisfies rank(K ∧ (p•) ) ≤ p and rank(K (p•) ) ≤ p. The column rank satisfies rank(K ∧ (•maxk (i) ) ) ≤ maxk (i) and rank(K (•maxk (i) ) ) ≤ maxk (i) . The matrix K(p × maxk (i) ) has at least two non-empty rows. Equations (6) and (7) relate to the matrices K ∧ (p × maxk (i) ) and K(p × maxk (i) ). In the formula, s (i) ∧ represents the element with random storage value.
The tourist attraction clustering objective function ξ (s (i1) ,s (i2) ) is set as the improved AGNES clustering algorithm criterion. The k number of elements s (i) in the domain S are clustered into p number of clusters and stored into the matrix K(p × maxk (i) ).
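The storage layout of the matrix K can be sketched as follows, assuming that attractions are identified by positive integer codes so that 0 can mark an empty slot. The helper name `store_clusters` and the sample codes are hypothetical.

```python
def store_clusters(clusters):
    """Pack p clusters into a p x max_k matrix, padding short rows with 0."""
    max_k = max(len(c) for c in clusters)  # column number: largest cluster size
    return [list(c) + [0] * (max_k - len(c)) for c in clusters]


# Hypothetical attraction codes (1-based so that 0 can mean "empty slot").
clusters = [[3, 7, 9], [1, 4], [2, 5, 6, 8]]
K = store_clusters(clusters)

p = len(K)                              # number of clusters (rows)
max_k = len(K[0])                       # number of columns
k = sum(len(c) for c in clusters)       # number of stored attractions
zeros = sum(row.count(0) for row in K)  # padded slots
```

In this example p = 3, maxk (i) = 4, and k = 9, so the number of zero entries is p × maxk (i) − k = 3, in line with the count given in the text.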
In the improved AGNES clustering algorithm, in a single bottom-up gathering pass, a seed point element s (i) * is chosen as the tourist attraction that represents a certain cluster S (i) . The element s (i) * is taken as the criterion for calculating and judging the objective function ξ (s (i1) ,s (i2) ), which confirms the next element to be gathered into the cluster. The tourist attractions that are not the seed point are noted as ¬ s (i) * .
In one bottom-up gathering pass, if a point ¬ s (i) * belongs to the cluster S (i) , the point ¬ s (i) * is absorbed into the cluster S (i) and the edge l(s (i) , ¬ s (i) ) connecting s (i) * and ¬ s (i) * is generated. When the clustering algorithm ends, the k (i) tourist attractions and the k (i) − 1 gathered topological edges l(s (i) , ¬ s (i) ) in the cluster S (i) form a cluster structure tree T r (S (i) ) . The spatial range that is expanded by the tree T r (S (i) ) forms the cluster spatial buffer r a(S (i) ) . The edge l(s (i) , ¬ s (i) ) and the tree T r (S (i) ) visualize the process of the improved AGNES algorithm, and the buffer r a(S (i) ) is the visualized range of each cluster. Since the objective function ξ (s (i1) ,s (i2) ) contains both feature attributes and spatial attributes, different buffers r a(S (i) ) may intersect. Figure 2 shows the spatial relationship among the cluster S (i) topological edge l(s (i) , ¬ s (i) ), the cluster structure tree T r (S (i) ) , and the cluster spatial buffer r a(S (i) ) . Figure 2a is an edge l(s (i) , ¬ s (i) ), Figure 2b is the tree T r (S (i) ) that is formed by several edges l(s (i) , ¬ s (i) ), and Figure 2c is the buffer r a(S (i) ) that is formed by the cluster structure tree T r (S (i) ) .

Figure 2. The spatial relationship among the cluster S (i) topological edge l(s (i) , ¬ s (i) ), cluster structure tree T r (S (i) ) , and the cluster spatial buffer r a (S (i) ) . (a) is an edge l(s (i) , ¬ s (i) ), (b) is the tree T r (S (i) ) that is formed by several edges l(s (i) , ¬ s (i) ), and (c) is the buffer r a (S (i) ) that is formed by the cluster structure tree T r (S (i) ) .
According to the modeling principle, the improved AGNES clustering algorithm was created. The smart system searches for the optimal tourist attractions and tour routes using the p clusters together with the tourists' interests, time budget, cost budget, etc. The pseudo-code of the process to create the improved AGNES clustering algorithm (Algorithm 2) is as follows:

Algorithm 2: The process to create the improved AGNES clustering algorithm
Step 1: Create ξ(s (i1) , s (i2) ) to store S to (•) and its descending S to (•) d with element S to(i,j) . The value ξ(s (i1) , s (i2) ) is stored in descending order from the first row and first column to the last one.
Sub-step 5: The p number of a (i) with the maximum value S to(i,j) d as s (i) * .
Step 5: Expand the tree T r(S (i) ) from l( ¬ s (i) * , s (i) * ) with a radius range r and form the cluster spatial buffer r a(S (i) ) .
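For orientation, the bottom-up gathering idea behind AGNES can be sketched with the plain, single-linkage version of the algorithm. This is not the improved variant described above (it has no seed points s (i) *, structure trees, or spatial buffers); it only shows how clusters are merged by a Minkowski-distance criterion. The function names, the order `p`, and the sample points are assumptions for illustration.

```python
from itertools import combinations


def minkowski(x, y, p=2.0):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def agnes(points, n_clusters, p=2.0):
    """Plain single-linkage AGNES: repeatedly merge the two closest clusters."""
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and len(points)")
    # Start with every point in its own cluster (bottom of the hierarchy).
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Single linkage: distance between the closest pair of members.
            d = min(
                minkowski(points[i], points[j], p)
                for i in clusters[a]
                for j in clusters[b]
            )
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])  # absorb cluster b into cluster a
        del clusters[b]                  # safe because a < b
    return [sorted(c) for c in clusters]


# Hypothetical attribute vectors (illustrative values, not the Table 1 data).
sample_points = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
print(agnes(sample_points, n_clusters=3))
```

In the improved algorithm, the criterion would be the objective function ξ(s (i1) , s (i2) ) over both feature and spatial attributes rather than the raw distance used in this sketch.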
The proposed AGNES clustering algorithm differs significantly from those used in previous research (see the Introduction section). First, the aim is different. The proposed method classifies city tourist attractions: it extracts the correlation among different tourist attractions, calculates the degree of correlation between each pair of tourist attractions, and outputs the tourist attraction clusters. This clustering process is the critical step in matching tourists' interests to the tourist attractions' attributes. The previous methods did not address tourist attraction clustering; they mainly sought to identify tourist clusters, tourists' behaviour, the relationship between the collaborative economy and tourism, etc. Second, the parameters used in developing the AGNES model are different. Besides the spatial attributes, the proposed AGNES algorithm improves the clustering criterion function by adding the tourist attractions' feature attributes, which makes the clustering more logical, since the tourist attractions are grouped to match the tourists' interests. The previous methods applied the AGNES algorithm directly on the basis of spatial attributes such as longitude and latitude. Third, since the proposed AGNES algorithm is an improved method, the detailed modeling steps are provided in this paper; they are an important part and precondition of the whole research work. In previous studies, the AGNES algorithm was merely used as a tool, without a detailed algorithm modeling process.

Tour-Route-Recommendation Algorithm Based on the Space-Time Deduction
The selection of tourist attractions and the design of the tour route are the two critical factors for any tour itinerary. Tourists must choose the tourist attractions that best match their interests and then plan the most reasonable route based on their selections. Time and cost are always limitations, and both travel and participation contribute significantly to them. Therefore, with the available time and financial expectations as fixed conditions, a "smart" system should be able to recommend attractions that best match an individual's preferences and to optimize the transportation route. Since the mode of transportation depends largely on the tourists themselves, it was a crucial factor for consideration when developing our model [36,37].

Tourist Attraction Reachability Space Model Based on Interest Matrix and Geographical Position
The precondition for the smart system to recommend tour routes to tourists was obtaining the tourist-interest data. The interest data were set as the input labels and then matched with the tourist attraction attributes. Each tourist attraction has a different capacity to satisfy a tourist's interests. This capacity was defined as the reachable capacity, and its value dictates how likely the system is to recommend the attraction. Therefore, creating a reachability space model between the tourist-interest data and the tourist attractions was the precondition for searching for the tourist attractions that would best satisfy an individual's interests [38,39].
The tourist interest label vector n 1(i1) and the spatial positioning vector n 2(i2) have the same dimensions as the vectors t 1(i1) and t 2(i2) , and they represent the tourist-interest data. The variable n 1(i1) is a 1 × u (i1) dimension vector. The interest label factor n 1(i1) contains u (i1) items of different classifying indices n 1(i1,j1) with j 1 ∈ (0, α] ⊂ Z + , where j 1 is the subscript of the index n 1(i1,j1) of the factor n 1(i1) and α is the maximum number of indices. The variable n 2(i2) is a 1 × u (i2) dimension vector. The spatial positioning factor n 2(i2) contains u (i2) items of different classifying indices n 2(i 2 ,j 2 ) with j 2 ∈ (0, α] ⊂ Z + , where j 2 is the subscript of the index n 2(i 2 ,j 2 ) of the factor n 2(i2) . There are m vectors n 1(i1) and n vectors n 2(i2) .
The starting point of one tour route for the tourist is S t . The point S t determines the dimension and specific values of the spatial location vector n 2(i2) . The matrix N is formed by m feature attribute label vectors n 1(i 1 ) and n spatial attribute label vectors n 2(i2) , and it represents the tourists' interest tendency. Each row of the matrix is a vector n 1(i 1 ) or n 2(i2) , and each column holds a specific element of that vector. The matrix contains m + n rows and α columns. Rows No.1 to No.m relate to the vectors n 1(i 1 ) ∼ i 1 ∈ (0, m] ⊂ Z + , and rows No.m + 1 to No.m + n relate to the vectors n 2(i2) ∼ i 2 ∈ (0, n] ⊂ Z + . When the tourist interest data are confirmed, an arbitrary row n 1(i 1 ) holds one attribute element value n 1(i 1 ,j 1 ) , j 1 ∈ (0, α] ⊂ Z + , and its other elements are 0. Equation (8) is the general formula of N and its specific elements.

$$
N = \begin{bmatrix} n_{1(1)} \\ \vdots \\ n_{1(m)} \\ n_{2(1)} \\ \vdots \\ n_{2(n)} \end{bmatrix}
= \begin{bmatrix}
n_{1(1,1)} & n_{1(1,2)} & \cdots & n_{1(1,\alpha)} \\
\vdots & \vdots & & \vdots \\
n_{1(m,1)} & n_{1(m,2)} & \cdots & n_{1(m,\alpha)} \\
n_{2(1,1)} & n_{2(1,2)} & \cdots & n_{2(1,\alpha)} \\
\vdots & \vdots & & \vdots \\
n_{2(n,1)} & n_{2(n,2)} & \cdots & n_{2(n,\alpha)}
\end{bmatrix} \tag{8}
$$

The matrix N elements are related to the matrix T elements, including tourist classification n 1(1) , popularity degree n 1(2) , travel time n 1(3) , traveling fee n 1(4) , longitude n 2(1) , and latitude n 2(2) . The spatial location vector n 2(i2) of the matrix N is determined by the longitude and latitude of the point S t .
The correlation between the tourists' interest and the tourist attraction attributes is determined by the interest quantitative matching objective function ξ (N,T) . Transfer the feature attribute label vector normalized parameter δ 1(i 1 ) and take it as the parameter to create the function ξ (N,T) , then confirm the tourist-interest data. Traverse j 1 , j 2 ∼ (0, α], search and extract the non-zero elements δ 1(i 1 ) · n 1(i 1 ,j 1 ) and n 2(i 2 ,j 2 ) in the matrix N label vector δ 1(i 1 ) · n 1(i 1 ) and n 2(i2) . Transpose the matrix N and generate the matrix N T . Create the norm relationship of the Minkowski distance between the tourists' interests and the tourist attractions, shown in the Equations (9) and (10). Calculate the function ξ (N,T) between the matrix N and the matrix T. Use the interest quantitative matching objective function matrix P ξ(N,T) to store the function value.
When the interest data remain the same, the tourist attractions s (i) in different clusters generate different function values ξ (N,T) . The values ξ (N,T) are stored in the matrix P ξ(N,T) in the sequence of the cluster S (i) subscripts, in ascending order from the first row and column to the last one. The interest data confirmed by a tourist contain the longitude and latitude of the starting point S t . Taking the point S t as the center, each row of the matrix P ξ(N,T) represents the correlation between the tourist attractions and the interest data, and also represents the reachability extent of the tourist attractions.
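The storage rule described above (ascending function values, cluster by cluster) can be sketched as follows. This is a minimal sketch under stated assumptions: the matching value is taken to be a plain Minkowski distance between an interest vector and each attraction's attribute vector, which may differ from Equations (9) and (10), and all names and numbers are hypothetical.

```python
def minkowski(x, y, p=2.0):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def rank_within_clusters(interest, attributes, clusters, p=2.0):
    """For each cluster, list (attraction, value) pairs in ascending order of value.

    interest   -- the tourist's interest vector (assumed already normalized)
    attributes -- dict: attraction name -> attribute vector of the same dimension
    clusters   -- list of lists of attraction names, in cluster order
    """
    ranked = []
    for members in clusters:
        scored = [(name, minkowski(interest, attributes[name], p)) for name in members]
        # A smaller value means the attraction is closer to the tourist's interests.
        scored.sort(key=lambda pair: pair[1])
        ranked.append(scored)
    return ranked


# Hypothetical data for illustration only.
interest_vector = (1, 5)
attraction_attributes = {"A": (1, 4), "B": (6, 1), "C": (3, 5)}
attraction_clusters = [["B", "A"], ["C"]]
for row in rank_within_clusters(interest_vector, attraction_attributes, attraction_clusters):
    print([name for name, _ in row])
```

Each inner list plays the role of one row of the matrix P ξ(N,T) : the first entry is the attraction in that cluster that is most reachable for the given interest data.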

The Dynamic Space-Time Deduction Algorithm Based on the Travel Time and Cost
During a city tour, tourists expect to visit several tourist attractions in one day. They have different interests and levels of desire to visit the various kinds of tourist attractions, each of which involves a different time investment and associated cost. Therefore, when the time and cost are fixed, the number of tourist attractions to be visited must be finite. According to the tourist attraction reachability model, the smart system would formulate a tour route that best matches tourist interests while meeting the time and cost conditions. Furthermore, depending on the mode of transportation that was chosen, the goal of saving time and cost could result in better attraction recommendations and optimized route planning. Since travel time would be directly affected by the mode of transportation and the route between attractions, the precondition for the dynamic space-time deduction of the tour-route-recommendation algorithm was a path search with the lowest cost [37,38].

The Shortest-Path-Searching Algorithm Based on the Space-Vector Lattice
After visiting a tourist attraction, tourists will move to the next one. This movement involves three elements. First, tourists will use a transportation mode such as walking, cycling, taxi service, etc. Second, they will travel along city roads to the destination. Third, the moving process will consume time and cost.
The traffic space between two tourist attractions is the tourist attraction traffic subspace Φ. The space Φ is the interval from point A to point B, and it is a vector space with coordinates, as shown in Figure 3. The bottom-left dot of the square Φ is the coordinate origin. Each line represents an abstract city road, and each line intersection a (i) represents a road intersection. In Figure 3, the space Φ contains all city roads between the two points A and B. The road distances d is (a (i) , a (j) ) of the edges CD, DE, EF, and CF in the small square CDEF may differ.

Figure 3. The shortest-path algorithm searching process. (b) is the spatial road-lattice relationship as well as the searching process for the series that is contained in the square S qu (A, a (1) , a (5) , a (6) ). (c) is the spatial road-lattice relationship as well as the searching process for the series that is contained in the square S qu (A, a (2) , a (10) , a (12) ).
Starting from the point A, search the path along the road until the point B is reached; in the whole process, all the searched points are listed in the spatial searching series S eq . The searching series S eq represents a reachable path that is related to a searched distance d is (S eq ). Figure 3 shows the shortest-path algorithm searching process.
The pseudo-code of the process to create the shortest-path algorithm (Algorithm 3) is shown as follows. This searching mode considers all the city roads and intersections between two points and finds the global minimum value, which may be more precise than the other shortest-path algorithms. The shortest path may reduce time and costs and thus increase the number of tourist attractions to be visited.
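To make the lattice idea concrete, the following sketch finds the shortest road distance from A (bottom-left intersection) to B (top-right intersection) on a rectangular lattice whose road segments have different lengths. It is a simplified dynamic-programming stand-in, not Algorithm 3 itself: it assumes that a route only ever moves right or up (a monotone path), and the names `h`, `v`, and `shortest_monotone_path` as well as the sample lengths are hypothetical.

```python
def shortest_monotone_path(h, v):
    """Shortest right/up path from A = (0, 0) to B = (rows - 1, cols - 1).

    h[r][c] -- length of the road from intersection (r, c) to (r, c + 1)
    v[r][c] -- length of the road from intersection (r, c) to (r + 1, c)
    Returns (distance, list of visited intersections).
    """
    rows = len(v) + 1      # number of intersection rows
    cols = len(h[0]) + 1   # number of intersection columns
    inf = float("inf")
    dist = [[inf] * cols for _ in range(rows)]
    prev = [[None] * cols for _ in range(rows)]
    dist[0][0] = 0.0
    for r in range(rows):
        for c in range(cols):
            # Arrive from the left neighbour along a horizontal road.
            if c > 0 and dist[r][c - 1] + h[r][c - 1] < dist[r][c]:
                dist[r][c] = dist[r][c - 1] + h[r][c - 1]
                prev[r][c] = (r, c - 1)
            # Arrive from the lower neighbour along a vertical road.
            if r > 0 and dist[r - 1][c] + v[r - 1][c] < dist[r][c]:
                dist[r][c] = dist[r - 1][c] + v[r - 1][c]
                prev[r][c] = (r - 1, c)
    # Walk the predecessor links back from B to A to recover the series.
    path, node = [], (rows - 1, cols - 1)
    while node is not None:
        path.append(node)
        node = prev[node[0]][node[1]]
    return dist[rows - 1][cols - 1], path[::-1]


# Hypothetical 3 x 3 lattice of intersections (illustrative road lengths).
horizontal = [[1, 5], [1, 1], [4, 1]]
vertical = [[1, 2, 1], [3, 1, 2]]
distance, series = shortest_monotone_path(horizontal, vertical)
print(distance, series)
```

The returned list corresponds to one searching series S eq , and the returned distance to its d is (S eq ). A search that also allows detours to the left or downward would need a general shortest-path method instead of this restricted recurrence.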

The Dynamic Space-Time Deduction Tour-Route-Searching Algorithm
Once the tourist attractions have been identified based on tourist interests, the travel time and cost determine the number of tourist attractions to be visited and the optimal travel route, all of which influence a tourist's overall experience. Based on the matrix P ξ(N,T) and the fixed time and cost conditions, the dynamic space-time deduction tour-route-searching algorithm was created. The basic process of the algorithm is as follows. Using a daytrip as an example, a tourist confirms the travel time budget t (unit: hour) and the traveling fee c (unit: CNY yuan), and then chooses one transportation mode. Starting from the point S t , the algorithm searches the shortest path between the point S t and each tourist attraction and confirms the travel time using the chosen transportation mode. It iterates the visiting time at the tourist attractions and calculates the travel costs and any entrance or activity fees. It then finds the minimum time and cost between the point S t and all the tourist attractions and sets the related point as the first tourist attraction K (1) to be visited. Starting from the point K (1) , it searches for the next tourist attraction K (2) with the minimum travel time and cost, and so on, until the total travel time or cost reaches the preset time budget t or cost budget c; the optimal tour route is then defined.

Algorithm 3: The process to create the shortest-path algorithm
Step 2: Take the points a (6) , a (12) , a (18)
Step 5: Continue searching until the square S qu (A, a (4) , a (20) , B) is finished. Find the minimum min d is (S eq ) (4) between A and B.
In a tour, the road interval from the point A to the point B is a cost iteration sub-unit Q (i) . This sub-unit is the basic unit of the dynamic deduction process. The time consumption t (i) of the sub-unit is the sum of the visiting time at the tourist attraction B and the travel time from the point A to B. The cost consumption c (i) of the sub-unit is the sum of the visiting fees of the tourist attraction and the travel costs from the point A to B. Starting from the point S t , the tourist passes through n sub-units Q (i) and is finally deduced to the terminal tourist attraction P; in this process, the total time and costs are noted as the dynamic deduction time ∆t and the dynamic deduction cost ∆c, as shown in Equation (11):

$$
\Delta t = \sum_{i=1}^{n} t_{(i)}, \qquad \Delta c = \sum_{i=1}^{n} c_{(i)} \tag{11}
$$

A 1 × k dimension vector T s is used to consistently store the tourist attractions that represent the tour route after the searching process. The sequence in which the vector T s stores the tourist attractions obeys the algorithm rule, and the empty elements are 0. A 1 × k dimension vector ∆Ts is used to dynamically store the tourist attractions during the searching process, and its empty elements are 0. The pseudo-code of the process to create the tour-route-searching algorithm (Algorithm 4) is as follows:

Algorithm 4: The process to create the tour-route-searching algorithm
Step 1: Confirm the transportation mode m o , starting point S t , time budget t, and cost budget c. Variable m o = 1 represents taking the bicycle, m o = 2 represents taking the taxi, and m o = 3 represents taking the public bus.
Step 6: Confirm Q (i) and search the shortest path in Q (i) . Calculate t (i) and c (i) of Q (i) and judge whether ∆t ≤ t ∧ ∆c ≤ c. Continue or stop.
Step 7: Continue searching until ∆t > t or ∆c > c. Output the tour route.
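The deduction loop in the surviving steps above can be sketched as a greedy search. This is a minimal sketch under stated assumptions, not the authors' implementation: candidates are compared by sub-unit time first and cost second (the text does not specify how the two are combined), the `travel` callback stands in for the shortest-path search with the chosen transportation mode, and all names and numbers are hypothetical.

```python
def plan_tour(start, attractions, travel, time_budget, cost_budget):
    """Greedy deduction of a tour route under time and cost budgets.

    attractions -- dict: name -> (visiting hours, visiting fee)
    travel      -- function (origin, destination) -> (travel hours, travel cost)
    Returns (route, total time, total cost).
    """
    route, total_t, total_c = [], 0.0, 0.0
    current = start
    remaining = dict(attractions)
    while remaining:
        def sub_unit(name):
            # t(i), c(i): travel to the attraction plus the visit itself.
            travel_hours, travel_cost = travel(current, name)
            visit_hours, visit_fee = remaining[name]
            return travel_hours + visit_hours, travel_cost + visit_fee

        # Assumed rule: smallest time first, cost as tie-breaker.
        best = min(remaining, key=sub_unit)
        t_i, c_i = sub_unit(best)
        if total_t + t_i > time_budget or total_c + c_i > cost_budget:
            break  # adding this sub-unit would exceed a budget
        route.append(best)
        total_t += t_i
        total_c += c_i
        current = best
        del remaining[best]
    return route, total_t, total_c


# Hypothetical example: places on a line, 0.5 h and 2 yuan per distance unit.
positions = {"S": 0, "A": 1, "B": 3, "C": 10}
visits = {"A": (2.0, 10.0), "B": (1.5, 20.0), "C": (3.0, 50.0)}


def travel_by_position(origin, destination):
    gap = abs(positions[origin] - positions[destination])
    return gap * 0.5, gap * 2.0


print(plan_tour("S", visits, travel_by_position, time_budget=6.0, cost_budget=100.0))
```

The accumulated totals play the role of ∆t and ∆c, and the returned route corresponds to the non-zero elements of the vector T s .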

Sample Experiment and Result Analysis
To verify the advantages of the proposed algorithm, the tourism city of Chengdu was selected as the subject of the experiment. The basic idea of the experiment was as follows. First, 15 popular tourist attractions in Chengdu City were selected. All the tourist attraction feature attributes and spatial attributes were confirmed and quantified. According to the tourist attraction attributes, we used the proposed clustering algorithm to obtain tourist attraction labels and clusters, cluster structure trees, and cluster spatial buffers. Based on these clusters, the tourist-interest data were obtained and the quantified interest-matching objective function matrix was created. According to the tourist time and cost allowances as well as the preferred mode of travel, the tourist attractions and tour routes were analyzed for optimal matches. For the tour-route optimization, the experiment chose two frequently used shortest-path-searching algorithms as a control group to verify the advantages of our proposed algorithm.

…; s (11) : Tazishan Park; s (12) : Eastern Suburb Memory; s (13) : SM Square; s (14) : Chengdu Zoo; s (15) : Raffles Square}.

Table 1 shows the quantified feature attributes and spatial attributes of each tourist attraction. The symbol t 1(1) represents the classification, t 1(2) represents the popularity, t 1(3) represents the best travel time, t 1(4) represents the traveling fee, t 2(1) represents the longitude, and t 2(2) represents the latitude. Based on the Table 1 data, the proposed improved AGNES spatial clustering algorithm was performed to generate tourist attraction clusters. Table 2 shows the analyzed results of the clustering objective function ξ(s (i1) , s (i2) ) values among the tourist attractions. In the clustering process, the cluster structure trees and cluster spatial buffers were generated, as shown in Figure 4.
Figure 4a is the tourist attraction distribution, and Figure 4b-d are the visualization results of the structure trees and spatial buffers for the clusters S (1) -S (3) .

The Output Result of the Tourist Attractions and Tour Route
Considering the daytrip example, we chose two tourists as the research objects. Table 3 shows the attribute label values based on the tourist interests. The last two indices were the longitude and latitude of the starting point for each tourist. The first tourist sample T (1) chose to use a bicycle for transportation, while the second tourist sample T (2) chose to use a taxi service.

The Analyzed Results of the Interest-Matching Objective Function Values
Based on the output cluster results and the tourist interest data, the interest-matching objective function ξ (N,T) values between the tourist interests and each tourist attraction were calculated, as shown in Table 4.

The Sequencing Results of Interest-Matching Objective Function Values
Based on the data shown in Table 4, the function ξ (N,T) values were arranged in ascending order within each cluster, following the cluster sequence, as shown in Table 5.

Table 4. The interest-matching objective function ξ (N,T) values between the tourist samples and each tourist attraction.

Table 5. The interest-matching objective function ξ (N,T) ascending values between the tourist samples and the cluster tourist attractions.
(3) Figure 5c shows the function ξ (N,T) value of the first tourist according to the tourist attraction storage sequence of the interest-matching objective function matrix Pξ(N, T). (4) Figure 5d shows the function ξ (N,T) value of the second tourist according to the tourist attraction storage sequence of the interest-matching objective function matrix Pξ(N, T).
In Figure 5c,d, the tourist attraction objective function values in each cluster are listed in ascending order, following the cluster sequence. The red curve represents the cluster S (1) , the blue curve represents the cluster S (2) , and the green curve represents the cluster S (3) .

The Results of the Tourist Attractions and Tour-Route Planning
According to the data in Table 3 for a one-day tour, the travel-time allowance for the first tourist sample was 9 h and the cost budget was CNY 300 yuan. The travel-time allowance for the second tourist sample was 11 h and the cost budget was CNY 500 yuan. The first tourist chose to take the bicycle while the second tourist chose to take the taxi.
Based on the proposed algorithm, a potential tourist attraction itinerary and tour route were identified for each tourist sample according to their interests and chosen modes of transportation (i.e., bicycle and taxi service, respectively), as shown in Table 6.

Table 6. The tourist attractions and tour route that best match the tourists' interests.

Table 6 shows the results of the tourist attraction elements T s(i) of the tour-route-searching steady vector T s and the cost iteration sub-units Q (i) . The values were the required time (unit: hour) and the minimum cost (unit: CNY yuan) to visit the tourist attractions. The values between two tourist attractions represented the travel time (unit: hour) and minimum travel cost (unit: CNY yuan) in the cost iteration sub-unit Q (i) under the chosen transportation mode. The table also shows the optimal tourist attractions and tour route based on the tourists' requirements.

The Comparison Results of the Algorithms
To verify the results of the proposed algorithm, a set of control algorithms was run under the same experimental conditions, and their results were compared with those of the proposed algorithm.

Selecting and Confirming of the Control Algorithms
In tourism research, shortest-path algorithms such as the Dijkstra and A* algorithms have typically been used to plan tour routes with the shortest traveling distances. They have the benefits of being easily accessed and applied [40][41][42]. In addition, the shortest-path algorithms are also constrained by tourism factors such as features and spatial attributes. Once the traveling distances between the tourist attractions have been defined by the city roads and road nodes, the shortest-path algorithms can operate. Since the proposed algorithm's experimental environment conforms to these conditions, the Dijkstra algorithm and the A* algorithm were chosen as controls to plan the travel routes for the subspace Φ, and the control group algorithms were defined as Algorithm 1 (A1) and Algorithm 2 (A2). Under the same conditions of algorithm operating time and interest data of the two tourist samples, the control group algorithms were used to dynamically search the same tourist attractions, cost iteration sub-units, and tour routes. Their results were then compared with those of the proposed algorithm (PA), as shown in Table 7, in which the first tourist chose cycling and the second tourist chose a taxi service. Table 7 shows the tourist attraction elements T s(i) of the steady vector T s and the cost iteration sub-units Q (i) under each algorithm. The values between two tourist attractions represent the travel time (unit: hour) and minimum moving cost (unit: CNY yuan) in the cost iteration sub-unit Q (i) with the chosen transportation modes. According to Table 7, the Figure 6 curve results were as follows:

Table 7. The tourist attractions and the tour routes that best match tourist interests under the condition of the three algorithms.

With regard to computer algorithm optimization, the Dijkstra algorithm has low efficiency when searching for the shortest route. Compared to the Dijkstra algorithm, the A* algorithm introduces a heuristic function, which improves the algorithm efficiency to some extent. In comparison, the proposed algorithm is based on multiple-dot parallel searching; it has higher operating efficiency and consumes less operating space than the Dijkstra and A* algorithms. Table 8 compares the Dijkstra algorithm, the A* algorithm, and the proposed algorithm with regard to time complexity (TC) and space complexity (SC). The data in the table show the TC and SC examples when the number of tourist attractions is n = 4, n = 5, and n = 6. The symbol ρ 1,1 represents the TC ratio between the Dijkstra algorithm and the proposed algorithm, and the symbol ρ 1,2 represents the SC ratio between the Dijkstra algorithm and the proposed algorithm. The symbol ρ 2,1 represents the TC ratio between the A* algorithm and the proposed algorithm, and the symbol ρ 2,2 represents the SC ratio between the A* algorithm and the proposed algorithm.

After analyzing the Table 1 data, the following conclusions were reached.
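The efficiency difference between the two control algorithms mentioned above can be illustrated with a small sketch. It runs the textbook Dijkstra and A* searches on a hypothetical unit-cost grid and counts how many nodes each one expands; it does not implement the proposed algorithm and does not reproduce the Table 8 figures. The grid, the Manhattan-distance heuristic, and all names are assumptions for illustration.

```python
import heapq


def grid_search(grid, start, goal, use_heuristic):
    """Dijkstra (use_heuristic=False) or A* (use_heuristic=True) on a 4-connected grid.

    grid[r][c] == 0 marks a passable cell; every move costs 1.
    Returns (path cost, number of expanded nodes); cost is None if unreachable.
    """
    def h(node):
        # Manhattan distance never overestimates the cost on a unit-cost grid.
        if not use_heuristic:
            return 0
        return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

    frontier = [(h(start), 0, start)]  # entries are (f = g + h, g, node)
    best_g = {start: 0}
    closed = set()
    expanded = 0
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if node in closed:
            continue  # stale queue entry
        closed.add(node)
        expanded += 1
        if node == goal:
            return g, expanded
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
    return None, expanded


# Hypothetical open 5 x 5 grid with start and goal on the same row.
open_grid = [[0] * 5 for _ in range(5)]
dijkstra_cost, dijkstra_expanded = grid_search(open_grid, (0, 0), (0, 4), use_heuristic=False)
astar_cost, astar_expanded = grid_search(open_grid, (0, 0), (0, 4), use_heuristic=True)
print(dijkstra_cost, dijkstra_expanded)
print(astar_cost, astar_expanded)
```

Both searches return the same path cost, but the heuristic lets A* expand fewer nodes in this example, which is the kind of efficiency gain the paragraph above refers to.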

The Comparison Results of the Proposed Algorithm with the Control Algorithms
(1) The tourist attractions that were chosen in the experiment conformed to the preset conditions of the proposed algorithm. (2) The tourist attractions were popular tourist attractions in Chengdu City with typical features and spatial attributes. Each tourist attraction had different attributes that affected its capacity to satisfy varying tourist interests, which formed the clustering condition. In addition, the attractions were located across the city, which formed the spatial modeling condition. The Table 1 data were used to generate clusters and interest-matching objective functions. The data were normalized values that were processed by the feature attribute label vector normalization parameter δ 1(i1) , and they conformed to the proposed algorithm conditions.

The Analysis and Conclusion of the Results of Clustering and Cluster Visualization
After analyzing Section 4.2, Table 2 data, and Figure 4, the following conclusions were reached.
(1) In Table 2, the clustering objective function values between any two different tourist attractions were all different. They represented the degree of correlation between the two tourist attractions. (2) Via the proposed algorithm, three tourist attraction clusters were identified.
(3) The tourist attractions in the same cluster had a high degree of correlation among their attributes, and their objective function values were relatively small. The tourist attractions in different clusters had a low degree of correlation among their attributes, and their objective function values were relatively large. (4) Figure 4 shows the three clusters that were formed, together with the cluster structure trees and cluster spatial buffers that were constrained by the proposed algorithm.
① Figure 4a shows the distribution of all the tourist attraction samples with note labels. They were spatially discrete. ② As to the inner tree structure of the clusters: in Figure 4b, the topological connecting lines among the tourist attractions formed the first structure tree and indicated the searching process of the first cluster. In Figure 4c, the topological connecting lines among the tourist attractions formed the second structure tree and indicated the searching process of the second cluster. In Figure 4d, the topological connecting lines among the tourist attractions formed the third structure tree and indicated the searching process of the third cluster. ③ As to the structure of the cluster buffer: in Figure 4b, the closed brown space was the first cluster spatial buffer and indicated the spatial range of the first cluster. In Figure 4c, the closed blue space was the second cluster spatial buffer and indicated the spatial range of the second cluster. In Figure 4d, the closed green space was the third cluster spatial buffer and indicated the spatial range of the third cluster.
(5) Since the proposed algorithm incorporated spatial attributes, the three structure trees and buffers each had different shapes and topological tendencies, which visually indicated that the clusters differed not only in their feature attributes but, to a larger degree, in their spatial attributes. In addition, the three clusters overlapped spatially within the city range. This indicated that the correlation among the tourist attractions in the clusters was relative rather than isolated, so the tour routes that were formed could pass through different clusters.

The Analysis and Conclusion on the Results of the Tourist Attractions and Tour Route
After analyzing Section 4.3, Tables 3-6 data, and Figure 5, the following conclusions were reached.
(1) Table 3 shows the normalized data of the two tourist interest labels. It indicated that the two tourists had completely different interests. In addition, the starting points for the two tourists were different. According to the proposed algorithm, the precondition in Table 3 differentiated the results of the two tourists and followed the experimental logic.
(2) Table 4 shows the interest-matching objective function values between the two tourist samples and each tourist attraction. ① The values differed because of the Table 3 preconditions and the operation of the proposed algorithm. This indicated that each tourist attraction's capacity to satisfy a tourist's interests was different. The tourist attraction with the stronger capacity would be preferentially selected for the tour route. ② Upon further interpretation, the smaller the value, the closer the tourist attraction attributes were to the tourist's interests and the more likely the tourist attraction was to satisfy the tourist. Conversely, the larger the value, the more remote the tourist attraction attributes were from the tourist's interests and the less likely the tourist attraction was to satisfy the tourist.
(3) Table 5 was deduced from Table 4. It shows that each tourist attraction was stored in the matrix P ξ(N,T) in ascending order of the interest-matching objective function value. This indicated the ranking of each tourist attraction's capacity to satisfy tourist interests. (4) Figure 5 was deduced from Tables 3-5, and it indicated that the interest-matching objective function values fluctuated and that the tourist attraction capacities varied with the subscripts in each cluster.
As to the first tourist, the interest-matching objective function values are shown in Figure 5a,c: (1) Tourist attraction s (6) : East Lake Park had the highest matching function value, and we interpreted that it had the lowest capacity for satisfying the tourist's interests. (2) Tourist attraction s (14) : Chengdu Zoo had the lowest matching function value, and we interpreted that it had the highest capacity for satisfying the tourist's interests. … for satisfying the tourist's interests; the tourist attraction s (7) : Wenshu Temple had the lowest capacity for satisfying the tourist's interests. (5) In cluster S (3) , tourist attraction s (14) : Chengdu Zoo had the highest capacity for satisfying the tourist's interests; the tourist attraction s (6) : East Lake Park had the lowest capacity for satisfying the tourist's interests.
(6) Table 6 indicated the tour route output results that were based on the preconditions in Tables 3-5. The following conclusions were reached. ① The tourist attractions of the two tour routes all matched the tourist interests. ② The recommended tour route for the first tourist was 8.77 h long and cost CNY 33 yuan. We interpreted that the proposed algorithm's tour route conformed to the tourist's requirements. ③ The recommended tour route for the second tourist was 10.25 h long and cost CNY 136 yuan. We interpreted that the proposed algorithm's tour route conformed to the tourist's requirements. ④ The total time and costs were within the ranges of the tourists' allowances and met their needs. We interpreted that the algorithm was feasible and accurate.

The Analysis and Conclusion on the Comparison Result of the Algorithms
After analyzing Section 4.4, Table 7, Table 8, and Figures 6 and 7, the following conclusions were reached.
(1) The controls were the Dijkstra and A* algorithms, as both have been used extensively for shortest-path calculations and for planning optimal tour routes. Therefore, the control group algorithms were feasible, accessible, and comparable. (2) Because the two tourists had different preconditions on interests, starting points, time, costs, and chosen modes of transportation, the first tourist had three recommended tourist attractions while the second tourist had four. We interpreted that when the preconditions changed, the results of the proposed algorithm and the control algorithms changed as well. (3) All three algorithms produced fluctuating time durations and costs for visiting the tourist attractions and traveling between two tourist attractions, and each algorithm resulted in different values for these variables. The differences were caused by the preconditions of the tourists' needs, the tourist attraction attributes, and the city geospatial environment, as well as by the different performances of the three algorithms. ① The tour routes by the Dijkstra and A* algorithms took longer and were more expensive than those by the proposed algorithm. We interpreted that the proposed algorithm had an advantage in saving time and costs when planning tour routes, as compared to the controls. ② From Table 8, it can be concluded that the three algorithms performed differently. In terms of computational performance, when searching for the shortest tour route, the proposed algorithm had much lower time complexity and space complexity than the Dijkstra algorithm, and it had much lower time complexity than the A* algorithm with the same order of space complexity. Through mathematical calculation, the ratio ρ was obtained. When the number of tourist attractions n was larger than 2, the ratios ρ 1,1, ρ 1,2, ρ 2,1, and ρ 2,2 were all larger than 1. It can be concluded that, once the tourist attractions are confirmed, in searching for the shortest tour route the Dijkstra algorithm always had higher time complexity and space complexity than the proposed algorithm, while the A* algorithm always had higher time complexity and the same order of space complexity as the proposed algorithm. [...] system, they can also find the optimal tour route, but they will consume more computing time and space. In all, the proposed algorithm performed better than the Dijkstra and A* algorithms in searching for optimal tour routes.
(4) From Figure 7, the following conclusions were reached. ① With regard to the first tourist, the proposed algorithm route was 8.77 h long and cost CNY 33. The Dijkstra algorithm route was 9.14 h long and cost CNY 36.5. The A* algorithm route was 9.2 h long and cost CNY 37. We interpreted that the proposed algorithm was superior to the control algorithms. ② With regard to the second tourist, the proposed algorithm route was 10.25 h long and cost CNY 136. The Dijkstra algorithm route was 10.55 h long and cost CNY 147. The A* algorithm route was 10.44 h long and cost CNY 145. We interpreted that the proposed algorithm was superior to the control algorithms. ③ For the first tourist, the time durations of the tour routes that were recommended by the Dijkstra and A* algorithms both exceeded nine hours, and thus the results did not conform to the tourist's time allowance. In this aspect, we interpreted that the control algorithms were inferior to the proposed algorithm.
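The comparison in item (4) can be reproduced arithmetically. The following minimal sketch uses only the durations and costs quoted above and the nine-hour allowance of the first tourist; the function names `extra_over_baseline` and `within_time` are our own illustrative choices and are not part of the proposed algorithm.

```python
# (duration in hours, cost in CNY) for each route, as quoted in item (4).
results = {
    "tourist 1": {"proposed": (8.77, 33.0), "Dijkstra": (9.14, 36.5), "A*": (9.2, 37.0)},
    "tourist 2": {"proposed": (10.25, 136.0), "Dijkstra": (10.55, 147.0), "A*": (10.44, 145.0)},
}


def extra_over_baseline(routes, baseline="proposed"):
    """Extra (hours, CNY) that each control route needs over the baseline route."""
    base_time, base_cost = routes[baseline]
    return {
        name: (round(time - base_time, 2), round(cost - base_cost, 2))
        for name, (time, cost) in routes.items()
        if name != baseline
    }


def within_time(routes, allowance_hours):
    """Flag which routes stay within the given time allowance."""
    return {name: time <= allowance_hours for name, (time, _) in routes.items()}


for tourist, routes in results.items():
    print(tourist, extra_over_baseline(routes))

# The nine-hour figure is the first tourist's time allowance mentioned in item (4).
print(within_time(results["tourist 1"], 9.0))
```

For the first tourist, the control routes take 0.37 h and 0.43 h longer and cost CNY 3.5 and CNY 4 more than the proposed route; for the second tourist, the differences are 0.3 h and 0.19 h, and CNY 11 and CNY 9, respectively.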

Contribution
Based on the current challenges in tour-route planning and attraction recommendations, this study designed a tour-route-planning and recommendation algorithm that was based on an improved AGNES spatial clustering and space-time deduction model. This model improved interest matching, urban-tourist-attraction clustering, space-time deduction, and tour-route planning based on various modes of transportation. By combining the tourist attraction features and spatial attributes, the improved AGNES tourist attraction clustering algorithm was created, and the cluster structure trees, cluster spatial buffers, and clusters were generated. All the tourist attractions with a high degree of correlation among the attributes were clustered together. Based on the tourist-interest data, the interest-matching objective function was created. This function reflected each tourist attraction's capacity for satisfying the tourist's interests, which formed the precondition for planning the tour route. Under the constraint conditions of time and cost allowances, the proposed algorithm searched for the optimal tourist attractions to match the tourist interests and also considered the optimal tour route. The resultant tour routes met the tourists' needs and interests. Based on the comparison results, the proposed algorithm had advantages when compared to the controls. The proposed algorithm reduced the costs and time investment of the planned tour routes. The improved AGNES clustering algorithm considered spatial distance and various tourist attraction attributes. The proposed algorithm integrated mixed (i.e., preferred) transportation modes for different optimized results. Tour-route planning that was based on space-time deduction was an innovative method that not only considered the time and cost constraints, but also considered the shortest traveling distance between two tourist attractions. Therefore, the resultant tour routes satisfied the tourist's interests and reduced the time and costs that were invested by tourists.
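To illustrate the general idea of agglomerative clustering over both spatial distance and attraction attributes, a minimal sketch of the plain (unimproved) AGNES procedure with average linkage is given below. It is not the improved AGNES algorithm of this study: the blended distance, the `weight` parameter, the stopping rule by a target cluster number `k`, and the sample points are all assumptions made for illustration only.

```python
import math


def blended_distance(a, b, weight=0.5):
    """Hypothetical distance mixing spatial and attribute distance.

    Each attraction is a tuple (x, y, attributes), where attributes is a
    tuple of numeric feature values. `weight` is an assumed parameter.
    """
    spatial = math.dist(a[:2], b[:2])
    attribute = math.dist(a[2], b[2])
    return weight * spatial + (1 - weight) * attribute


def agnes(points, k, distance=blended_distance):
    """Plain average-linkage AGNES: merge the two closest clusters until k remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters.
                total = sum(
                    distance(points[p], points[q])
                    for p in clusters[i]
                    for q in clusters[j]
                )
                avg = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or avg < best[0]:
                    best = (avg, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical attractions: two near the origin with one attribute profile,
# two far away with a different profile.
sample = [
    (0.0, 0.0, (1.0, 0.0)),
    (0.1, 0.0, (1.0, 0.0)),
    (5.0, 5.0, (0.0, 1.0)),
    (5.1, 5.0, (0.0, 1.0)),
]
print(agnes(sample, k=2))
```

With these assumed points, attractions that are close in space and similar in attributes end up in the same cluster. This naive version recomputes every pairwise distance at each merge and is therefore only suitable for small examples.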

Addressing Challenges for Research
Smart mobile devices have become part of daily life, and many activities and events are now planned using them. Mobile planning is the key to ensuring efficient routing, resource allocation, and energy management. For example, the researchers in [30] considered that efficient routing, resource allocation, and energy management could be achieved through the clustering of mobile nodes into local groups. In that study, a clustering scheme was developed to prolong the network lifetime by distributing energy consumption among clusters. In [31], a novel travel route recommendation system was proposed that used smartphone and Internet of Things technologies to automatically collect tourists' on-site travel behavior data regarding a specific POI. The tour-route-recommendation algorithm was then created to search and rank the tangible travel routes. The researchers in [32] considered that the prevalence of smart mobile devices and location-based services would lead to an increasing volume of mobility data. Based on big mobile data, they proposed a method for accurately predicting the next location of a traveling object.
In tourism activities, tourists' traveling behaviors also generate massive amounts of data on mobile devices. How to use these data appropriately and accurately is a future challenge for tourism research. Mobile data could be used in tourism data mining, tourist attraction location, research on tourist interest tendencies, tourism facility evaluations, tour-route planning and recommendations, etc. This has been deemed the most important, challenging, and valuable research field for the future. How to precisely optimize mobile data acquisition, mine interest data, match tourists' needs, and search for optimal solutions are challenges that should be addressed.

Limitation and Future Work
When searching for tour routes, the proposed algorithm sets the transportation mode, time allowance, and costs as the constraint conditions. However, the proposed algorithm still has some drawbacks and limitations. First, the AGNES clustering algorithm itself has limitations in efficiency, accuracy, and space complexity. Second, in the tour-route algorithm, the transportation modes were relatively fixed, whereas tourists might choose different transportation modes during the tour. Third, the proposed method did not involve mobile data; we provided a method under the conditions of city tourist attractions' attributes, tourists' specific interests, and an urban tourism environment. Therefore, additional research could expand and validate our proposed algorithm further. First, more precise tourist attraction clustering methods could be studied, which could refine and better target the clustering results based on tourist interests. The clustering objective function criteria and model procedure could be refined further as well. The criteria for selecting the parameters could include more factors to satisfy more individualized interests. Second, the transportation mode selection for the whole tour should be more flexible and random, so that it can reflect tourists' selection tendencies in the different cost deduction sub-units between two tourist attractions. In further research, we will study random transportation mode selection in different sub-units, and a more individualized tour-route-searching algorithm will be designed and proposed. Third, mobile data should be used to mine tourist interests and to integrate specialized interests. To some extent, a smart tourism recommendation system could be set up by mining historical tourists' data and finding related knowledge.

Data Availability Statement:
The data presented in this study are available from the author upon reasonable request.