A Human-Guided Machine Learning Approach for 5G Smart Tourism IoT

With the continuous development of tourism, the integration of the Internet of Things (IoT) into tourism projects is considered a very promising technology. Smart tourism aims to use the IoT to maximize information communication; that is, the IoT technology will become an important element to meet the needs of a new generation of tourists. Therefore, in this study, we propose a human-guided machine learning classification method based on tourist selection behavior. This classification method can effectively help tourists make a decision in choosing a certain tourist destination. The results obtained from the cross-validation experiments and performance evaluation prove the effectiveness of this method.


Introduction
The Internet of Things (IoT) is a part of the future Internet ecosystem that will have a significant impact on the development of healthcare and e-tourism services [1]. Location and geographic context-aware services have become a new area of rapid business development. These services play a key role in the development of IoT scenarios, smart spaces, and proactive solutions. Among the above, smart tourism is one of the most fascinating application areas [2]. In many cities, various social problems, such as non-security, fraud, and lack of appropriate resource information, are the biggest obstacles to tourism. The design of smart tourism can help travelers capture and process all of the data involving users, turning them not only into useful information but also into personalized knowledge [3]. Figure 1 shows the application of the IoT in smart tourism. Our system is designed to develop the tourism industry so as to attract different customer groups and provide users with innovative and ideal communication, information, and entertainment applications and services in a new business model. According to the needs of cloud-based services [4], we designed a new method of searching for travel information [5]. By bringing together smartphones, Global Positioning System (GPS), Google Maps, Augmented Reality (AR), and Worldwide interoperability for Microwave Access (WiMAX) networks [6], we provide mobile users with a whole new travel experience [7]. At present, the 5th Generation Wireless System (5G) is the world's leading technology, and its applications are gradually spreading in our daily lives. The development direction and prospects of 5G smart tourism are very optimistic. 
Integrating 5G technology [8] into the tourism field and combining it with AR, Artificial Intelligence (AI), social sharing, and other applications while simultaneously benefitting from its high speed, large bandwidth, and low-latency features will give rise to an intelligent upgrade of tourist attractions. Promoting the application and development of 5G smart tourism is an important way to boost tourism in all regions of the country and to facilitate the intelligent development of scenic spots. It is also expected to bring significant benefits to tourist attractions.
Tourism is one of the most prosperous economic activities in the world; in fact, it is one of the main components of economic growth in many countries [9]. Many countries in the world use tourism to revitalize their national economies, as tourism can greatly promote the development of various domains, such as the regional economy, as well as increase employment opportunities [10]. In addition, it can play an important role in promoting mutual understanding with other countries. The tourism experience is the core product of the tourism industry, directly affecting tourists' satisfaction [11]. However, traditional tourism approaches may not attract enough tourists to revitalize the economy. The World Tourism Organization (UNWTO) has introduced the concept of smart tourism and defined it as clean, green, ethical, and high-quality tourism [12]. The realization of smart tourism is an important and novel way to transform the tourism industry. Recently, the tourism industry has undergone a shift to a more user-friendly intelligent system based on "smart tourism" information technology [13]. Smart tourism is considered customer-centric and is designed to fully satisfy tourists' dietary, accommodation, travel, shopping, and entertainment needs. It basically means applying smart technology at all stages of travel to improve the visitor experience.
The purpose of smart tourism is to heighten the tourist experience and enhance the competitiveness of tourist destinations [14]. In addition, it promotes the sustainable development of the country. To upgrade the travel experience in a certain city, smart technology enables users to download smart applications on their phones. Such smart applications can enable tourists to take advantage of local cuisines, stories, customs, amongst other things, for a complete immersion in the local culture. Intelligent tourism systems should provide tourists with relevant and meaningful information based on big data analysis, personal information, behavior patterns, etc., to help them easily make decisions before or during a trip. Therefore, smart tourism should be able to meet the requirements of short-term economic needs and long-term sustainable development.
Based on the above reasons, we propose a method for predicting the behavior of smart travel users. This method is a K-Nearest Neighbor (KNN) user behavior prediction method based on human-guided machine learning. Compared with the traditional KNN algorithm, it can realize the analysis and response of fuzzy and uncertain problems by integrating human perception, cognitive capabilities, machine computing capabilities, and storage capabilities. Cross-validation experiments and performance evaluation methods show that the technology can classify the decision-making behaviors of travel users. It also has a good prediction ability. The classification results can quickly predict whether the travel user will choose the destination in the short term.

IoT Technology
The IoT, as an integrated part of the future development of the Internet, is defined as a dynamic global network infrastructure with self-configuring capabilities based on standards and interoperable communication protocols. The basic idea of the IoT concept is to address the ubiquity of various objects, such as Radio Frequency Identification (RFID) tags, sensors, actuators, mobile phones, etc., enabling them to interact with each other through unique addressing schemes to achieve common goals, such as the realization of intelligence identification, location tracking, surveillance, and management functions [15]. In the IoT, everything can be exchanged without human intervention. In fact, the IoT contains two meanings: first, the IoT is the Internet, and its extension, core, and foundation are still the Internet; second, the scope of the IoT has been extended to everything to achieve information exchange. The architecture of the IoT is shown in Figure 2. The IoT is regarded as a new generation of information and communication technology. It has three very important basic elements, namely, information collection, information transmission, and information processing; the most critical technology is information collection [16]. By collecting and processing information, we can achieve the real-time detection of various information. With the IoT, we can provide users with a variety of new services and benefits, as their infrastructure allows for a more effective use of information about the user's context, such as geographic location, weather conditions, availability and functionality of various surrounding objects, history records before use, feedback from other users, and so on [17]. This allows users to create a personalized service space, configure the environment, and start services to represent themselves, thereby providing maximum personal comfort to all users at any time and any place.

5G IoT for Smart Tourism
With the development of Internet-based information acquisition technology, the IoT will start the third wave of industrialization [18]. The development and application of the IoT is not only one of the important strategic measures to solve economic problems, but also a new trend of globalization, which will lead us into the new information age. The same holds true in its application in the travel and entertainment business. It can help improve the depth and breadth of information perception in each link and provide reliable support for the intelligentization of the tourism industry. The application of smart tourism is shown in Figure 3. People can easily access and use services intuitively. This means that comprehensive and well-designed services will play an important role in future cloud services. The scenic area management system is the most important link to realize tourism automation and high efficiency. The application of the IoT in tourist attractions not only conforms to the concept of "low carbon", but can also greatly improve management efficiency [19]. Potential applications of the IoT in tourism include wildlife monitoring and tracking, marine monitoring, bird and plant species surveillance, tourism information services, hotel services, tourism marketing, mountain climbing, and weather surveillance. The IoT can monitor and track wildlife in many ways, for instance via sensors implanted in animals, by transmitting data to a server for workers to access and track the animals' movements. The application of the IoT in smart tourism can implement functions such as tourism management, ticket management, passenger flow management, information collection, security monitoring, and environmental monitoring. Moreover, statistical analysis, statistical reporting, and integrated management are easier to implement for IoT systems. 
The combination of 5G, VR, AI, and other technologies allows tourists to experience the beauty and culture of their destination in an immersive manner regardless of the time and place. The early transmission of tourist attraction photos through 5G is also an advantage. Thus, users can enjoy the beautiful scenery without leaving home. Using 5G and voice interaction technology, intelligent robots can actively serve tourists, respond quickly to demand, and improve service efficiency in scenic spots. The intelligent hawk-eye system throughout the scenic area provides an analysis of the AI functions of edge computing through the 5G network, forming an invisible protection net in the scenic area, which can realize an unconscious service for tourists and a refined management. Users can also take photos and videos to generate corresponding interactive videos and high-definition video panorama travel notes, reproduce beautiful memories in three dimensions, and comprehensively record what they have seen. The scenic area can simultaneously generate a panoramic travel experience for each tourist and then enable him or her to immediately receive and share this experience upon leaving the scenic area.
The implementation of smart tourism requires the combination of the IoT with a variety of easy-to-use applications, relying on four core information and communication technologies, i.e., the IoT, mobile communications, cloud computing, and AI technologies, to maximize the use of the IoT for information communication [20]. Therefore, the IoT technology will become an important element in the smart tourism solution to meet the needs of the new generation of tourists.
For further analysis, we expound how smart tourism services can achieve essential functions by choosing a destination and searching for suitable travel arrangements. These services provide a personalized representation of a destination, with a focus on attracting users' attention to inspire them to learn more about the place and ultimately visit it. This is accomplished by providing users with useful and interesting details, such as information about the experiences of tourists on the go (photos, reviews, experiences, etc. that people show on social networks), to help them choose the best travel arrangements for them [21]. The destinations may also be divided according to age groups, so as to improve efficiency when users are selecting destinations.
Users can also share publicly transmitted data instantly from their smartphones [22]. Traffic information (whether it is a public transport company or a private transport company) can immediately improve travelers' schedules and provide them with up-to-date information. It can differentiate users or travelers at different levels and provide useful information that is easy to understand. For example, when a user utilizes a smartphone to log into our service platform to read interactive devices such as RFID, Quick Response (QR) codes, or AR pictures, our system can help filter most of the information and display what the user likes or what suits him or her. The intelligent tourism management system based on the IoT has completely changed the traditional tourism management method and improved efficiency [23].

Human-Guided Machine Learning Introduction
Intelligent machines have become close companions of human beings. The interaction and cooperation between humans and intelligent machines will be indispensable in the formation of our future society [24]. However, many problems faced by humans are often highly complex, uncertain, and open ended. As humans are the arbiters of the ultimate goal of intelligent machines, human intervention in machines has been consistent throughout the development of these systems [25]. In addition, even if sufficient data resources or unlimited data resources are provided for the AI system, the possibility of human intervention in the intelligent system cannot be ruled out. Many problems of the AI system need to be solved, such as, for example, how to understand the nuances and ambiguities of human language in the face of human-computer interaction systems and how to avoid the risks and even the possible harm caused by the limitations of AI. The human-guided machine learning system combines machine and human intelligence to overcome the shortcomings of existing AI systems [26].
As shown in Figure 4, in human-guided machine learning systems, human intelligence can be integrated into the AI systems to complement the machine functions throughout their life cycle. Human-guided machine learning systems can share computing tasks with humans as needed to overcome the shortcomings of the AI systems. Human participation can prevent errors and failures that can be caused by AI systems working alone, and human feedback can lead to a benign improvement cycle, allowing the system to continuously learn [27].
Human-guided machine learning systems can also easily solve problems that are not easy to classify or that could not be integrated in machine learning. For different fields, different human-guided machine learning systems should be constructed. Human-guided machine learning models integrate machine learning and human decision making. They make use of machine learning (supervised and unsupervised) to create a model from the training data or a small number of samples and use the model to predict new data [28]. In the hybrid learning framework, when the system is abnormal or the computer prediction error rate is large, it should be determined whether the prediction needs to be adjusted manually or if manual intervention is required. The system's knowledge base is then automatically updated. Human intervention in the algorithm can improve the accuracy and credibility of the system. Of course, human-guided machine learning requires minimal personnel involvement so that the computers can do most of the work. The intelligence of the hybrid learning model can greatly expand the scale and efficiency of tasks that humans can complete.
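The hybrid framework described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `hybrid_predict`, the confidence threshold, and the `ask_human` callback are hypothetical, and the paper's criterion ("the system is abnormal or the computer prediction error rate is large") is approximated here by a simple per-sample confidence check.

```python
def hybrid_predict(samples, model, ask_human, knowledge_base, threshold=0.5):
    """Route low-confidence machine predictions to a human reviewer.

    model(sample)     -> (label, confidence in [0, 1])   (hypothetical interface)
    ask_human(sample) -> label supplied by a human       (hypothetical interface)
    knowledge_base    -> list that collects human-corrected (sample, label) pairs
    """
    results = []
    for sample in samples:
        label, confidence = model(sample)
        if confidence < threshold:
            # The machine is unsure: defer to the human and record the decision
            # so that it can be reused when the model is retrained.
            label = ask_human(sample)
            knowledge_base.append((sample, label))
        results.append(label)
    return results


# Toy usage: a one-feature "model" whose confidence grows with distance from 0.5.
def toy_model(score):
    return (1 if score > 0.5 else 0), abs(score - 0.5) * 2


def toy_human(score):
    return 0  # the reviewer decides that borderline users will not travel
```

Only the borderline sample is passed to the reviewer, which matches the stated goal of keeping personnel involvement minimal while the computer does most of the work.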

Tourist Behavior Introduction
With the rising trends in travel, most travel websites provide the following services: travel information, traffic information, flight or hotel reservations, etc. A suitable travel time and a large number of historical orders can guide new users toward their final destination. By analyzing multiple behavior indicators of users, we expect to use these existing data to predict user behavior, so that tourists can make decisions based on historical order information or browsing information.

Use Human-Guided Machine Learning to Analyze Tourist Behavior
By analyzing the behavior of travel users, we can help users quickly determine their travel destinations. Human-guided machine learning is an effective tool for data analysis. By using AI, we can cluster and classify user behaviors and predict user decision behaviors.
However, for users with little personal information or few historical orders, AI methods alone will not achieve good results and may even increase the error of the entire system. Therefore, we use human-guided machine learning methods to analyze user behavior. For such special users, we can remove or tag them manually.
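As a rough illustration of how special users might be separated before automatic analysis, the sketch below flags users with sparse profiles or short order histories for manual review. The record layout (`profile`, `orders`) and both thresholds are assumptions made for this example; they are not taken from the Data Castle dataset.

```python
def tag_special_users(users, min_orders=2, min_profile_fields=2):
    """Split users into (regular, special).

    A user is 'special' when too few profile fields are filled in or when the
    order history is too short; such users are left for manual removal or
    tagging rather than being fed to the classifier directly.
    """
    regular, special = [], []
    for user in users:
        profile = user.get("profile", {})
        filled = sum(1 for value in profile.values() if value not in (None, ""))
        too_sparse = filled < min_profile_fields
        too_few_orders = len(user.get("orders", [])) < min_orders
        (special if too_sparse or too_few_orders else regular).append(user)
    return regular, special
```

The thresholds would need to be chosen by a human analyst, which is where the human guidance enters at the data-preparation stage.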

Experimental Analysis and Simulation
We used travel datasets published by the Data Castle (DC) contest to analyze user behavior. The dataset includes the information of more than 50,000 users in the travel app (some of them completed the order after browsing and some only viewed the historical order on the app), including personal information. Their millions of browsing records and corresponding historical order records also contain user reviews of historical orders.
By analyzing the data, we learned that, for most users, their choice of a tourist destination has a few regularities. For example, among the users distributed in 31 provinces in mainland China, about 61.78% come from Beijing, Shanghai, and Guangzhou. The destinations of these tourists are distributed in 51 countries worldwide, with about 54.64% of them choosing Asia. In order to predict a user's decision-making behavior, we mainly consider the following aspects when selecting variables as the user's decision-making object:

1. The user browses travel information in order to determine whether to travel to a certain tourist destination.
2. Judging the practical value of a desired tourist destination based on information such as user historical orders.
Combining the above points, we will train and predict whether users will travel to tourist destination A within a short period of time, which greatly saves users' preparation time in the early stages of travel.

Dataset Labeling Method
We need to label the travel user's choice of destination. The basis for labeling is the positioning technology for users. Positioning technology is mainly divided into two categories: GPS and Location-Based Service (LBS) [29]. In the GPS-based method, the Mobile Station (MS) receives and measures signal parameters from at least four of the 24 GPS satellites in the current network. Therefore, the GPS system has a relatively high accuracy and can provide very accurate location information about the user. However, in practical applications, embedding a GPS receiver in a mobile device will cause a large increase in cost, battery consumption, etc., and to a certain extent, expose the personal information of the user's geographic location.
The LBS technology obtains the location information of mobile terminal users through wireless communication networks of telecommunications and mobile operators [30]. Relying on Base Stations (BS) to collect user data, this technology can pinpoint the user's location accurately under the condition that there is no dead zone in the mobile phone network coverage area. It will not consume the power of the user's mobile terminal equipment, and there is no requirement for the hardware configuration of the mobile terminal equipment. Compared with GPS, this method is more feasible. Therefore, the positioning method used in this paper is the LBS BS positioning technology. The method is used to measure the signal parameters received by the mobile user on the network BS. Using this technique, the BS measures the signal sent from the MS and relays it to the central site for further processing and data fusion to provide an estimate of the MS's position [31,32].
The data fusion step combines measurement data from different BSs to obtain an estimate of the MS position. As shown in Figure 5, let $(x_m, y_m)$ represent the MS position coordinates in the Cartesian coordinate system. Let the coordinates of the BSs ($BS_1$, $BS_2$, and $BS_3$) be $(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$. For simplicity, only the x and y coordinates are considered in the derivation. The z coordinates are ignored. This corresponds to the case where BSs and mobile users are located on a relatively flat plane. Without loss of generality, the origin of the Cartesian coordinate system is set to $BS_1$, that is, $(x_1, y_1) = (0, 0)$. The most common signal parameters are the arrival time, angle, and amplitude of the MS signal. Here, we use the Time Difference of Arrival (TDOA) method to determine the approximate location of the user. For example, in Figure 5, the MS m detects that its coordinate position is within the coordinate range of the travel destination through the BS. It is considered to have reached the destination within a short time and can be marked as "1". However, if the MS m detects that its coordinates are not within the coordinates of the attractions, it is considered not to have reached the destination within a short time and can be marked as "0".
Figure 5. TDOA data fusion using multiple BSs.

The TDOA data fusion method estimates the MS position by combining the arrival times of the MS signal at three different BSs [33]. Since the wireless signal travels at the speed of light ($c = 3 \times 10^8$ m/s), the distance between the MS and $BS_i$ is:

$$r_i = (t_i - t_o)\,c \tag{1}$$

where $t_o$ is the time at which the MS starts transmission, and $t_i$ is the time at which the MS signal arrives at $BS_i$. The position $(x_m, y_m)$ can be estimated by solving the following set of equations in $r_1$, $r_2$, and $r_3$:

$$r_1^2 = x_m^2 + y_m^2 \tag{2}$$

$$r_2^2 = (x_2 - x_m)^2 + (y_2 - y_m)^2 \tag{3}$$

$$r_3^2 = (x_3 - x_m)^2 + (y_3 - y_m)^2 \tag{4}$$

Without loss of generality, we can assume that $r_1 < r_2 < r_3$. Next, we define the difference in distance between the MS and the different BSs as:

$$r_{i1} = r_i - r_1 = (t_i - t_1)\,c, \quad i = 2, 3 \tag{5}$$

Equation (3) can be rewritten in terms of the TDOA measurement value $r_{21}$ as:

$$(r_{21} + r_1)^2 = K_2^2 - 2 x_2 x_m - 2 y_2 y_m + r_1^2 \tag{6}$$

where:

$$K_i^2 = x_i^2 + y_i^2, \quad i = 2, 3 \tag{7}$$

Expanding and rearranging the terms yields:

$$-x_2 x_m - y_2 y_m = r_{21} r_1 + \tfrac{1}{2}\left(r_{21}^2 - K_2^2\right) \tag{8}$$

Similarly, Equation (4) results in:

$$-x_3 x_m - y_3 y_m = r_{31} r_1 + \tfrac{1}{2}\left(r_{31}^2 - K_3^2\right) \tag{9}$$

Rewriting these equations in matrix form gives:

$$\mathbf{H}\mathbf{x} = r_1 \mathbf{c} + \mathbf{d} \tag{10}$$

where (the vector $\mathbf{c}$ is distinct from the speed of light $c$):

$$\mathbf{H} = \begin{bmatrix} x_2 & y_2 \\ x_3 & y_3 \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} x_m \\ y_m \end{bmatrix}, \quad \mathbf{c} = \begin{bmatrix} -r_{21} \\ -r_{31} \end{bmatrix}, \quad \mathbf{d} = \frac{1}{2}\begin{bmatrix} K_2^2 - r_{21}^2 \\ K_3^2 - r_{31}^2 \end{bmatrix} \tag{11}$$

Equation (10) can be used to solve for $\mathbf{x}$ in terms of the unknown $r_1$:

$$\mathbf{x} = r_1 \mathbf{H}^{-1}\mathbf{c} + \mathbf{H}^{-1}\mathbf{d} \tag{12}$$

Substituting this intermediate result into Equation (2), we obtain a quadratic equation in $r_1$. Solving for $r_1$ and substituting the positive root into Equation (12) gives the final solution for $\mathbf{x}$.

If more than three BSs are involved in locating the MS, Equation (10) still holds, with one row per additional BS (up to $BS_n$):

$$\mathbf{H} = \begin{bmatrix} x_2 & y_2 \\ x_3 & y_3 \\ \vdots & \vdots \\ x_n & y_n \end{bmatrix}, \quad \mathbf{c} = \begin{bmatrix} -r_{21} \\ -r_{31} \\ \vdots \\ -r_{n1} \end{bmatrix}, \quad \mathbf{d} = \frac{1}{2}\begin{bmatrix} K_2^2 - r_{21}^2 \\ K_3^2 - r_{31}^2 \\ \vdots \\ K_n^2 - r_{n1}^2 \end{bmatrix} \tag{13}$$

which yields the following least-squares intermediate solution:

$$\hat{\mathbf{x}} = \left(\mathbf{H}^{T}\mathbf{H}\right)^{-1}\mathbf{H}^{T}\left(r_1 \mathbf{c} + \mathbf{d}\right) \tag{14}$$

Combining this intermediate result with Equation (2) again gives the final estimate of $\mathbf{x}$. If the second-order statistics of the TDOA measurement errors are known, a more accurate solution can be obtained, as in [34].
According to the calculation results, if the coordinates of the user's MS are within the range of the attractions, it proves that the user selects a certain travel destination in the short term, and it is marked as "1"; otherwise, it is marked as "0".
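The three-BS case can be illustrated with a short Python sketch. It is not the authors' implementation: it assumes BS 1 sits at the origin, takes the range differences r21 and r31 as inputs, writes the two linearized equations as the 2x2 system H x = r1 c + d, and then resolves r1 from the quadratic obtained with r1^2 = x_m^2 + y_m^2. The function names, the rectangular attraction region, and the rule of taking the smaller root when two positive roots exist are assumptions made for illustration.

```python
import math

SPEED_OF_LIGHT = 3.0e8  # m/s, the value of c used in the text


def range_difference(t_i, t_1):
    """Convert a time difference of arrival into a range difference r_i1."""
    return (t_i - t_1) * SPEED_OF_LIGHT


def solve_2x2(h, v):
    """Solve the 2x2 linear system h * u = v with Cramer's rule."""
    (a11, a12), (a21, a22) = h
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("BS 2 and BS 3 are collinear with BS 1; H is singular")
    return ((v[0] * a22 - a12 * v[1]) / det, (a11 * v[1] - a21 * v[0]) / det)


def tdoa_locate(bs2, bs3, r21, r31):
    """Estimate (x_m, y_m) from two range differences, with BS 1 at (0, 0)."""
    (x2, y2), (x3, y3) = bs2, bs3
    h = ((x2, y2), (x3, y3))
    c_vec = (-r21, -r31)
    d_vec = (0.5 * (x2 * x2 + y2 * y2 - r21 * r21),
             0.5 * (x3 * x3 + y3 * y3 - r31 * r31))
    a = solve_2x2(h, c_vec)  # H^-1 c
    b = solve_2x2(h, d_vec)  # H^-1 d
    # x = r1 * a + b substituted into r1^2 = x_m^2 + y_m^2 gives
    # (a.a - 1) r1^2 + 2 (a.b) r1 + b.b = 0
    qa = a[0] ** 2 + a[1] ** 2 - 1.0
    qb = 2.0 * (a[0] * b[0] + a[1] * b[1])
    qc = b[0] ** 2 + b[1] ** 2
    if abs(qa) < 1e-12:
        roots = [-qc / qb] if qb != 0 else []
    else:
        disc = qb * qb - 4.0 * qa * qc
        if disc < 0:
            raise ValueError("no real solution for r1")
        root = math.sqrt(disc)
        roots = [(-qb + root) / (2.0 * qa), (-qb - root) / (2.0 * qa)]
    positive = [r for r in roots if r > 0]
    if not positive:
        raise ValueError("no positive root for r1")
    r1 = min(positive)  # assumption: arbitrary choice if two positive roots exist
    return (r1 * a[0] + b[0], r1 * a[1] + b[1])


def label_visit(position, region):
    """Return 1 if position lies inside (x_min, y_min, x_max, y_max), else 0."""
    x, y = position
    x_min, y_min, x_max, y_max = region
    return 1 if x_min <= x <= x_max and y_min <= y <= y_max else 0


# Synthetic check: build noise-free range differences from a known position.
bs2, bs3 = (1000.0, 0.0), (0.0, 1000.0)
true_ms = (300.0, 400.0)
r1_true = math.hypot(*true_ms)
r21 = math.hypot(true_ms[0] - bs2[0], true_ms[1] - bs2[1]) - r1_true
r31 = math.hypot(true_ms[0] - bs3[0], true_ms[1] - bs3[1]) - r1_true
estimate = tdoa_locate(bs2, bs3, r21, r31)
print(estimate, label_visit(estimate, (250.0, 350.0, 350.0, 450.0)))
```

With real measurements the range differences are noisy, so the recovered position is only approximate, and an attraction would more plausibly be described by a polygon than by the rectangle used here.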

KNN Classification for Tourist Behavior
In this simulation, the dataset we used contained information from more than 50,000 users. We first went through the dataset and selected the records that would give more meaningful simulation results. We used the TDOA method to determine the approximate location of each user and to detect whether the coordinate position was within the coordinate range of the travel destination. After comprehensively considering several classification algorithms in machine learning, as shown in Table 1, we found that the classification approach in this manuscript is basically consistent with the idea of the KNN algorithm. Compared with the other algorithms, KNN is simple and easy to use, trains quickly, predicts well, and is not sensitive to outliers, so KNN was chosen for the simulation in this study.

Table 1. Advantages and disadvantages of each algorithm.

| Algorithm | Advantages | Disadvantages |
|---|---|---|
| Decision Tree | There is no need to generalize the data like for other algorithms, such as removing redundant or blank attributes. | (1) Difficult to use in dealing with missing data. (2) Leads to the emergence of overfitting problems. (3) Ignores the correlation between attributes in the dataset. |
| Bayes | The estimated parameters required are few, and the algorithm is simple. | (1) The prior probability needs to be known. (2) An error rate in the classification decision. |
| Support Vector Machine (SVM) | It is advantageous when the sample data capacity is small. | (1) No universal solution to nonlinear problems. (2) Lack of sensitivity to data. |
| KNN | It is simple, effective, and suitable for the automatic classification of class domains with relatively large sample sizes. | When the samples are unbalanced, the prediction accuracy of rare categories is low. |

Next, we use the KNN method to classify user behavior. KNN is a supervised learning method that classifies new data by measuring the distance between the new data point and the labeled data points [35]. It is a distance-based algorithm that determines the category of new data from the categories of its nearest neighbors [36]. The Euclidean distance is usually used to measure the distance between data points. The K training samples closest to the new data point are then selected, and the category that occurs most often among them is assigned to the new data point. There is no fixed rule for choosing the value of K. Generally, a smaller value is selected based on the distribution of the samples, and a suitable K value can be selected through cross-validation.
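The voting rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the two-dimensional feature vectors and the toy labels are made up, and ties in the vote are broken in favor of the label seen first among the nearest neighbors.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k):
    """Assign to `query` the majority label among its k nearest training points."""
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of training points")
    # Sort the labeled samples by Euclidean distance to the query point.
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    # Count the labels of the k closest samples and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): label 1 = reached the destination, 0 = did not.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
labels = [0, 0, 0, 1, 1, 1]
near_origin = knn_classify(points, labels, (0.5, 0.5), k=3)
near_cluster = knn_classify(points, labels, (5.5, 5.5), k=3)
print(near_origin, near_cluster)
```

Because every prediction sorts the whole training set, this brute-force form costs O(n log n) per query; for a dataset of tens of thousands of records an index structure or a library implementation would normally be preferred.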

Human-Guided Machine Learning-Based KNN Classification for Tourist Behavior
The traditional KNN algorithm does not involve the participation of AI, and there are inaccuracies when processing the edge information of some more complex data. Introducing human intelligence into the loop of intelligent systems can achieve a close coupling between advanced cognitive mechanisms for analysis and response in fuzzy and uncertain problems and machine intelligent systems. Therefore, the two adapt to and cooperate with each other, forming a two-way information exchange and control. Such "1 + 1 > 2" hybrid enhanced intelligence can be achieved by integrating human perception, cognitive ability, machine computing ability, and storage ability. Next, we apply the KNN algorithm, combined with human-guided machine learning, to the classification task for users.
In order to observe the classification error rate of KNN, we define the training error rate as the proportion of training samples whose KNN-predicted label differs from the input label. The appropriate K is then the value that minimizes this error rate or, equivalently, maximizes the proportion of correctly labeled samples in the training set. We use grid search and cross-validation to choose the appropriate K value and to verify the accuracy of the classification [37]. First, we screened the data. After screening, there were 40,307 records in total. The data were divided into six groups. The first five groups, each with 6717 records for a total of 33,585 records, were used for cross-validation, while the sixth group, with 6722 records, was used for the final accuracy test.
In the cross-validation, we used four groups of data as the training set and the remaining group as the test set. Figure 6 shows the accuracy on the training set and the test set for different numbers of neighbors; when K = 8, the training error rate is minimal. Finally, the remaining 6722 records were used as the final verification set, on which the accuracy was 84.06%.
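The selection procedure can be sketched as follows: hold out one group, run five-fold cross-validation over a grid of candidate K values on the rest, and score the chosen K on the held-out group. This is an illustrative sketch under stated assumptions, not the authors' code: the data are synthetic Gaussian clusters, the candidate range 1 to 15 is arbitrary, and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def knn_predict(train, query, k):
    """Majority label among the k training samples nearest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, test, k):
    """Fraction of test samples whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


def cross_val_accuracy(samples, k, n_folds=5):
    """Mean accuracy over n_folds splits: train on n_folds - 1 groups, test on one."""
    folds = [samples[i::n_folds] for i in range(n_folds)]
    scores = []
    for i, test in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        scores.append(accuracy(train, test, k))
    return sum(scores) / n_folds


def grid_search_k(samples, candidates, n_folds=5):
    """Evaluate every candidate K and return the best one with all scores."""
    scores = {k: cross_val_accuracy(samples, k, n_folds) for k in candidates}
    best_k = max(scores, key=scores.get)  # first maximum wins on ties
    return best_k, scores


# Synthetic two-class data (assumption): two well-separated Gaussian clusters.
rng = random.Random(0)
data = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(60)]
data += [((rng.gauss(6, 1), rng.gauss(6, 1)), 1) for _ in range(60)]
rng.shuffle(data)

cv_part, holdout = data[:100], data[100:]  # the last group is kept for the final test
best_k, cv_scores = grid_search_k(cv_part, candidates=range(1, 16))
final_accuracy = accuracy(cv_part, holdout, best_k)
print(best_k, final_accuracy)
```

Keeping the final group out of the grid search matters: the accuracy reported on it is then not biased by the choice of K.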
In the field of machine learning, a confusion matrix is a specific table layout used to visualize the performance of an algorithm [38]. In predictive analysis, the confusion matrix is a table consisting of False Positives (FP), False Negatives (FN), True Positives (TP), and True Negatives (TN). By using a confusion matrix, it is easy to see whether the learning machine confuses two similar classes [39]. The confusion matrix of the classification results in this experiment is shown in Figure 7.
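For a binary task such as the "1"/"0" destination labels, the four cells can be counted directly from the true and predicted labels. The sketch below is only an illustration with made-up labels; it does not reproduce the counts in Figure 7.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, and TN for a binary classification result."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            counts["TP"] += 1
        elif pred == positive:
            counts["FP"] += 1
        elif truth == positive:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


def accuracy_from_counts(counts):
    """Accuracy = (TP + TN) / total number of samples."""
    total = sum(counts.values())
    return (counts["TP"] + counts["TN"]) / total


# Made-up labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
cm = confusion_counts(y_true, y_pred)
print(cm, accuracy_from_counts(cm))
```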
In order to comprehensively observe the generalization performance of the model, the Receiver Operating Characteristic (ROC) curve can be drawn to show how well the model distinguishes between the classes [40]. The ROC curve is particularly useful when the class distribution is not uniform. The Area Under Curve (AUC) is the area under the ROC curve, which is an indicator used to judge the performance of the learning machine [41]. The AUC-ROC curve is a performance measure for classification problems under various threshold settings. ROC is a probability curve, and AUC represents the degree or measure of separability, that is, the distinguishing ability of a model. The higher the AUC, the better the model distinguishes between 0 and 1, and therefore the more reliably it predicts the user's decision.
As shown in Figure 8a, the ROC curve plots the True Positive Rate (TPR) on the y-axis against the False Positive Rate (FPR) on the x-axis. A, B, and C denote the AUCs of three classification results obtained with different parameters. Comparing the three, we can see that AUC-A has the best performance.
The ROC curve of the classification model we established is shown in Figure 8b. As shown in the figure, the curve is close to a semi-ellipse and the AUC = 0.809, indicating that the classification effect of our model is good. Therefore, using this model, we can classify whether travel users will choose destination A or not. Based on these classification results, corresponding services are provided to users.
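How an ROC curve and its AUC are obtained from a set of scores can be sketched as below. The sketch assumes each sample has a score (for KNN this could be the fraction of the K neighbors labeled "1", which is an assumption and not stated in the paper), sweeps the decision threshold from high to low, and integrates the resulting curve with the trapezoidal rule. The labels and scores are made up and unrelated to the reported AUC of 0.809.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by lowering the threshold step by step."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with equal scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve via the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Made-up labels and scores for illustration only.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(y_true, scores)
area = auc(curve)
print(curve, area)
```

The AUC computed this way equals the fraction of positive-negative pairs in which the positive sample receives the higher score, which is why a value near 1 indicates good separation and 0.5 corresponds to random guessing.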

Conclusions
With the continuous development of the tourism industry, integrating the IoT into tourism projects is a very promising approach. Smart tourism uses the IoT to maximize information analysis and information integration, as well as to achieve a fast and convenient information exchange between users. In other words, the IoT technology will become an important element in meeting the needs of the new generation of tourists. Based on this, we propose a tourist behavior decision-making method: a KNN classification based on human-guided machine learning. This classification method can effectively help tourists decide whether to choose a certain travel destination based on historical order data and historical browsing information. The results obtained from the cross-validation experiments and performance evaluation demonstrate the effectiveness of the method.
Author Contributions: R.P. contributed the central idea, analyzed most of the data, and wrote the initial draft of the paper. Y.L. contributed the idea for the simulation experiment and provided constructive suggestions. M.K. contributed in the revision of the paper and polishing of the language. All authors discussed the results and revised the manuscript. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.