Using the TensorFlow Deep Neural Network to Classify Mainland China Visitor Behaviours in Hong Kong from Check-in Data

Abstract: Over the past decade, big data, including Global Positioning System (GPS) data, mobile phone tracking data and social media check-in data, have been widely used to analyse human movements and behaviours. Tourism management researchers have noted the potential of applying these data to study tourist behaviours, and many studies have shown that social media check-in data can provide new opportunities for extracting tourism activities and tourist behaviours. However, traditional methods may not be suitable for extracting comprehensive tourist behaviours due to the complexity and diversity of human behaviours. Studies have shown that deep neural networks have outpaced the abilities of human beings in many fields and that deep neural networks can be explained in a psychological manner. Thus, deep neural network methods can potentially be used to understand human behaviours. In this paper, a deep learning neural network constructed in TensorFlow is applied to classify Mainland China visitor behaviours in Hong Kong, and the characteristics of these visitors are analysed to verify the classification results. For the social science classification problem investigated in this study, the deep neural network classifier in TensorFlow provides better accuracy and more lucid visualisation than do traditional neural network methods, even for erratic classification rules. Furthermore, the results of this study reveal that TensorFlow has considerable potential for application in the human geography field.


Introduction
ISPRS Int. J. Geo-Inf. 2018, 7, 158
In recent years, considerable research has focused on human mobility and travel behaviours using big data. These big data include Global Positioning System (GPS) data [1,2], mobile phone tracking data [3,4], and social media check-in data [5,6]. These data have been widely used to determine transportation patterns [3,4], urban daily commuting behaviours [7] and even dynamic movement trajectories in combined spatiotemporal analyses [8,9]. Specifically, studies of tourism management have explored the potential of applying big data to assess tourist behaviours. In 2015, the world tourism industry generated revenue of $1.5 trillion, with 1.2 billion international arrivals [10]. Therefore, such studies of tourist management are essential to the tourism industry, which plays an important role in economic development in many countries and regions, particularly in popular tourist destinations. In recent years, tourism research has focused more on tourists than on tourism resources, particularly tourist movements and behaviours. However, human behaviours are complex; they may be prompted by intentions or habits; modified by skill, affect and attitude; and affected by physical and contextual conditions [11]. Many methods either are based on certain assumptions to simulate human behaviours or are unable to consider all the factors that influence human behaviour [12].
In this context, deep learning methods may provide state-of-the-art solutions to comprehensively understand human behaviours. Since Google's artificial intelligence programme AlphaGo became the first computer programme to beat a 9-dan professional human player without a handicap in March 2016 [13], people have paid increasing attention to deep learning and artificial intelligence and noted the powerful potential of their applications; moreover, deep neural networks have rapidly outpaced human beings' understanding of the nature of their solutions [14]. Specifically, Google DeepMind developed AlphaGo in 2015. In April 2016, DeepMind started using TensorFlow for future research and eventually moved completely to TensorFlow. TensorFlow is an open-source deep learning library developed by Google Inc. Since the library was open sourced in November 2015, TensorFlow has excelled in image processing [15], including handwritten digit recognition [16,17], visual object recognition and detection from images [18], and even dynamic object tracking from video [19]. In addition, the tool can be used in voice recognition, natural language processing [20], etc. A recent DeepMind study showed that deep neural networks can be successfully explained in a psychological manner [14], suggesting that deep neural networks can potentially understand and extract human behaviours in a more interpretable way. Many studies have already applied deep learning methods in various areas of human behaviour research, such as human action recognition [21] and human trajectory prediction [22]. Still, few attempts have been made to understand tourist behaviours. In this paper, we explore the possibility of applying the TensorFlow deep neural network to tourism geography to classify tourist behaviours, and we implement a deep learning method constructed using TensorFlow to classify the behaviours of check-in users based on neural network theory.
The remainder of this paper is composed of five parts. Section 2 briefly reviews existing studies about tourist research involving social media data and human behaviour research involving deep learning methods. Section 3 illustrates the research methodology. In this section, we provide a brief introduction to the theory underlying TensorFlow and neural networks and illustrate the data processing flow, which includes preprocessing and classification steps. Section 4 introduces the study area, explains the data sources and discusses the concrete data preprocessing steps. Section 5 presents the classification results and resulting analyses. Notably, we compare the accuracy and other metrics of the proposed method with those of other traditional neural networks. Moreover, we determine the proportion of each classification result and analyse the characteristics of visitors. Finally, Section 6 concludes the paper, discusses the strengths and limitations of the study, and offers future research directions.

User-Generated Big Data for Tourist Research
Traditional demographic, survey, and opinion poll data [23,24,25] have been used to assess tourist behaviour patterns. Additionally, these data have been combined with traditional multivariate statistical methods, such as logistic regression analysis [23] or principal component analysis [24]. However, these data require sampling, are limited in extent and are difficult to collect and update; therefore, it is difficult to comprehensively capture up-to-date tourist behaviours [26]. These limitations have been largely overcome with the emergence of big data. Big data sources provide dynamic and up-to-date data for studies of tourist behaviour and provide better insight into tourism preferences and tourism resource management than other sources do. Before 2010, Lau and McKercher attempted to use geographic information systems (GIS) to explore tourist movement patterns [27]. Later, they used GPS recorders to produce highly accurate and fine-grained trajectory data and GIS analysis to identify 78 discrete movement patterns [28]. Leung et al. collected trip diaries from six different websites and used content and social network analyses to analyse and map overseas tourist patterns in Beijing during the Olympics [29]. Many studies have used geotagged photographs, such as those from Flickr, to mine the characteristics of tourist behaviours [26,30]. Uncovering tourist behaviours can contribute to tourist attraction prediction [2,31,32]. In addition, tourist behaviours can be combined with the morphological structures of tourist attractions using space syntax analysis to manage and protect tourism resources [33]. Specifically, identifying and classifying tourist behaviours can help tourism managers understand different tourist preferences and contribute to personalised tourist attraction recommendations. In the early 1980s, Plog et al. summarised eight tourist characteristics according to all existing typologies [34]. McKercher and his colleagues compared the behavioural patterns of first-time and repeat visitors to Hong Kong and found that the visitors adopted different travel patterns [1]. Padhi et al. demonstrated that there are three primary types of tourists: those who travel for business purposes, those who travel for leisure and those who travel to academic conferences [31]. Bianchi et al. classified travellers in Chile into short-haul travellers and long-haul travellers to investigate their respective intentions [35].
The results of the aforementioned studies suggest that over the past decade, big data have become more widely used than traditional data in tourist behaviour research, and an increasing number of big data processing methods have arisen. However, human behaviours are complex and diverse, and traditional models may struggle to learn and express patterns of human behaviour. Thus, deep learning methods, which have developed rapidly and succeeded in many fields in recent years, may provide a novel way to learn human behaviours.

Deep Learning Methods for Human Behaviours
Deep learning methods have recently become popular in solving supervised learning problems in many fields, such as image processing [36,37], speech recognition [38,39] and natural language processing [40]. Specifically, an increasing number of researchers have attempted to apply deep learning methods to the study of human behaviour. The reason is that human behaviour is complex; indeed, human behaviour may be prompted by intentions or habits; modified by skill, affect and attitude; and affected by physical and contextual conditions [11]. Hartford et al. built a deep neural network model to predict human participants' behaviour in strategic settings, as most existing approaches either assumed participants to be perfectly rational or attempted to directly model each participant's cognitive processes based on insights from cognitive psychology and experimental economics [12]. In the field of human action recognition, Baccouche et al. proposed a fully automated model incorporating convolutional neural networks and recurrent neural networks to classify human actions [21]. Fei-Fei Li and her team proposed a prediction model for human trajectories in crowded spaces named "Social LSTM", which stands in contrast to traditional methods using hand-crafted functions. The difficulty of trajectory prediction in a crowded space is that each individual trajectory must be considered together with the interactions among people, which traditional methods may fail to consider thoroughly [22]. Yao et al. predicted next locations in trajectories on a larger temporal and spatial scale with Twitter data and obtained satisfactory accuracy [41]. Compared with other traditional methods, such as hidden Markov models, the proposed method's higher performance may be due to the ability of long short-term memory (LSTM) networks to make full use of contextual locations rather than merely relying on the last several locations. In general, deep learning methods have demonstrated much recent success in a variety of human behaviour research fields. Nevertheless, few studies have focused on tourist behaviour, which is of great importance to tourist management and economic development. The aforementioned studies motivate us to apply deep learning methods to tourism research.

TensorFlow and Neural Networks
In this paper, TensorFlow is mainly used to classify tourist behaviours. TensorFlow is an open-source machine learning library created by the Google Brain Team's researchers and engineers. The library was originally developed for machine learning and deep neural network research and was open sourced in November 2015 by Google [42]. TensorFlow uses data flow graphs to represent all computational operations and data in the machine learning algorithm. In TensorFlow, nodes in the graph represent mathematical operations as well as the endpoints where data are fed in or results are output. Edges represent multidimensional data arrays (tensors) that communicate between nodes [43]. These tensors flow through the nodes and ultimately complete the machine learning process. TensorFlow also provides a convenient visualisation tool called TensorBoard to easily display images of computational graphs.
Most algorithms in TensorFlow are based on neural networks. Neural networks, also called artificial neural networks or connectionist systems, were originally inspired by the biological neural networks that constitute animal brains. A neural network is a massively parallel distributed processor composed of simple processing units that has a natural propensity for storing experiential knowledge and making it available for use [44]. The basic elements of a neural network include a neuron, a set of synapses, an adder and an activation function. A neuron is an information-processing unit that is fundamental to the operation of a neural network. Each connection (synapse) between neurons is characterised by a weight or strength of its own and can transmit a signal to another neuron. An adder is used to sum the input signals, which are weighted by the respective synaptic strengths of the neurons. An activation function is used to limit the amplitude of the output of a neuron (Figure 1). We can describe the neuron k in Figure 1 mathematically.
Neurons are typically organised in layers. Typically, there is an input layer, an output layer and one or more hidden layers. The input layer represents the first layer through which input signals travel before entering the network. The output layer is the final layer, and it outputs the result of the entire network. Hidden layers lie between the input and output layers. The more hidden layers that are present, the deeper the neural network architecture is.
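The neuron described in this section can be written compactly. The following is a sketch in a common textbook notation rather than the notation of Figure 1: the symbols $x_j$ (input signals), $w_{kj}$ (synaptic weights of neuron $k$), $u_k$ (adder output), $b_k$ (a bias term, which the text above does not mention), $\varphi$ (activation function) and $y_k$ (neuron output) are all assumed here for illustration.

```latex
u_k = \sum_{j=1}^{m} w_{kj}\, x_j, \qquad y_k = \varphi\left(u_k + b_k\right)
```

Under these assumptions, the adder produces $u_k$ from the $m$ weighted inputs, and the activation function $\varphi$ limits the amplitude of the output $y_k$.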

Specifically, tf.contrib.learn, one of the multiple application programming interfaces (APIs) in TensorFlow, enables users to easily increase the number of hidden layers and other parameters and rapidly build a model without massive code duplication, making it easy to configure, train, and evaluate a variety of machine learning models [45]. In this paper, DNNClassifier (Deep Neural Network Classifier), a well-encapsulated and easy-to-use classifier model of the tf.contrib.learn API based on a deep neural network, is mainly used to classify user behaviours.
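To make the role of the hidden-layer configuration concrete, the following is a minimal pure-Python sketch of a forward pass through fully connected layers. It is not the tf.contrib.learn implementation: the function names `build_network` and `forward`, the random weight initialisation, the ReLU hidden activation, the softmax output and the layer sizes are all assumptions chosen for illustration.

```python
import math
import random


def build_network(n_inputs, hidden_units, n_classes, seed=0):
    """Create randomly initialised (weights, biases) pairs, one per layer.

    hidden_units is a list such as [8, 8]; a longer list gives a deeper network.
    """
    rng = random.Random(seed)
    sizes = [n_inputs] + list(hidden_units) + [n_classes]
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """Propagate one input vector through the network and return class probabilities."""
    activations = list(x)
    for index, (weights, biases) in enumerate(layers):
        # Adder: weighted sum of the incoming signals plus a bias for each neuron.
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        if index < len(layers) - 1:
            # Hidden layer: ReLU activation (an assumed choice).
            activations = [max(0.0, s) for s in sums]
        else:
            # Output layer: softmax turns the sums into probabilities.
            peak = max(sums)
            exps = [math.exp(s - peak) for s in sums]
            total = sum(exps)
            activations = [e / total for e in exps]
    return activations


# Hypothetical sizes: 4 input features, two hidden layers of 8 units, 3 classes.
network = build_network(n_inputs=4, hidden_units=[8, 8], n_classes=3)
probabilities = forward(network, [0.5, 0.25, 0.25, 0.0])
```

Changing `hidden_units` is the only step needed to make this toy network deeper or wider, which mirrors the kind of convenience attributed to the high-level API above; training is deliberately left out of the sketch.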

Data Processing
Social media check-in data are used in this study. Before inputting training data to TensorFlow and classifying tourist behaviours, some preliminary work is required to process the check-in data. Suppose there are $k$ types of points of interest (POIs) that are reclassified according to visitor classification requirements. The dataset of users is $U = \{u_1, u_2, \ldots, u_i, \ldots, u_n\}$, and the dataset of POIs is $P = \{p_1, p_2, \ldots, p_k\}$. The total number of check-in records for a user $u_i$ is $sum_i$, and the number of check-in records that are created for each type of POI is represented by $P_i = \{p_{i1}, p_{i2}, \ldots, p_{ik}\}$. Thus, the frequency of each type of POI for user $u_i$ is the number of check-ins at that type divided by the user's total, that is, $p_{ij}/sum_i$ for $j = 1, \ldots, k$. Therefore, each user is represented by the set of these $k$ check-in frequencies.
According to prior research, the check-in behaviours of users are generally assessed based on the types of POIs where user check-ins occur [5]. Consequently, we can classify a user's behaviour according to the largest check-in frequency and the combination of the frequencies of all types of POIs. Then, we categorise $m$ types of visitors and establish corresponding classification rules according to the status quo. Because visitor behaviours are classified mainly based on the dominant categories and the combinations of check-in POIs, the visitor types may not be strictly mutually exclusive (i.e., some visitors may be associated with the features of more than one classification and may be difficult to distinguish) given the diversity and complexity of human activities. Such complications increase the difficulty of deep neural network classification using TensorFlow.
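The frequency computation described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `poi_frequencies` and `dominant_type`, the three POI type labels and the sample check-ins are hypothetical, and the dominant-type lookup covers only the "largest check-in frequency" part of the classification, not the full rule set.

```python
from collections import Counter


def poi_frequencies(checkins, poi_types):
    """Return the share of a user's check-ins that falls into each POI type.

    checkins is a list with one POI type label per check-in record.
    """
    unknown = set(checkins) - set(poi_types)
    if unknown:
        raise ValueError("unknown POI types: %s" % sorted(unknown))
    total = len(checkins)
    if total == 0:
        raise ValueError("a user needs at least one check-in")
    counts = Counter(checkins)
    return [counts[poi_type] / total for poi_type in poi_types]


def dominant_type(frequencies, poi_types):
    """Return the POI type with the largest check-in frequency."""
    best = max(range(len(poi_types)), key=lambda j: frequencies[j])
    return poi_types[best]


# Hypothetical example: three POI types and one user with four check-ins.
example_types = ["retail", "catering", "hotels"]
example_checkins = ["retail", "retail", "catering", "hotels"]
example_frequencies = poi_frequencies(example_checkins, example_types)
example_dominant = dominant_type(example_frequencies, example_types)
```

For this user, half of the check-ins are at retail POIs, so retail is the dominant type; the frequency vector as a whole is what a classifier would receive as input features.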
Because the TensorFlow DNNClassifier is a supervised neural network, a manually classified training dataset is needed before training. Additionally, a test dataset is required to validate the accuracy of the neural network. Therefore, after establishing the classification rules, we must classify a portion of the users according to the established rules. Generally, the ratio of the training dataset to the test dataset is 80:20. We can then construct a neural network classifier with tf.contrib.learn.DNNClassifier. Next, the parameters of the classifier are optimised, including the number of hidden layers, the number of units in each hidden layer, and the number of global iteration steps. The model is fit using the training data, and the test dataset is used to evaluate the accuracy of the model. The parameters are adjusted to improve the accuracy if necessary. After an optimal accuracy is reached, the remaining unclassified records are classified. The entire workflow of user behaviour classification is illustrated in Figure 2.
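The 80:20 division of the labelled records can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `split_train_test`, the seeded random shuffle and the integer record identifiers are hypothetical, and the paper does not state how records were assigned to the two sets.

```python
import random


def split_train_test(records, train_fraction=0.8, seed=0):
    """Shuffle the labelled records reproducibly and split them into two lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical labelled set, represented here only by record identifiers.
labelled_ids = list(range(5000))
train_ids, test_ids = split_train_test(labelled_ids)
```

Fixing the seed keeps the split reproducible, so that changes in accuracy during parameter tuning reflect the parameters and not a different partition of the data.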

Research Case
Hong Kong is located on the southeast coast of China and is adjacent to Shenzhen City, Guangdong Province (Figure 3). The city is a highly prosperous international metropolis with a total area of over one thousand square kilometres and a population that has exceeded seven million since 2014. Hong Kong is one of the most famous tourist cities in the world. The city is praised as a "shopping paradise", "gourmet paradise" and the "oriental pearl". However, researchers in Mainland China have paid less attention to tourism in Hong Kong than to tourism in mainland cities. According to the monthly visitor arrival statistics of the Hong Kong Tourism Board, the total number of visitor arrivals to Hong Kong reached 5 million in December 2015 and increased by 5.4% in December 2016. Moreover, it is estimated that the number of visitor arrivals will continue to grow [46]. Additionally, 73% of visitors in December 2015 were from Mainland China, and the growth rate of visitors from Mainland China is 1.1 times higher than that of the total number of visitors. Thus, the tourism industry of Hong Kong is driven by tourists from Mainland China, and this number continues to grow.
Since the return of the sovereignty of Hong Kong to China in 1997, Hong Kong has been defined as the Hong Kong Special Administrative Region of the People's Republic of China, and the "one country, two systems" policy has been implemented. Consequently, Hong Kong has its own social system, currency, tariff preference, etc. Some residents in Mainland China prefer the low-priced merchandise, good education resources, etc., that Hong Kong offers. Consequently, many Mainland China visitors visit Hong Kong not only for tourism or vacation but also for other reasons, such as shopping and education. Residents in Mainland China must apply for an "Exit-Entry Permit for Travelling to and from Hong Kong" to enter the territory of Hong Kong, and their stay in Hong Kong is subject to a time limit.
Based on the status quo, it is necessary to study the preferences and behaviours of visitors to Hong Kong from Mainland China to manage and improve tourism quality and estimate the influence of these visitors on Hong Kong.

Data Specification
Check-in data from Sina Weibo are used to extract and analyse the behaviours of visitors from Mainland China. Weibo is a famous social networking platform in China similar to Twitter. Users can create check-in records with location information and other forms of information, such as words, pictures and video. Many users utilise Weibo to record their daily lives. Therefore, these check-in data can reflect user activities to some extent. We focus on check-in data from Weibo in Hong Kong created between January 2014 and December 2014. In addition, to avoid ambiguities in judgement, we remove users who made no more than two check-ins in Hong Kong during this period. After removing these user records, we analyse 259,062 check-in records for more than 42,000 users with accounts registered in Mainland China. More than 9000 POIs are included in these records.
Because POIs in Weibo use the coordinate system of Gaode Map [47], POIs are obtained and categorised according to the Gaode POI Classification Code [48]. Therefore, based on the Gaode POI Classification Code and our research objectives, we first reclassify the aforementioned POIs into nine types (Table 1): common attractions, special event attractions, transport, hotels, catering, retail, education, residence, and other. In this context, common attractions are tourist attractions or scenic spots that tourists can visit whenever they want and are not affected by special events. For instance, theme parks, natural areas, etc., qualify as common attractions. Compared with common attractions, special event attractions are POIs with check-in records remarkably affected by special events, such as international conferences, exhibitions, and concerts. Correspondingly, these types of POIs include conference and exhibition centres, coliseums, etc. To distinguish between transit passengers and other visitors, transport POIs only include airports and wharfs, as well as their surrounding areas and other associated places. The hotels category includes hotels, family inns, youth hostels, etc. Catering includes restaurants, cafés, snack bars, bakeries, pubs, etc. The retail category includes retail stores, shopping malls, commercial streets, night markets, etc. Education includes colleges and universities, adult education institutions, secondary schools, primary schools, kindergartens, public libraries and other relevant places. The residence category includes rental houses, villas, etc. Finally, other places are those excluded from the abovementioned categories. These POIs mainly include public spaces, such as hospitals, courts and post offices.
According to the status quo, the behaviours of Mainland visitors to Hong Kong are preclassified into the following types (Table 2):
(1) Purchasing-oriented visitors. Because of tariff preferences and monetary exchange rates, mainland residents, particularly residents living near Hong Kong (such as residents in Guangdong Province or other neighbouring provinces), are fond of buying in Hong Kong. Some people are even professional "daigou", which means that they buy products in Hong Kong on behalf of mainland residents [49]. Therefore, the main purpose of visitors of this type is purchasing. Most of their check-in POIs are shopping malls, retail stores, etc.
(2) Tourism-oriented visitors. Visitors of this type are typical tourists. Their check-in locations are mainly concentrated in tourist and scenic spots, as well as hotels. In addition, because Hong Kong is a famous "shopping paradise" and "gourmet paradise", some of these visitors' check-in locations are shopping malls and restaurants made popular by word of mouth.
(3) Special event-oriented visitors. This type of visitor comes to Hong Kong for particular events, such as concerts, large international conferences and exhibitions. The majority of these visitors' check-in locations are conference and exhibition centres and coliseums. Additionally, those who participate in the same event have similar check-in records over a certain time at a certain place.
(4) Education-oriented visitors. These visitors can be subdivided into two types. The first type is those who study and live in Hong Kong and can be regarded as temporary residents. Most of these visitors are undergraduates or postgraduates, and some are middle school students. The other type is those who are born and study in Hong Kong but live in Mainland China [50]. These students are called "Shenzhen-Hong Kong cross-boundary students". These students are common because many pregnant women from the mainland give birth to children in Hong Kong, and their children, who do not have a mainland Hukou, cannot study in mainland public schools.
(5) En-route visitors. Visitors of this type merely pass through Hong Kong while travelling to other destinations. Notably, many international flights stop at Hong Kong International Airport. Additionally, there are ports in Hong Kong where ships can transfer passengers to other regions or to ships from other regions.
(6) Others. Other visitors are those who cannot be classified into the aforementioned categories.

Classification Results of TensorFlow
In this analysis, we manually label a training dataset of 4000 records and a test dataset of 1000 records according to the classification rules. Then, we construct a neural network with four fully connected hidden layers, with 10, 20, 20 and 10 units in each layer. We use four metrics to evaluate the DNNClassifier: classification accuracy, precision, recall and f-score. Accuracy is the percentage of the correctly classified results among all the classified results, which is a commonly used and easy-to-understand metric in measuring the quality of a classifier. Precision is the proportion of samples assigned to a class that truly belong to that class, and recall is the proportion of samples truly belonging to a class that are assigned to that class [51]. Recall and precision are often traded off such that a very high precision often accompanies a low recall. Therefore, the f-score is also introduced to evaluate the classifier. The f-score is a combined metric that can balance recall and precision to measure the quality of a classifier. Only if both precision and recall are fairly high will the f-score reach a high value. The f-score is calculated as follows:
$$ f\text{-score} = \frac{2 \times precision \times recall}{precision + recall} $$
In addition, we compare these metrics of the DNNClassifier with those of other traditional machine learning classification models, including back propagation neural networks (BPNNs), radial basis function neural networks (RBFNNs), random forest methods and support vector machines (SVMs). BPNNs are multilayer feedforward neural networks trained according to an error back propagation algorithm [52,53] and represent one of the most widely applied neural network architectures [54]. RBFNNs are also feedforward neural networks, but they feature three fixed layers: an input layer, a single hidden layer and an output layer [55]. Random forest models are not neural networks but a combination of tree predictors, such that each tree depends on the values in a random vector sampled independently and with the same distribution for all trees in the forest [56]. SVMs are state-of-the-art discriminative classifiers that incorporate statistical learning, maximum margin optimal hyperplane and other concepts [57].
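The four evaluation metrics defined in this section can be computed directly from the true and predicted labels. The sketch below is illustrative only: the function name `classification_metrics` and the toy labels are hypothetical, and averaging precision and recall equally over classes (macro-averaging) is an assumption, since the text does not state how the per-class values are combined.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, macro-averaged precision and recall, and the f-score."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls = [], []
    for label in labels:
        true_positive = sum(t == label and p == label for t, p in pairs)
        predicted = sum(p == label for _, p in pairs)
        actual = sum(t == label for t, _ in pairs)
        # A class that is never predicted (or never occurs) contributes 0.
        precisions.append(true_positive / predicted if predicted else 0.0)
        recalls.append(true_positive / actual if actual else 0.0)

    precision = sum(precisions) / len(labels)
    recall = sum(recalls) / len(labels)
    denominator = precision + recall
    f_score = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score}


# Hypothetical toy labels for two visitor types.
toy_true = ["purchasing", "purchasing", "tourism", "tourism"]
toy_pred = ["purchasing", "tourism", "tourism", "tourism"]
toy_metrics = classification_metrics(toy_true, toy_pred)
```

In the toy example one purchasing-oriented visitor is mislabelled as tourism-oriented, which lowers the recall of the purchasing class and the precision of the tourism class; the f-score reflects both effects in a single value.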
In this comparison, the main parameters of the compared methods are set as follows. A BPNN is constructed with three hidden layers of 10, 20 and 10 units using the Levenberg-Marquardt algorithm as the training function [58]. The RBFNN has a single hidden layer with 10 units. In addition, the random forest model parameters include a maximum tree depth of 10, a minimum leaf size of 5 and 100 trees. The SVM uses a penalty parameter C of 1.0 and an RBF kernel. In addition, we repeat the training process of all the aforementioned methods five times and take the average performance to reduce random errors. Table 3 shows the performance of all the methods, and the bold fonts in each column represent the best performance for each metric. TensorFlow DNNClassifier outperforms the other methods in accuracy, recall and f-score. The accuracy of DNNClassifier can reach 92.43%, which is 2.18% to 5.83% higher than that of the other models, followed by the accuracies of the BPNN and SVM. For precision, although the random forest model provides the highest precision, it has a low recall and therefore a low f-score, indicating that some minor classes may be easily misclassified. Both the recall and f-score of DNNClassifier are considerably high compared with those of the other methods, denoting that DNNClassifier is a comparatively well-performing classifier when addressing human behaviour classification problems.
In addition to the improved performance, another advantage of TensorFlow is its powerful monitoring and visualisation tools. Without any monitoring or logging information, the classification training can be considered a black box approach. We monitor every 100 global steps in the DNNClassifier model and visualise them in TensorBoard. Through the graph visualisation, we can view the entire computational graph of the model (Figure 4) and the expansion of the DNN layer (Figure 5). Additionally, the scalar summary illustrates the progression of the accuracy (Figure 6a) and loss values (Figure 6b). As the number of global steps increases, the loss value decreases sharply, particularly at 1000 global steps, and then decreases slowly over subsequent iterations. The accuracy value continuously increases and remains steady after approximately 1500 global steps. The final accuracy and loss values are 92.3% and 0.19, respectively, after 3000 global steps.
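The relationship between precision, recall and f-score discussed for Table 3 can be illustrated with a small sketch. The confusion matrix below is hypothetical (it is not taken from the study), and macro averaging over classes is an assumption, since the averaging scheme behind Table 3 is not stated here. The sketch shows how a classifier can reach a high precision while a poorly recalled minor class pulls down recall and f-score.

```python
from typing import Dict, List


def macro_metrics(confusion: List[List[int]]) -> Dict[str, float]:
    """Accuracy and macro-averaged precision, recall and F-score.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    precisions, recalls, f_scores = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f_scores.append(f)

    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f_score": sum(f_scores) / n,
    }


# Hypothetical three-class example: the minor class (last row) is mostly
# predicted as the first class, so its recall is low.
example = [
    [90, 0, 0],
    [0, 50, 0],
    [8, 0, 2],
]
scores = macro_metrics(example)
print(scores)
```

In this hypothetical case the macro precision stays high because the few predictions made for the minor class are correct, while the macro recall and f-score drop, which mirrors the pattern described for the random forest model.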

Proportions and Characteristics of Visitor Behaviours
The proportions of the different classification types in the final classification results are shown in Table 4. In this section, we summarise and depict certain characteristics of visitor behaviours. In addition, we analyse the spatial, temporal and other features of the major classes to verify the results.

Tourism-Oriented Visitors
As shown in Table 4, tourism-oriented visitors account for the largest proportion of any class, at 64.9%, which is similar to the official value of 63% for visitors from Mainland China who vacation in Hong Kong [59]. This result demonstrates that the majority of Mainland China visitors to Hong Kong travel for tourism or vacation.
(1) Force-directed graph of the top 20 tourist and scenic spots
A force-directed graph is created using the Yifan Hu algorithm [60], as shown in Figure 7. The graph shows the popularity of the 20 most popular tourist attractions as well as the relationships among attractions. These results cover 70% of the top ten favourite scenic spots listed in the Visitor Profile Report 2014 of the Hong Kong Tourism Board [59]. Each node represents one scenic spot, and the size of the node represents the weight of the check-in frequency. An edge connects two nodes and indicates that a tourist visited both POIs during their visit. The size of an edge is associated with the number of tourists who visited the two POIs.
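The node and edge weights described for the force-directed graph can be derived from check-in records roughly as follows. This is a minimal sketch under stated assumptions: the record format `(user_id, poi)`, the function name `build_covisit_graph` and the sample records are hypothetical, and the layout step itself (the Yifan Hu algorithm) is not reproduced here.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple


def build_covisit_graph(
    checkins: Iterable[Tuple[str, str]],
) -> Tuple[Counter, Counter]:
    """Return (node_weights, edge_weights) from (user_id, poi) records.

    node_weights[poi]    = number of check-in records at that POI
    edge_weights[(a, b)] = number of distinct users who checked in at both
    """
    node_weights: Counter = Counter()
    pois_by_user: Dict[str, Set[str]] = {}
    for user, poi in checkins:
        node_weights[poi] += 1
        pois_by_user.setdefault(user, set()).add(poi)

    edge_weights: Counter = Counter()
    for pois in pois_by_user.values():
        # Sort so that each undirected edge has one canonical key.
        for a, b in combinations(sorted(pois), 2):
            edge_weights[(a, b)] += 1
    return node_weights, edge_weights


# Hypothetical check-in records for illustration only.
records = [
    ("u1", "Hong Kong Disneyland"),
    ("u1", "Victoria Harbour"),
    ("u1", "Hong Kong Disneyland"),
    ("u2", "Hong Kong Disneyland"),
    ("u2", "Victoria Harbour"),
    ("u2", "Hong Kong Ocean Park"),
    ("u3", "Hong Kong Ocean Park"),
]
nodes, edges = build_covisit_graph(records)
print(nodes.most_common(20))  # candidates for the top 20 spots
print(edges)
```

The resulting weights could then be exported to a graph tool that offers a force-directed layout, with node size mapped to `nodes` and edge size mapped to `edges`.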

Based on the graph, we can conclude that the major tourism pattern in Hong Kong is "theme park" + "shopping mall". Hong Kong Disneyland and Hong Kong Ocean Park play core roles as major tourist spots in Hong Kong. Although the category of shopping malls cannot be strictly regarded as a tourist or scenic category, the large number of check-in records associated with shopping malls suggests that Hong Kong deserves its reputation as a "shopping paradise" and that shopping is appealing to tourists. Other popular POIs include landmark scenic spots, such as Victoria Harbour, the Avenue of the Stars, and Lan Kwai Fong. In addition, because of the rich film and television culture in Hong Kong, many places have become popular because a movie or television show was filmed there. For instance, Central-Mid Escalator and Chungking Mansion, two locations where scenes from the classic movie "Chungking Express" directed by Kar Wai Wong were filmed, appear in the top 20 scenic spots graph, although Chungking Mansion is not a typical or official scenic spot.
(2) Monthly visit analysis
We compare the number of check-in records of tourism-oriented visitors per month with the official statistics regarding the number of Mainland China tourists per month (Figure 8). Although certain small-scale trends fluctuate, the overall trend is approximately consistent. Specifically, the line plot of check-in records shows that July, August and December are major tourist months.
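One possible way to quantify how consistent two monthly series are is the Pearson correlation coefficient. The sketch below is only an illustration: the study compares the two series visually in Figure 8, and the monthly values used here are hypothetical placeholders, not the check-in counts or the official statistics.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly values, January to December (illustration only).
checkins_per_month = [310, 280, 260, 300, 320, 330, 480, 500, 310, 340, 300, 450]
official_per_month = [3900, 3600, 3300, 3700, 3800, 3700, 4700, 4900, 3500, 4000, 3600, 4400]

r = pearson(checkins_per_month, official_per_month)
print(f"Pearson r = {r:.3f}")
```

Because the coefficient is scale-invariant, the two series do not need to be normalised to the same unit before they are compared.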

Purchasing-Oriented Visitors
(1) Proportion of visitors from different sources
In this section, we analyse the registration locations of users in class 1 and compare the proportions of class 1 visitor sources with those of all visitors. We find that among all visitors, those from Guangdong Province, the province closest to Hong Kong, account for the largest proportion at 30.8%. The next largest sources are Shanghai (11.4%), Beijing (10.4%), Fujian (6.8%), Zhejiang (5.5%), Jiangsu (5.4%), Sichuan (4.3%) and Hubei (3.8%). In addition, visitors from Shenzhen account for 37.1% of visitors from Guangdong Province, followed by visitors from Guangzhou (30.2%), Foshan (4.9%), Dongguan (4.2%) and other cities in Guangdong (Figure 9). For class 1 visitors, the major provinces of origin are nearly the same, but the proportion of Guangdong Province visitors increases to 55.0%. Moreover, the percentage of visitors from Shenzhen increases to 55.2% (Figure 10), and the percentages of visitors from Guangzhou (19.4%), Dongguan (4.5%), Foshan (3.3%) and other cities also vary. The statistical results are consistent with the actual situation, as most residents in Guangdong Province, particularly in cities adjacent to Hong Kong, such as Shenzhen, Guangzhou and Dongguan, visit Hong Kong often to purchase items due to the low transportation costs and the convenient procedure of applying for an "Exit-Entry Permit for Travelling to and from Hong Kong".

(2) Kernel density analysis of visitor check-in locations
In this section, we analyse all check-in records of purchasing-oriented visitors and conduct a kernel density analysis. We find that the hot spots visited by these visitors are concentrated near the following locations:
(3) Visit patterns
In this section, we compare the visit patterns of purchasing-oriented visitors and tourism-oriented visitors, including their average stay times, proportions of same-day trips (i.e., trips without staying overnight) and average numbers of trips to Hong Kong in a year. Table 5 shows that the average stay time of purchasing-oriented visitors is shorter than that of tourism-oriented visitors, while the average number of trips to Hong Kong of purchasing-oriented visitors is greater than that of tourism-oriented visitors. In addition, the proportion of same-day trips for purchasing-oriented visitors is greater than 75%, while that of tourism-oriented visitors is less than 50%. This result reveals that tourism-oriented visitors do not visit Hong Kong as frequently as purchasing-oriented visitors do and are inclined to stay in Hong Kong overnight to visit multiple scenic spots during one trip. Conversely, purchasing-oriented visitors tend to visit Hong Kong many times throughout the year and generally stay for only one day without staying overnight.
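The three visit-pattern indicators compared in Table 5 can be computed from per-trip records as sketched below. Several points are assumptions made for illustration: the `Trip` record and its fields are hypothetical, the trips are assumed to be already segmented from the check-in data and to fall within one year, and a same-day trip is counted as a stay of one day.

```python
from datetime import date
from typing import Dict, List, NamedTuple


class Trip(NamedTuple):
    """One trip to Hong Kong (hypothetical record layout)."""
    user_id: str
    arrival: date
    departure: date


def visit_pattern(trips: List[Trip]) -> Dict[str, float]:
    """Average stay, share of same-day trips and trips per user."""
    if not trips:
        raise ValueError("at least one trip is required")
    # A same-day trip counts as a one-day stay (assumption).
    stay_days = [(t.departure - t.arrival).days + 1 for t in trips]
    same_day = sum(1 for t in trips if t.arrival == t.departure)
    users = {t.user_id for t in trips}
    return {
        "average_stay_days": sum(stay_days) / len(trips),
        "same_day_share": same_day / len(trips),
        "trips_per_user": len(trips) / len(users),
    }


# Hypothetical trips: u1 makes three same-day trips, u2 one three-day trip.
sample_trips = [
    Trip("u1", date(2014, 3, 1), date(2014, 3, 1)),
    Trip("u1", date(2014, 6, 14), date(2014, 6, 14)),
    Trip("u1", date(2014, 11, 2), date(2014, 11, 2)),
    Trip("u2", date(2014, 8, 5), date(2014, 8, 7)),
]
pattern = visit_pattern(sample_trips)
print(pattern)
```

Applying the function separately to the trips of each visitor class would give the per-class values that the comparison relies on.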

Special Event-Oriented Visitors
Previous studies have shown that social media check-in data can be used to detect urban events [61,62]. Although we do not use a text-mining method in this study, it is still possible to detect some special events that large numbers of people attend based on check-ins at a certain place within a short period. We regard more than one check-in record on a certain date and at a certain place as a supposed event (for example, May 22nd at Hong Kong Coliseum and May 22nd at the AsiaWorld Expo). The check-in records of special event-oriented visitors indicate the occurrence of 366 supposed events based on 3820 check-in records. By looking up the event histories posted on the official websites of these places, we find that 251 of these events did occur and cover 88.0% of the check-in records. In Table 6, we list the 20 most frequent check-in dates and the corresponding POIs and events, which account for 36.8% of the check-in records of special event-oriented visitors. The popularity of a star or the influence of a conference or an exhibition results in the clustering of people at a certain place within a short period. In turn, the check-in frequency can reflect the popularity of the star or the influence of the conference or exhibition to some extent.
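The rule for supposed events (more than one check-in record on the same date at the same place) can be sketched as a simple grouping step. The record format `(date, place)`, the function names and the sample records are hypothetical, and the coverage helper assumes that coverage is measured against all records passed in, which is one reading of the description above.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

EventKey = Tuple[str, str]  # (date, place)


def supposed_events(
    checkins: Iterable[EventKey], min_records: int = 2
) -> Dict[EventKey, int]:
    """Group check-ins by (date, place) and keep groups with at least
    min_records records; the default of 2 means more than one record."""
    counts = Counter(checkins)
    return {key: n for key, n in counts.items() if n >= min_records}


def coverage(events: Dict[EventKey, int], confirmed: Set[EventKey],
             total_records: int) -> float:
    """Share of all check-in records that belong to confirmed events."""
    covered = sum(n for key, n in events.items() if key in confirmed)
    return covered / total_records


# Hypothetical records for illustration only.
sample: List[EventKey] = [
    ("05-22", "Hong Kong Coliseum"),
    ("05-22", "Hong Kong Coliseum"),
    ("05-22", "Hong Kong Coliseum"),
    ("05-22", "AsiaWorld Expo"),
    ("05-22", "AsiaWorld Expo"),
    ("06-01", "Hong Kong Coliseum"),
]
events = supposed_events(sample)
# Suppose only the Coliseum event could be confirmed from event histories.
share = coverage(events, {("05-22", "Hong Kong Coliseum")}, len(sample))
print(events, share)
```

The confirmation step itself (checking the event histories on official websites) is manual in the study and is represented here only by the `confirmed` set.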

Education-Oriented Visitors
Education-oriented visitors can be subdivided into the following subclasses according to their major check-in locations (Table 7). Specifically, 95.3% of these visitors are university students, and 2.3% of them are secondary school students. These two types of visitors can actually be regarded as temporary residents. Only 0.3% of education-oriented visitors are primary school students, and unfortunately, we cannot detect the "Shenzhen-Hong Kong cross-boundary students" that we expected to find. This detection issue may be associated with the limited user scope of the social network, as the age range of its major users is between 13 and 35 [63].

Discussion
Identifying and classifying tourist behaviours can help tourism managers understand different tourist preferences, recommend personalised tourist attractions, develop tourism products and manage tourism resources. However, human behaviours are diverse and complex; therefore, it is difficult to assess and classify these behaviours. Deep learning methods may provide state-of-the-art solutions to these issues. Notably, deep learning methods can surpass the abilities of human beings in many fields, and deep neural networks can be explained in a psychological manner. The results suggest that the deep neural network in TensorFlow can be used to process the complex and erratic classification rules of user behaviour classification problems and yield results with satisfactory accuracy. In addition, the deep neural network in TensorFlow is not a "black box" due to its powerful monitoring and visualisation tools.
In this paper, we process location-based social network data using a deep learning method. As data volumes increase, traditional data processing methods will increasingly struggle to process big data. Thus, deep learning methods are increasingly used to process big data. In this study, we use the deep learning method in TensorFlow to analyse tourism geography. Although TensorFlow has been available as an open-source product for nearly two years, it has been applied in few studies of human behaviour assessment. As deep learning methods become more popular, they should be applied in both natural science and social science to provide state-of-the-art solutions to social and human problems.
In future work, we will attempt to apply recurrent neural networks (RNNs) and the Word2Vec technique for the text mining of tweets to produce accurate information and classification results and further integrate deep learning methods and human geography.

Figure 2. Workflow of user behaviour classification.

Figure 3. The case study city: Hong Kong.

Figure 4. The complete Main Graph of the model.

Figure 5. The expansion of the DNN layer.

Figure 6. Scalar visualisation of the DNNClassifier model: (a) accuracy value and (b) loss value.

Figure 7. Force-directed graph of tourist attractions of class 2 visitors.

Figure 8. Number of visitors based on check-ins and number of actual tourists.

Figure 9. Proportions of visitor sources for all visitors.
(1) shopping malls close to Shenzhen-Hong Kong ports, such as shopping malls in Tuen Mun (close to Shenzhen Bay Port), Sheung Shui and Fanling (close to Futian Port and Luohu Port);

Table 1. Points of interest (POI) categories and check-in data.

Table 2. Classification of visitor behaviours.

Table 4. Proportion of each classification type.

Table 6. The twenty most frequent check-in dates and the corresponding special events.

Table 7. Sub-classifications and respective proportions of education-oriented visitors.