Enhancing the K-Means Algorithm through a Genetic Algorithm Based on Survey and Social Media Tourism Objectives for Tourism Path Recommendations

Damos, Mohamed A.; Zhu, Jun; Li, Weilian; Khalifa, Elhadi; Hassan, Abubakr; Elhabob, Rashad; Hm, Alaa; Ei, Esra

doi:10.3390/ijgi13020040


by Mohamed A. Damos ^1,2, Jun Zhu ^1,*, Weilian Li ^1,3, Elhadi Khalifa ^2, Abubakr Hassan ^2, Rashad Elhabob ^4, Alaa Hm ^2 and Esra Ei ^2

^1 Faculty of Geosciences and Environmental Engineering, Southwest Jiaotong University, Chengdu 611756, China
^2 Faculty of Engineering, Karary University, Khartoum 12304, Sudan
^3 Institute for Geodesy and Geoinformation, University of Bonn, 53115 Bonn, Germany
^4 College of Computer Science and Information Technology, Karary University, Omdurman 12304, Sudan
^* Author to whom correspondence should be addressed.

ISPRS Int. J. Geo-Inf. 2024, 13(2), 40; https://doi.org/10.3390/ijgi13020040

Submission received: 24 November 2023 / Revised: 19 January 2024 / Accepted: 25 January 2024 / Published: 27 January 2024

(This article belongs to the Topic Geocomputation and Artificial Intelligence for Mapping)


Abstract

Social media platforms play a vital role in determining valuable tourist objectives, which greatly aids in optimizing tourist path planning. As data classification and analysis methods have advanced, machine learning (ML) algorithms such as the k-means algorithm have emerged as powerful tools for sorting through data collected from social media platforms. However, traditional k-means algorithms have drawbacks, including challenges in determining initial seed values. This paper presents a novel approach to enhance the k-means algorithm based on survey and social media tourism data for tourism path recommendations. The main contribution of this paper is enhancing the traditional k-means algorithm by employing the genetic algorithm (GA) to determine the number of clusters (k), select the initial seeds, and recommend the best tourism path based on social media tourism data. The GA enhances the k-means algorithm by using a binary string to represent initial centers and to apply GA operators. To assess its effectiveness, we applied this approach to recommend the optimal tourism path in the Red Sea State, Sudan. The results clearly indicate the superiority of our approach, with an algorithm optimization time of 0.01 s. In contrast, traditional k-means and hierarchical cluster algorithms required 0.27 and 0.7 s, respectively.

Keywords: survey and social media tourism objectives; tourism path recommendations; genetic algorithm; machine learning algorithms

1. Introduction

The rapid proliferation of social media platforms has significantly expanded the availability of tourism objectives, enabling the recommendation of tourism paths from anywhere and at any time [1,2]. Travelers can effortlessly contribute to tourism objectives while on the go, using platforms such as WhatsApp, Facebook, Flickr, and Instagram. Additionally, users can share their current locations through services like Foursquare and provide feedback on tourist destinations via X, formerly known as Twitter. Location-based social media platforms like Foursquare and TripAdvisor empower users to share their experiences, reviews, and recommendations of tourist destinations, often including geographical coordinates [3,4,5]. By considering factors such as tourist travel costs, preferences, internal tourism objectives, destinations, and transportation options, it is possible to create recommended itineraries using both survey and social media tourism objectives [6,7]. The process for recommending tourist routes based on survey and social media data involves two key steps. First, the tourism objectives are clustered into groups, facilitating analysis and recommendations. Second, the optimal tourist path is recommended by solving the traveling salesman problem (TSP) [8,9].

The increasing volume of tourism data has added complexity to data collection and clusters, particularly when suggesting optimal tourist paths using ML algorithms and solving the TSP by an optimization algorithm. These methods are instrumental in analyzing unstructured data from social media platforms, ultimately enhancing path planning and destination recommendations [10,11,12]. The k-means is a commonly employed clustering algorithm for tourist data. However, it faces challenges in determining the optimal number of clusters (k) and in selecting initial seeds. The choice of k significantly influences the outcomes, and researchers often employ methods like the elbow method and silhouette score for the k selection [13,14,15]. While effective, the elbow method may encounter difficulties when dealing with many clusters [16,17].

On the other hand, the silhouette score evaluates object similarity within clusters but may have limitations when dealing with overlapping or significantly varying-sized clusters. Selecting the initial seeds for k-means is a critical step in the process. Running the algorithm multiple times with random initialization helps identify the best outcome [18,19]. Various seed selection methods, such as random and k-means++, each come with tradeoffs involving simplicity, cluster quality, and computational efficiency [20,21]. Furthermore, the GA has emerged as a powerful optimization algorithm for addressing complex problems, including determining the appropriate number of clusters (k) and identifying optimal initial seeds for the k-means algorithm [22,23].

This paper introduces the use of the GA to address the challenges of determining k and selecting initial seeds. The GA effectively mitigates the limitations of previous methods, addressing issues such as data overlap, handling large datasets, and improving the execution time of the k-means algorithm. The GA enhances the traditional k-means algorithm by employing binary strings to represent initial centers and applying GA operators. This enhancement significantly improves the algorithm’s effectiveness in classifying tourist data. Through parameter optimization, the algorithm becomes more proficient in accurately clustering and categorizing the extensive volume of tourism data, thus enhancing its overall performance. The optimization process iterates until convergence, where cluster assignments remain unchanged. Subsequently, the improved GA employs clustered and optimized tourism objectives to solve the TSP and recommend the optimal tourism path [24,25,26,27]. In summary, the main contributions of this paper are as follows:

➢: Enhancing the traditional k-means algorithm by using the GA to determine initial seeds, selecting the appropriate number of clusters (k), and recommending the best tourism path based on survey and social media tourism objectives.
➢: Collecting the tourism objectives from social media platforms through an online questionnaire and from TripAdvisor.
➢: Selecting and visualizing the optimal tourism path using GAs and the geographic information system (GIS) environment.
➢: Demonstrating the optimal time to implement the GA algorithm for finding the best tourism path through a comparison of our approach with other state-of-the-art methods.

The remainder of this article is structured as follows. In Section 2, we present the methodology and the idea of tourism objectives, survey and social media data, along with enhancing the k-means algorithm through the GA. In Section 3, the system implementation and experimental analysis are discussed. In Section 4, the results of this research and discussion are presented. Finally, the conclusion and future work are presented in Section 5.

Related Work

Tourism objectives are collected from various sources, including governmental and international institutions, geographical surveys, and popular social media platforms such as Facebook, WhatsApp, WeChat, and X. To address the static objectives of tourism planning, three main processes must be carried out: collecting tourism data, classifying tourism data, and employing optimization algorithms to determine the best route. In the following sections, we conduct a thorough analysis of studies related to the formulation of tourism objectives, highlighting key findings and methodologies.

Hu et al. [28] presented a method for deriving tourist movement patterns from X data involving a three-step process of cleaning geo-tagged posts to identify those authored by tourists. However, this method’s reliance solely on X data may limit its ability to represent comprehensive tourist activity from other sources or platforms. A separate study by Riaz and Sherani [29] overcame this limitation by focusing on the factors influencing information sharing on multiple social media platforms, particularly the adoption of Facebook and WeChat. Hashimy and T. S. [30] explored the opportunities and challenges of using social media platforms such as WhatsApp and Facebook for tourism development in Afghanistan. These include increased visibility, user-generated content, direct communication, influencer marketing, and destination marketing. However, the paper also highlights challenges that must be addressed, such as ensuring tourist safety and the need for infrastructure development. Sakas et al. [31] considered multiple objectives, including transportation type and tourist preferences, collected from various social media platforms. These objectives collectively describe the tourist destination, falling under the category of internal objectives. However, this approach overlooks the external objectives associated with interactions between tourist destinations. Addressing the challenge of integrating both internal and external objectives within a unified approach is essential for the advancement of this field.

A novel approach developed by Kim et al. [32] focused on developing a deep learning model and an image feature vector clustering technique to automate the categorization of traveler images by tourism destinations. However, the paper has limitations, primarily focusing on spatial data and omitting information about the characteristics and features of tourist destinations. The study by Bouabdallaoui et al. introduces an innovative clustering architecture that integrates the GA and k-means, coupled with a hybrid topic discovery approach incorporating latent Dirichlet allocation (LDA) and bidirectional encoder representations from transformers (BERT). The primary objective of this novel method is to predict and analyze the most significant topics related to tourist shopping destinations in Morocco. However, a significant limitation of this paper is the lack of attention in determining the values for k and the initial seeds. This limitation stems from the paper’s reliance on the random selection of both the number of groups (k) and the initial seeds, raising concerns regarding the robustness and reproducibility of the results.

Yafeng et al. [33] presented a new approach based on the GA to develop 47 tourism areas in Chongqing City, China. While this paper provides an intriguing approach for applying the GA to optimize tourism path planning, which could assist tour operators and planners in developing more effective, fun, and easy tourist trips, it is essential to note that the scope of this study is centered on enhancing the planning of tourist routes specifically for the 47 scenic areas in Chongqing. As a result, the general applicability of the findings to different contexts or regions may be limited. Moreover, this research solely employed the GA to identify the optimal tourism path without delving into the diverse objectives of tourism or the various tourism data sources available, such as social media platforms. Patcharin et al. [34] introduced a method for recognizing aircraft trajectories through statistical analysis clustering. It employs k-means clustering and Gaussian mixture clustering to group unstructured trajectories observed over Suvarnabhumi International Airport. Therefore, the applicability of these findings to other regions, such as tourist destinations, may be limited. Additionally, it is worth mentioning that the k-means algorithm used in this approach is not optimized, which affects the algorithm’s execution time. Majid et al. proposed the development of urban tourism and branding for spatial modeling. The authors used a novel hybrid modeling approach combining k-means, fuzzy logic, and an artificial neural network (ANN) to assess urban tourism potential (AUTP). While this modeling provides valuable information for developing future strategies for urban tourism, the paper does not consider tourism data sources such as social media platforms, and it also overlooks various tourism objectives. Mehrdad et al. [35] discuss the use of unsupervised clustering methods as data-driven models for mineral prospectivity mapping (MPM). Their hybrid data-driven clustering model combines the k-means clustering algorithm with harmony search (HS) and artificial bee colony (ABC) metaheuristic optimization algorithms. This hybrid model can be used for the selection of optimum cluster centroids to highlight favorable targets in the prospecting stage of mineral explorations.

In conclusion, many papers have addressed the extraction and prediction of tourist paths based on social media platforms. However, the aforementioned papers lack a precise definition of survey and social media data in the context of tourism and its unique characteristics. They primarily analyze data from a single social media platform without comparing the suitability of various platforms for tourism research. Additionally, these papers do not introduce novel methods for analyzing social media data in the tourism domain. Furthermore, the implementation of algorithms to organize and categorize the spatial and attribute data of tourist destinations can be time-consuming.

2. Methodology

The proposed approach utilizes the combined power of the K-means algorithm and GA to optimize and cluster social media tourism data, ultimately identifying the optimal tourism path. This approach can be broken down into several stages:

First, tourism objectives are collected from social media platforms using two main methods. The first method involves distributing questionnaires on the websites of tourist groups within popular social media platforms such as WeChat and WhatsApp. This allows for the direct collection of relevant information from users. The second method involves extracting objectives from the TripAdvisor website, which serves as a valuable source of tourism objectives.

Second, once the tourism objectives have been gathered from these platforms, they are segmented into distinct groups using the K-means algorithm. The initial value of k, representing the number of clusters, is determined using the GA. Additionally, 12 specific tourism objectives are carefully selected to evaluate and assess different tourist destinations.

Third, the GA is employed to suggest and determine the best tourist path based on the clustered data. This comprehensive and systematic approach allows researchers to gain valuable insights into the preferences and patterns of tourists. These stages are all shown in Figure 1.

2.1. Survey and Social Media Data

Social media platforms have evolved into indispensable resources for the collection of tourism data. These platforms hold significant importance within the tourism industry, primarily because they enable the real-time sharing of user-generated content [36,37]. Tourists use social media to share their tourism experiences, recommendations, and feedback, thus contributing to an extensive repository of information that holds immense value for researchers and tourism experts. By analyzing this tourism data, researchers can glean valuable insights into tourist preferences. This wealth of data empowers businesses and destinations to make well-informed decisions and tailor their services to the evolving requirements and preferences of tourists. The widespread adoption of social media platforms provides a unique opportunity to gather tourism data on a large scale, thereby fostering a deeper understanding of tourists and an overall enhancement of the tourism experience [38].

2.2. Selection Objectives

Tourism objectives in this approach were obtained from social media platforms through two methods. The first method involved the creation of a questionnaire, which was then distributed across groups on various social media platforms such as Facebook, WhatsApp, and WeChat. The survey can be accessed at https://forms.gle/6UHHubaiAPA6JhtE7 (accessed on 25 February 2023). This questionnaire consisted of a range of questions concerning tourism objectives. Table 1 shows the definition of tourism objectives based on tourist preferences. Table 2 shows a sample of the online questionnaire results.

The second method focused on acquiring tourist preferences, calculating distances between tourist destinations, and estimating travel costs between destinations. Table 3 lists the proposed tourism destinations in Port Sudan City. These points were chosen on the basis of tourist demand and the aesthetic views available there, so they are considered points of interest (POIs). Tourist preferences indicate how tourists rate tourism destinations on the TripAdvisor website. Ratings range from 1 to 5, where 5 means very good, 4 means good, 3 means average, 2 means weak, and 1 means very weak. Table 4 displays tourist preferences according to the TripAdvisor website. Distances between tourist destinations are measured in kilometers in Table 5, and travel costs between destinations are measured in Sudanese pounds (SDG) in Table 6. This information was sourced from data collected on the TripAdvisor website.

2.3. Genetic Algorithm

The GA is a computational method based on the process of natural selection and genetics; it is a metaheuristic algorithm that mimics the natural process of evolution to solve complex problems, including NP-hard ones [39,40]. A population of candidate solutions is created randomly and the fitness of each solution is evaluated; the fittest individuals are then selected, recombined, and mutated to create a new population. These operations are repeated until an optimal solution is found, as follows:

Initialization: a population of possible solutions is generated randomly.
Evaluation: the fitness of each solution is assessed using a fitness function.
Selection: the fitter solutions are chosen to be the parents of the following generation.
Crossover: new individuals are generated by combining the genetic material of the selected parents.
Mutation: new individuals may undergo mutation, which introduces small random changes to them, and the new generation replaces the old generation. The algorithm stops when a stopping criterion is satisfied, such as a set number of generations or a satisfactory solution. Figure 2 shows the GA operations.
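The five operations above can be sketched as a short generational loop. This is a minimal illustration rather than the authors' implementation: the bit-string encoding, binary tournament selection, single-point crossover, the parameter values, and the function name `run_ga` are all assumptions chosen for brevity.

```python
import random


def run_ga(fitness, n_bits, pop_size=20, generations=50,
           crossover_rate=0.9, mutation_rate=0.02, seed=0):
    """Minimal generational GA over bit strings; maximizes `fitness`.

    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Initialization: a random population of candidate solutions.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Evaluation: score every candidate once per generation.
        scores = [fitness(ind) for ind in population]

        def select():
            # Selection: binary tournament, the fitter of two random picks wins.
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return population[a] if scores[a] >= scores[b] else population[b]

        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = select(), select()
            child = p1[:]
            if rng.random() < crossover_rate:
                # Crossover: splice the two parents at a single cut point.
                cut = rng.randrange(1, n_bits)
                child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            offspring.append(child)
        # Replacement: the new generation replaces the old one.
        population = offspring
        candidate = max(population, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate
    # Termination: a fixed number of generations has elapsed.
    return best


# Toy usage: maximize the number of 1-bits in a 16-bit string.
best = run_ga(fitness=sum, n_bits=16)
```

The toy fitness function (`sum`) only demonstrates the loop; in the approach described here the fitness would instead score a clustering.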

In general, the k-means algorithm is a machine learning tool that is useful in a wide range of applications, such as clustering, anomaly detection, image compression, recommendation systems, and tourism path planning [41]. In our approach, we used the k-means algorithm to cluster tourism data into k groups, where the number of clusters was determined by using the GA.

2.4. K-Means Algorithm

The k-means algorithm is an unsupervised machine learning algorithm for clustering: it divides data into k classes according to their properties. The algorithm proceeds iteratively; each data point is assigned to the closest centroid (cluster center), and the centroids are then recomputed from the new assignments [42]. This procedure continues until the centroids stop moving or the maximum number of iterations has been reached. The steps of the k-means algorithm are as follows:

➢: Choose the number of clusters k.
➢: Randomly initialize k centroids.
➢: After the initial centroids have been selected at random, assign each point to its nearest centroid.
➢: Recalculate each centroid as the mean of all the data points in its cluster. For a cluster C with k data points ( $x_{1}, x_{2}$ , …, $x_{k}$ ), where k here denotes the number of points in C, the centroid of C is calculated as $\frac{1}{k}(x_{1} + x_{2} + \dots + x_{k})$.
➢: Repeat steps 3–4 until the centroids stop moving or the maximum number of iterations has been reached.
➢: The k-means algorithm aims to minimize the sum of the distances between the data points and their assigned centroids. Many distance measures can be used; the Euclidean distance is the most common. Given two points x and y, the Euclidean distance is calculated as Equation (1):

$$D_{x, y} = \sqrt{(X_{2} - X_{1})^{2} + (Y_{2} - Y_{1})^{2}} \tag{1}$$

where $(X_{1}, Y_{1})$ and $(X_{2}, Y_{2})$ are the coordinates of points x and y, respectively.
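The steps listed above, together with the Euclidean distance of Equation (1), can be sketched in plain Python as follows. This is a minimal illustration for 2-D points; the function names, the `max_iter` default, the random seeding, and the sample coordinates are assumptions and are not taken from the study's data.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two 2-D points, as in Equation (1)."""
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)


def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means on 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Steps 1-2: k is given; pick k distinct data points as initial centroids.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Step 3: assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda j: euclidean(p, centroids[j]))
                  for p in points]
        # Step 4: recompute each centroid as the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append((
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                ))
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        # Step 5: stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical coordinates forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, k=2)
```

Because the initial centroids are drawn at random, the result on less clearly separated data can depend on the seed, which is the weakness the GA-based seeding is meant to address.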

2.5. Enhancing the K-Means Algorithm through GA

This approach enhances k-means by using the GA to determine the optimal initial seeds and the value of k for k-means clustering. Figure 2 provides an overview of the entire framework for GA k-means clustering, and the enhanced algorithm is shown in pseudo code. Presented below is a detailed explanation of the enhanced k-means clustering process:

In the first step of this approach, a value is defined to generate the initial population in the GA, and solution fitness is assessed based on clustering quality, measured by the sum of squared errors (SSE), which is used to determine the optimal initial seeds and the value of k. To enhance the performance of the GA, optimization parameters were adjusted, as shown in Table 3, Table 4, Table 5 and Table 6. These parameter adjustments aim to optimize the GA for more effective initial seed selection and k value determination for clustering. The improved GA, designed to enhance the k-means algorithm, is presented in the pseudo code.
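For reference, the SSE is conventionally defined as the total squared distance from each data point to the centroid of its assigned cluster. The notation below ($C_{j}$ for the j-th cluster, $c_{j}$ for its centroid) is an assumption, since the text does not give the formula explicitly:

```latex
\mathrm{SSE} = \sum_{j=1}^{k} \sum_{x_{i} \in C_{j}} \lVert x_{i} - c_{j} \rVert^{2}
```

A lower SSE indicates tighter clusters. Because the minimum achievable SSE never increases as k grows, a fitness function that also selects k generally needs an additional term or constraint on k; the text does not state which one is used.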

In the second step of this process, once the optimal values for both the number of clusters (k) and the initial seeds have been determined, the tourism data are partitioned into groups. This partitioning is accomplished using the improved k-means algorithm introduced in the first step. The enhanced k-means algorithm effectively assigns each data point to its corresponding cluster based on similarity, taking into account the optimized k value and the initial seeds. By partitioning the tourism data into groups, this step facilitates further analysis and enables the identification of unique patterns, visitor preferences, or notable characteristics within the dataset.

In the third step, after segmenting the tourism data into groups using the enhanced k-means algorithm and considering the identified tourism objectives, an improved GA is employed to recommend the optimal tourist path. By harnessing the capabilities of the GA, which combines elements of natural selection and genetic operators, the algorithm efficiently searches for the most favorable path that aligns with the specified tourism objectives. Figure 3 and Algorithm 1 show the framework for recommending the tourism path by enhancing the k-means algorithm through the GA.

Algorithm 1: Pseudo code of enhancing k-means by GA

Begin
Initialization
Generate a solution population, where each solution represents a possible data clustering (a set of cluster centers in k-means).
Fitness Evaluation
Assess solution fitness based on clustering quality, often measured by the sum of squared errors (SSE).
Selection
Choose parent solutions for the next generation, with higher-fitness solutions having a better chance.
Crossover
Create new solutions by combining features from two parents, e.g., by averaging cluster centers.
Mutation
Randomly alter some new solution features to maintain diversity and prevent premature convergence.
Replacement
Replace some current solutions with the new ones.
Termination
If a stopping criterion is met, stop and return the best initial seeds and k value found; otherwise, repeat from step 2 (Fitness Evaluation).
If a solution is found
Print: the best initial seeds and k value
Else
Print: Fail
end if
End
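Algorithm 1 can be illustrated with the following sketch, which should be read as one possible interpretation rather than the authors' code. Several choices are assumptions made so that the example runs: the binary chromosome (bit i set to 1 means data point i is an initial seed, so k is the number of set bits), the linear `penalty` on k added to the SSE, truncation selection, and one-point crossover on the bit string, which replaces the center-averaging crossover named in Algorithm 1. The sample points are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans_sse(points, seeds, max_iter=50):
    """Run k-means from the given seeds; return (SSE, centroids)."""
    centroids = list(seeds)
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            clusters[j].append(p)
        updated = [
            (sum(x for x, _ in m) / len(m), sum(y for _, y in m) / len(m)) if m else c
            for m, c in zip(clusters, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return sse, centroids


def decode(chromosome, points):
    """Bit i set to 1 means points[i] is used as an initial seed."""
    return [p for bit, p in zip(chromosome, points) if bit]


def fitness(chromosome, points, penalty):
    """Higher is better: negative (SSE + penalty * k)."""
    seeds = decode(chromosome, points)
    if not seeds:
        return float("-inf")  # a chromosome with no seeds is invalid
    sse, _ = kmeans_sse(points, seeds)
    # ASSUMPTION: linear penalty on k, since SSE alone favors ever larger k.
    return -(sse + penalty * len(seeds))


def ga_kmeans(points, penalty=5.0, pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    """Search for initial seeds and k with a GA; returns (k, seeds)."""
    rng = random.Random(seed)
    n = len(points)

    def score(ch):
        return fitness(ch, points, penalty)

    # Initialization: random bit strings with roughly 30% of bits set.
    population = [[int(rng.random() < 0.3) for _ in range(n)]
                  for _ in range(pop_size)]
    best = max(population, key=score)
    for _ in range(generations):
        # Fitness evaluation and selection: keep the fitter half as parents.
        ranked = sorted(population, key=score, reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # crossover: one cut point
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        # Replacement: parents survive, the weaker half is replaced.
        population = parents + children
        best = max(population + [best], key=score)
    seeds = decode(best, points)
    return len(seeds), seeds


# Hypothetical coordinates forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
k, seeds = ga_kmeans(points)
```

The value of `penalty` controls the trade-off between tighter clusters and fewer of them, so it would need tuning for real tourism data.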

3. System Implementation and Experimental Analysis

Port Sudan, the capital of Red Sea State in eastern Sudan, functions as Sudan’s primary seaport. Over 90% of Sudan’s international trade flows through Port Sudan’s modern port facilities, which were built between 1905 and 1909 to replace the historical Arab port of Suakin. Port Sudan features key infrastructure such as an international airport, an oil refinery, and state-of-the-art cargo and passenger terminals. Located on the western coast of the Red Sea, the port handles significant volumes of container traffic, bulk commodities, and roll-on/roll-off shipments. Port Sudan serves as a strategic gateway for neighboring countries, including landlocked South Sudan and Ethiopia, as well as Eritrea. It provides access to key trade routes such as the Suez Canal and the Bab el Mandeb Strait. With ample developable land and a deep-water harbor, the port has potential for significant expansion to support Sudan’s growing trade volumes and improve supply chain connectivity. However, challenges remain around port efficiency and infrastructure constraints that have hindered full realization of Port Sudan’s potential as a regional shipping hub. Ongoing developments and investments aim to address these issues, upgrade port facilities, digitize processes, and expand container handling and logistics services. If successful, such efforts could transform Port Sudan into a modern logistics center that enhances Sudan’s trade competitiveness and links the nation to global supply networks. Figure 4 shows Port Sudan City in Red Sea State, Sudan. Six tourist destinations distributed across the study area were selected. Table 3 and Figure 5 present these points of interest (POIs) in Port Sudan City. Following the selection, questionnaires were distributed to these chosen destinations.

4. Results and Discussion

4.1. Results

Based on the results of online questionnaires distributed on social media platforms in the study area, we determined the static tourism objectives with input from 600 visitors to enhance our approach. We collected the external tourism objectives from the TripAdvisor social media website. To implement our approach, first, we improved the GA by using new parameters, and then we determined the optimal k value and initial seeds. The optimal k value is 5, and the optimal initial seed selection is 10. Second, we used the enhanced k-means algorithm to cluster the static tourism objectives based on the value of k. Five groups of visitors were created for each tourism objective. The optimal path recommendations for visitor destinations, determined through the enhanced GA with improved parameters in Table 7, are presented in Table 8.

Table 9 and Figure 6 show groups of the internal tourism objectives based on enhancing the k-means algorithm.

After creating tables categorizing tourism objectives into groups, we constructed a comprehensive tourism objectives matrix comprising 12 matrices, nine for internal objectives and three for external objectives. The numbers in the matrices represent the numerical differences between the numbers of visitors in the EN groups across various destinations. For instance, the value 25 in the EN objectives matrix corresponds to the difference in the number of visitors between destination P1 and destination P2 in Group 1 in Table 9. Consequently, we calculated the matrices for all nine internal objectives in the same manner as illustrated in Table 10. Following this, we calculated 12 optimal paths based on static tourism objectives using the improved GA. Our approach involved creating objective matrices to address the TSP.
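The matrix construction and path search described above can be sketched as follows. The visitor counts are invented for illustration (chosen only so that the P1 to P2 difference equals 25, as in the example in the text) and are not the values in Table 9. Treating the matrix entries as edge costs to be minimized over a closed tour is also an assumption. Exhaustive search stands in for the improved GA here, which is feasible only because six destinations give 120 candidate tours from a fixed start.

```python
from itertools import permutations


def difference_matrix(counts):
    """Entry [i][j] is the absolute difference in visitor numbers
    between destination i and destination j."""
    return [[abs(a - b) for b in counts] for a in counts]


def best_tour(matrix, start=0):
    """Exhaustively search closed tours that begin and end at `start`.

    Returns (cost, path). Brute force is used instead of a GA
    because the example has only six destinations.
    """
    n = len(matrix)
    others = [i for i in range(n) if i != start]
    best_cost, best_path = None, None
    for perm in permutations(others):
        path = (start, *perm, start)
        cost = sum(matrix[a][b] for a, b in zip(path, path[1:]))
        if best_cost is None or cost < best_cost:
            best_cost, best_path = cost, path
    return best_cost, best_path


# HYPOTHETICAL visitor counts for destinations P1..P6 in one group.
visitors = [120, 95, 60, 80, 110, 70]
matrix = difference_matrix(visitors)
cost, path = best_tour(matrix)
```

Repeating this for each of the 12 objective matrices would yield one candidate path per objective, mirroring the 12 paths mentioned in the text.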

4.2. Discussion

When collecting and classifying tourism data, online surveys and social media play significant roles. In our approach, we propose to enhance the k-means algorithm for optimizing tourism data obtained from social media platforms. The primary challenge lies in determining the optimal k value for the k-means algorithm and selecting initial seeds, which can be addressed using various methods. The elbow method is based on the fundamental premise of plotting different cost values against varying k values; the elbow point on the graph, which marks the point of diminishing returns, is taken as k [43,44]. However, the drawback of the elbow method is that it occasionally struggles to produce effective clusters. As an alternative, we employ an improved GA in our approach to determine the ideal value of k. Compared to other methods, the two-stage GA represents a relatively recent development. Many scholars have employed the k-means algorithm to classify and organize data into groups because of advantages such as ease of implementation, scalability to handle large datasets, guaranteed convergence, the ability to initialize centroids, and adaptability to new data points. However, one challenge associated with the k-means algorithm is the estimation of the optimal number of groups [41,45], and our approach employs the GA to overcome this limitation.

In the process of clustering with k-means, initial seeds for clustering are selected. The method used to choose these seeds depends on the data and the problem being addressed. One commonly employed approach involves the random selection of initial seed points; the algorithm is executed multiple times, retaining the seeds that yield the lowest clustering error. Alternatively, an initial seed selection algorithm can be used, which selects the initial seeds from different clusters within the dataset. The choice between these methods is dictated by the characteristics of the data and the clustering objectives at hand [46]. In this approach, we use the GA to determine the optimal k value and to select initial seeds. Comparisons were conducted with several algorithms commonly used in the field of clustering. These include the expectation-maximization (EM) algorithm, hierarchical clustering, and the traditional k-means algorithm. Table 11 presents the comparison between the enhanced k-means algorithm and other machine learning clustering algorithms. When implementing machine learning algorithms, the most valuable parameter is the optimization time. The optimization time of the enhanced k-means algorithm is 0.01 s, and the number of iterations is five. These results indicate the superiority of the enhanced k-means algorithm over the other clustering algorithms.

In this study, we made significant enhancements to the k-means algorithm. These enhancements specifically improve the methods used to determine the number of groups (k) and the selection of initial seeds. We achieved these improvements by utilizing an enhanced GA and optimizing its parameters. We applied this improved GA to predict optimal tourist paths. This prediction is based on tourism objectives derived from social media platforms, considering factors such as popular destinations and peak travel times. Although our approach was specifically implemented for the Port Sudan region of the Red Sea State in Sudan, it can be applied to other regions as well. The suitability of other regions for this approach depends on factors such as the variety of tourist targets, the population characteristics, and the prevalence of relevant social media platforms. However, for accurate and effective implementation, it is essential to conduct a comprehensive study of the region’s specific tourist objectives, the population characteristics, and the relevant social networking sites.

5. Conclusions and Future Work

Tourism objectives derived from social media offer new opportunities for decision support in recommending tourism paths. In this paper, we propose an innovative approach that optimizes and classifies tourism objectives in order to recommend the optimal tourism path, using the k-means algorithm and the GA. We also integrate various tools for this purpose, demonstrating the applicability of an improved k-means algorithm and the GA to tourism path planning. Additionally, we use GIS to implement and visualize the social media tourism objectives and to display the optimal routes. Our approach is organized as follows: First, the tourism objectives were collected from surveys and social media platforms. Second, the GA was used to enhance the k-means algorithm with a new parameter for clustering tourism objectives. Finally, the enhanced algorithm was compared and combined with the algorithms currently used in the GIS environment. The following points are recommended:

❖ Optimize and classify 12 tourism objectives based on social media platforms to determine the path of tourists.
❖ Apply the GA to determine the number of clusters, the initial seeds, and the optimal path planning.
❖ Optimize and visualize the tourism path planning approach based on the social media tourism objectives.

There is still room for improvement in using ML algorithms to make better use of social media data. Future work can focus on increasing the number of objectives and on fusing both internal and external objectives in the evaluations of web users.

Author Contributions

Conceptualization, Mohamed A. Damos, Rashad Elhabob, and Jun Zhu; methodology, Weilian Li; software, Elhadi Khalifa; validation, Abubakr Hassan, Mohamed A. Damos, Weilian Li, and Jun Zhu; writing–original draft preparation, Abubakr Hassan; writing–review and editing, Elhadi Khalifa; visualization, Jun Zhu, Abubakr Hassan, and Alaa Hm; funding acquisition, Mohamed A. Damos; resources, Weilian Li and Esra Ei. All authors have read and agreed to the published version of the manuscript.

Funding

This article was supported by the National Natural Science Foundation of China [grant number 42171397].

Data Availability Statement

The data that support the findings of this study are available from the corresponding author, Mohamed A. Damos, upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Minazzi, R. Social Media Marketing in Tourism and Hospitality; Springer: Cham, Switzerland, 2015.
2. Tenemaza, M.; Luján-Mora, S.; De Antonio, A.; Ramirez, J. Improving itinerary recommendations for tourists through metaheuristic algorithms: An optimization proposal. IEEE Access 2020, 8, 79003–79023.
3. Lee, J.Y.; Tsou, M.-H. Mapping spatiotemporal tourist behaviors and hotspots through location-based photo-sharing service (Flickr) data. In Progress in Location Based Services 2018; Springer: Cham, Switzerland, 2018; pp. 315–334.
4. Wan, L.; Hong, Y.; Huang, Z.; Peng, X.; Li, R. A hybrid ensemble learning method for tourist route recommendations based on geo-tagged social networks. Int. J. Geogr. Inf. Sci. 2018, 32, 2225–2246.
5. Zhu, J.; Zhang, J.; Zhu, Q.; Li, W.; Wu, J.; Guo, Y. A knowledge-guided visualization framework of disaster scenes for helping the public cognize risk information. Int. J. Geogr. Inf. Sci. 2024, 38, 1–28.
6. Aftab, S.; Khan, M.M. Role of social media in promoting tourism in Pakistan. J. Soc. Sci. Humanit. 2019, 58, 101–113.
7. Jimenez-Barreto, J.; Sthapit, E.; Rubio, N.; Campo, S. Exploring the dimensions of online destination brand experience: Spanish and North American tourists’ perspectives. Tour. Manag. Perspect. 2019, 31, 348–360.
8. Ahsini, Y.; Díaz-Masa, P.; Inglés, B.; Rubio, A.; Martínez, A.; Magraner, A.; Conejero, J.A. The Electric Vehicle Traveling Salesman Problem on Digital Elevation Models for Traffic-Aware Urban Logistics. Algorithms 2023, 16, 402.
9. Silva, C.E.; César, T.S.; Gomes, I.P.; Silva, J.A.; Wolf, D.F.; Alves, R.; Souza, J.R. Scheduling System for Multiple Self-driving Cars Using K-Means and Bio-inspired Optimization Algorithms. SN Comput. Sci. 2023, 4, 647.
10. Gaur, L.; Afaq, A.; Solanki, A.; Singh, G.; Sharma, S.; Jhanjhi, N.; My, H.T.; Le, D.-N. Capitalizing on big data and revolutionary 5G technology: Extracting and visualizing ratings and reviews of global chain hotels. Comput. Electr. Eng. 2021, 95, 107374.
11. Hamid, R.A.; Albahri, A.S.; Alwan, J.K.; Al-Qaysi, Z.; Albahri, O.S.; Zaidan, A.; Alnoor, A.; Alamoodi, A.H.; Zaidan, B. How smart is e-tourism? A systematic review of smart tourism recommendation system applying data management. Comput. Sci. Rev. 2021, 39, 100337.
12. Li, W.; Zhu, J.; Zhu, Q.; Zhang, J.; Han, X.; Dehbi, Y. Visual attention-guided augmented representation of geographic scenes: A case of bridge stress visualization. Int. J. Geogr. Inf. Sci. 2024, 38.
13. Ahmed, M.; Seraj, R.; Islam, S.M.S. The k-means algorithm: A comprehensive survey and performance evaluation. Electronics 2020, 9, 1295.
14. Jahwar, A.F.; Abdulazeez, A.M. Meta-heuristic algorithms for K-means clustering: A review. PalArch’s J. Archaeol. Egypt/Egyptol. 2020, 17, 12002–12020.
15. Huang, J. Design of Tourism Data Clustering Analysis Model Based on K-Means Clustering Algorithm. In International Conference on Multi-Modal Information Analytics; Springer: Cham, Switzerland, 2022; pp. 373–380.
16. Yuan, C.; Yang, H. Research on K-value selection method of K-means clustering algorithm. J 2019, 2, 226–235.
17. Ikotun, A.M.; Ezugwu, A.E.; Abualigah, L.; Abuhaija, B.; Heming, J. K-means clustering algorithms: A comprehensive review, variants analysis, and advances in the era of big data. Inf. Sci. 2023, 622, 178–210.
18. Yang, Z.; Jiang, F.; Yu, X.; Du, J. Initial Seeds Selection for K-means Clustering Based on Outlier Detection. In Proceedings of the 2022 5th International Conference on Software Engineering and Information Management (ICSIM), Yokohama, Japan, 21–23 January 2022; pp. 138–143.
19. Li, W.; Zhu, J.; Fu, L.; Zhu, Q.; Xie, Y.; Hu, Y. An augmented representation method of debris flow scenes to improve public perception. Int. J. Geogr. Inf. Sci. 2021, 35, 1521–1544.
20. Han, M. Research on optimization of K-means Algorithm Based on Spark. In Proceedings of the 2023 IEEE 6th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC), Chongqing, China, 24–26 February 2023; pp. 1829–1836.
21. Bahmani, B.; Moseley, B.; Vattani, A.; Kumar, R.; Vassilvitskii, S. Scalable k-means++. arXiv 2012, arXiv:1203.6402.
22. Chaudhary, M.; Pruthi, J.; Jain, V.K.; Suryakant. A novel squirrel search clustering algorithm for text document clustering. Int. J. Inf. Technol. 2022, 14, 3277–3286.
23. Al Shaqsi, J.; Wang, W. Robust Clustering Ensemble Algorithm. SSRN Electron. J. 2022.
24. Alzyadat, T.; Yamin, M.; Chetty, G. Genetic algorithms for the travelling salesman problem: A crossover comparison. Int. J. Inf. Technol. 2020, 12, 209–213.
25. Al-Kaseem, B.R.; Taha, Z.K.; Abdulmajeed, S.W.; Al-Raweshidy, H.S. Optimized energy–efficient path planning strategy in WSN with multiple Mobile sinks. IEEE Access 2021, 9, 82833–82847.
26. Chen, J.; Zhang, Y.; Wu, L.; You, T.; Ning, X. An adaptive clustering-based algorithm for automatic path planning of heterogeneous UAVs. IEEE Trans. Intell. Transp. Syst. 2021, 23, 16842–16853.
27. Ahmed, A.; Ju, H.; Yang, Y.; Xu, H. An Improved Unit Quaternion for Attitude Alignment and Inverse Kinematic Solution of the Robot Arm Wrist. Machines 2023, 11, 669.
28. Hu, F.; Li, Z.; Yang, C.; Jiang, Y. A graph-based approach to detecting tourist movement patterns using social media data. Cartogr. Geogr. Inf. Sci. 2019, 46, 368–382.
29. Riaz, M.; Sherani. Investigation of information sharing via multiple social media platforms: A comparison of Facebook and WeChat adoption. Qual. Quant. 2021, 55, 1751–1773.
30. Hashimy, S.Q.; Halim, T.S. The Impact of Social Media on Afghanistan’s Tourism Industry: A Roadmap for the Future in the Internet Highway. Law Soc. Policy Rev. 2023, 1, 17–50.
31. Sakas, D.P.; Reklitis, D.P.; Terzi, M.C.; Vassilakis, C. Multichannel digital marketing optimizations through Big Data Analytics in the tourism and Hospitality Industry. J. Theor. Appl. Electron. Commer. Res. 2022, 17, 1383–1408.
32. Kim, J.; Kang, Y. Automatic classification of photos by tourist attractions using deep learning model and image feature vector clustering. ISPRS Int. J. Geo-Inf. 2022, 11, 245.
33. Chen, Y.; Zheng, X.; Fang, Z.; Yu, Y.; Kuang, Z.; Huang, Y. Research on optimization of tourism route based on genetic algorithm. J. Phys. Conf. Ser. 2020, 1575, 012027.
34. Kamsing, P.; Torteeka, P.; Yooyen, S.; Yenpiem, S.; Delahaye, D.; Notry, P.; Phisannupawong, T.; Channumsin, S. Aircraft trajectory recognition via statistical analysis clustering for Suvarnabhumi International Airport. In Proceedings of the 2020 22nd International Conference on Advanced Communication Technology (ICACT), Pyeongchang, Republic of Korea, 16–19 February 2020; pp. 290–297.
35. Dadashpour Moghaddam, M.; Ahmadzadeh, H.; Valizadeh, R. A GIS-based assessment of urban tourism potential with a branding approach utilizing hybrid modeling. Spat. Inf. Res. 2022, 30, 399–416.
36. Zhou, X.; Xu, C.; Kimmons, B. Detecting tourism destinations using scalable geospatial analysis based on cloud computing platform. Comput. Environ. Urban Syst. 2015, 54, 144–153.
37. Wang, H.; Yan, J. Effects of social media tourism information quality on destination travel intention: Mediation effect of self-congruity and trust. Front. Psychol. 2022, 13, 1049149.
38. Sarkar, S.K.; George, B. Social media technologies in the tourism industry: An analysis with special reference to their role in sustainable tourism development. Int. J. Tour. Sci. 2018, 18, 269–278.
39. Tahir, M.; Tubaishat, A.; Al-Obeidat, F.; Shah, B.; Halim, Z.; Waqas, M. A novel binary chaotic genetic algorithm for feature selection and its utility in affective computing and healthcare. Neural Comput. Appl. 2020, 34, 1–22.
40. Damos, M.A.; Zhu, J.; Li, W.; Hassan, A.; Khalifa, E. A novel urban tourism path planning approach based on a multiobjective genetic algorithm. ISPRS Int. J. Geo-Inf. 2021, 10, 530.
41. Pizzuti, C.; Procopio, N. A k-means based genetic algorithm for data clustering. In Proceedings of the International Joint Conference SOCO’16-CISIS’16-ICEUTE’16, San Sebastián, Spain, 19–21 October 2016; Proceedings 11; pp. 211–222.
42. Tabianan, K.; Velu, S.; Ravi, V. K-means clustering approach for intelligent customer segmentation using customer purchase behavior data. Sustainability 2022, 14, 7243.
43. Ghezelbash, R.; Maghsoudi, A.; Shamekhi, M.; Pradhan, B.; Daviran, M. Genetic algorithm to optimize the SVM and K-means algorithms for mapping of mineral prospectivity. Neural Comput. Appl. 2023, 35, 719–733.
44. Zubair, M.; Iqbal, M.A.; Shil, A.; Chowdhury, M.; Moni, M.A.; Sarker, I.H. An improved K-means clustering algorithm towards an efficient data-driven modeling. Ann. Data Sci. 2022, 9, 1–20.
45. Daviran, M.; Ghezelbash, R.; Niknezhad, M.; Maghsoudi, A.; Ghaeminejad, H. Hybridizing K-means clustering algorithm with harmony search and artificial bee colony optimizers for intelligence mineral prospectivity mapping. Earth Sci. Inform. 2023, 16, 2143–2165.
46. Sajidha, S.; Desikan, K.; Chodnekar, S.P. Initial seed selection for mixed data using modified k-means clustering algorithm. Arab. J. Sci. Eng. 2020, 45, 2685–2703.

Figure 1. Process of recommending the best path based on survey and social media tourism data.

Figure 2. GA operations.

Figure 3. Framework for recommending tourism paths by enhancing the k-means algorithm through the GA [40].

Figure 4. Location of the Red Sea State in Sudan.

Figure 5. Tourist sites distributed within the city.

Figure 6. Groups of the internal tourism objectives based on the enhanced k-means algorithm.

Table 1. Definition of tourism objectives.

Selected Objective	Explanation and References	Evaluation Scale	Rating
Entertainment value (EN)	Entertainment value refers to the entertainment available to visitors at the tourist site.	Very high	10
		High	7
		Medium	4
		Low	1
Aesthetic and art (AA)	Aesthetics and arts include the aesthetic and artistic sensitivity of the site as well as its practical, cultural, and philosophical qualities.	Very high	10
		High	7
		Medium	4
		Low	1
Cultural–historical value (CH)	The historical and cultural value is considered to be one of the most important factors that affects why tourists flock to tourist sites.	Very high	10
		High	7
		Medium	4
		Low	1
Scientific value (SI)	The scientific value of the tourist site indicates the scientific importance of the site, such as universities and others.	Very high	10
		High	7
		Medium	4
		Low	1
Size of tourism destination (TD)	The size of the tourist destination, the height of the place, and the ability of the tourist destination to accommodate tourists.	>50 km²	10
		>10–50 km²	7
		1–10 km²	4
		<1 km²	1
Tourism seasonality (TS)	Tourism seasonality is the possibility of visiting a tourist site in a specific season of the year. Some sites, such as museums, can be visited year-round, whereas others, such as gardens, are seasonal.	>300 days/year	10
		>200–300 days/year	7
		100–200 days/year	4
		<100 days/year	1
Quality of service (QS)	Quality of service includes all services provided within tourist sites such as restaurants, cafes, shops, and others.	Very high	10
		High	7
		Medium	4
		Low	1
Time in site (TI)	This includes the time spent by the visitor inside the site, taking into account the opening and closing times of the gates.	>3	10
		>2–3	7
		>1–2	4
		0–1	1
Biodiversity (BI)	The value of biological diversity is evaluated according to the different types of endemic animals.	Very high	10
		High	7
		Medium	4
		Low	1

Table 2. Sample of online questionnaire results.

Visitor No.	EN	AA	CH	SI	TD	TS	QS	TI	BI
1	Low	Medium	Medium	Very high	3	3	Medium	7	Low
2	Medium	Low	High	Very high	4	4	Very high	5	High
3	Medium	High	Very high	High	7	5	Very high	3	Very high
4	High	Very high	Medium	Very high	5	5	Medium	10	Medium
5	Medium	Low	Very high	High	3	7	Very high	7	Low
6	High	Medium	Low	Medium	10	5	High	5	High
7	Low	Low	High	Medium	7	3	Medium	7	Medium
8	Low	High	Medium	Very high	3	7	Very high	5	Medium
9	High	Low	High	Very high	10	4	Very high	3	Very high
10	Low	Very high	Low	Medium	5	7	High	7	Medium
11	Very high	Medium	Low	Very high	5	10	Medium	10	Very high
12	Low	Medium	High	Very high	3	4	Medium	3	Very high
13	Very high	Medium	Very high	High	7	5	Very high	5	High
14	High	Very high	Medium	Very high	5	5	Low	3	Medium
15	High	Very high	Medium	Very high	10	5	High	5	High
16	Low	Medium	Medium	Low	10	5	Very high	7	Low
17	High	Very high	Medium	Very high	5	5	Very high	5	High
18	Medium	High	Very high	High	7	5	Very high	3	Very high
19	High	Very high	Medium	Very high	3	7	Very high	7	Low
20	Low	Low	High	Medium	7	3	Medium	7	Medium
21	High	Medium	Low	Medium	3	7	Very high	5	Medium
22	Low	Low	High	Medium	10	4	Very high	3	Very high
23	High	Medium	Low	Medium	7	3	Medium	7	Medium
24	Low	Low	High	Medium	3	7	Very high	5	Medium
25	High	Very high	Medium	Very high	5	5	Medium	10	Medium

Table 3. Tourism destinations in Port Sudan City.

POI	Longitude (°E)	Latitude (°N)	Name
P1	37.45045	19.72309	Sanganeb Reserve
P2	37.34297	19.11629	Othman Digna port
P3	37.33744	19.11293	Suakin city
P4	37.10517	18.76735	Arquette Resort
P5	37.10307	18.77391	Lake Arquette
P6	37.24676	19.55832	Red Sea Resort

Table 4. Matrix of tourist preferences.

POI	P1	P2	P3	P4	P5	P6
P1	0	2	4	1	3	3
P2	2	0	2	4	2	3
P3	4	3	0	1	2	3
P4	1	4	1	0	3	1
P5	3	2	2	3	0	4
P6	3	3	3	1	4	0

Table 5. Matrix of distance between tourism destinations (km).

POI	P1	P2	P3	P4	P5	P6
P1	0	67.5	68.9	112.5	111.9	29.9
P2	67.5	0	0.45	45.4	44.2	50.3
P3	68.9	0.45	0	44.11	44.4	49.9
P4	112.5	45.4	44.11	0	1.21	89.9
P5	111.9	44.2	44.4	1.21	0	88.8
P6	29.9	50.3	49.9	89.9	88.8	0
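
As a worked check on the distance matrix, the sketch below sums the Table 5 distances along the "Total distances" path reported in Table 8 and compares it with an exhaustive search over all closed tours from the same start. The helper names (`tour_length`, `shortest_closed_tour`) are hypothetical; with only six destinations, brute force is feasible and serves here purely as a reference, not as the method used in this study.

```python
from itertools import permutations

POIS = ["P1", "P2", "P3", "P4", "P5", "P6"]

# Table 5: distances between tourism destinations (km).
DIST_KM = [
    [0, 67.5, 68.9, 112.5, 111.9, 29.9],
    [67.5, 0, 0.45, 45.4, 44.2, 50.3],
    [68.9, 0.45, 0, 44.11, 44.4, 49.9],
    [112.5, 45.4, 44.11, 0, 1.21, 89.9],
    [111.9, 44.2, 44.4, 1.21, 0, 88.8],
    [29.9, 50.3, 49.9, 89.9, 88.8, 0],
]

def tour_length(path):
    """Sum the Table 5 distances along consecutive stops of a path such as 'P2-P3-...-P2'."""
    stops = [POIS.index(name) for name in path.split("-")]
    return sum(DIST_KM[a][b] for a, b in zip(stops, stops[1:]))

def shortest_closed_tour(start):
    """Enumerate every closed tour from `start` (5! = 120 candidates) and keep the shortest."""
    others = [p for p in POIS if p != start]
    candidates = ("-".join((start, *perm, start)) for perm in permutations(others))
    return min(candidates, key=tour_length)

reported = "P2-P3-P4-P5-P6-P1-P2"  # "Total distances" row of Table 8
best = shortest_closed_tour("P2")
print(reported, round(tour_length(reported), 2))
print(best, round(tour_length(best), 2))
```

Under Table 5, the reported path sums to 0.45 + 44.11 + 1.21 + 88.8 + 29.9 + 67.5 = 231.97 km.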

Table 6. Matrix of travel costs between destinations (SDG).

POI	P1	P2	P3	P4	P5	P6
P1	0	500	400	300	450	300
P2	500	0	250	350	300	700
P3	400	250	0	250	200	150
P4	300	350	250	0	150	400
P5	450	300	200	150	0	300
P6	300	700	150	400	300	0

Table 7. Parameter settings of GA.

Parameters	Values
Population size	100
Crossover probability	0.85
Mutation probability	0.10
Number of generations	4000
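
The sketch below shows how the Table 7 settings could drive a permutation-based GA for a single objective, here the Table 5 distances. Only the four parameter values come from Table 7; the operators (tournament selection of size 3, a simplified order crossover, swap mutation, elitism) and all function names are assumptions made for illustration, and the study's own GA may differ.

```python
import random

# Table 5: distances between tourism destinations (km), indices 0..5 = P1..P6.
DIST_KM = [
    [0, 67.5, 68.9, 112.5, 111.9, 29.9],
    [67.5, 0, 0.45, 45.4, 44.2, 50.3],
    [68.9, 0.45, 0, 44.11, 44.4, 49.9],
    [112.5, 45.4, 44.11, 0, 1.21, 89.9],
    [111.9, 44.2, 44.4, 1.21, 0, 88.8],
    [29.9, 50.3, 49.9, 89.9, 88.8, 0],
]

def tour_length(order):
    """Length of the closed tour that visits POI indices in `order` and returns to the start."""
    return sum(DIST_KM[a][b] for a, b in zip(order, order[1:] + order[:1]))

def order_crossover(a, b, rng):
    """Keep a slice of parent a in place; fill the other positions in parent b's order."""
    i, j = sorted(rng.sample(range(len(a)), 2))
    middle = a[i:j + 1]
    rest = [gene for gene in b if gene not in middle]
    return rest[:i] + middle + rest[i:]

def run_ga(pop_size=100, p_cross=0.85, p_mut=0.10, generations=4000, seed=0):
    """Defaults follow Table 7; everything else is an illustrative choice."""
    rng = random.Random(seed)
    n = len(DIST_KM)
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=tour_length)
        children = [population[0][:]]  # elitism: keep the current best tour
        while len(children) < pop_size:
            a = min(rng.sample(population, 3), key=tour_length)
            b = min(rng.sample(population, 3), key=tour_length)
            child = order_crossover(a, b, rng) if rng.random() < p_cross else a[:]
            if rng.random() < p_mut:  # swap two positions
                i, j = rng.sample(range(n), 2)
                child[i], child[j] = child[j], child[i]
            children.append(child)
        population = children
    best = min(population, key=tour_length)
    return best, tour_length(best)

# Fewer generations than Table 7 to keep the example fast on six destinations.
best, length = run_ga(generations=200)
print("-".join(f"P{i + 1}" for i in best + best[:1]), round(length, 2))
```

With six destinations the search space is tiny, so 4000 generations is far more than this toy instance needs; the setting matters more when the objective combines several matrices or the number of destinations grows.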

Table 8. Optimal tourism paths.

No.	Objective	Path
1	EN	P1-P5-P6-P3-P2-P4-P1
2	AA	P4-P2-P1-P4-P6-P5-P4
3	CH	P5-P2-P4-P6-P3-P1-P5
4	SI	P3-P4-P6-P1-P2-P5-P3
5	TD	P2-P4-P6-P3-P1-P5-P2
6	TS	P4-P6-P4-P1-P2-P3-P4
7	QS	P6-P1-P5-P2-P3-P4-P6
8	TI	P6-P3-P2-P4-P5-P1-P6
9	BI	P5-P3-P4-P1-P2-P6-P5
10	Tourist preferences	P1-P5-P6-P3-P2-P4-P1
11	Travel costs	P6-P5-P2-P1-P3-P4-P5
12	Total distances	P2-P3-P4-P5-P6-P1-P2
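
A simple consistency check on Table 8 is to confirm that each path is a closed tour that starts and ends at the same destination and visits each of the six destinations exactly once. The sketch below applies that check to the paths as printed; the helper name `is_closed_tour` is hypothetical, and the closed-tour requirement is an assumption about what the paths are meant to represent.

```python
POIS = {"P1", "P2", "P3", "P4", "P5", "P6"}

# Table 8, transcribed as printed.
TABLE_8 = {
    "EN": "P1-P5-P6-P3-P2-P4-P1",
    "AA": "P4-P2-P1-P4-P6-P5-P4",
    "CH": "P5-P2-P4-P6-P3-P1-P5",
    "SI": "P3-P4-P6-P1-P2-P5-P3",
    "TD": "P2-P4-P6-P3-P1-P5-P2",
    "TS": "P4-P6-P4-P1-P2-P3-P4",
    "QS": "P6-P1-P5-P2-P3-P4-P6",
    "TI": "P6-P3-P2-P4-P5-P1-P6",
    "BI": "P5-P3-P4-P1-P2-P6-P5",
    "Tourist preferences": "P1-P5-P6-P3-P2-P4-P1",
    "Travel costs": "P6-P5-P2-P1-P3-P4-P5",
    "Total distances": "P2-P3-P4-P5-P6-P1-P2",
}

def is_closed_tour(path, pois=POIS):
    """True if the path returns to its start and visits every POI exactly once."""
    stops = path.split("-")
    return (stops[0] == stops[-1]
            and len(stops) == len(pois) + 1
            and set(stops[:-1]) == pois)

flagged = [name for name, path in TABLE_8.items() if not is_closed_tour(path)]
print(flagged)  # → ['AA', 'TS', 'Travel costs']
```

As printed, the AA and TS paths revisit P4 and omit one destination (P3 and P5, respectively), and the travel-cost path starts at P6 but ends at P5. The text does not say whether these rows are intentional or typesetting errors, so they are best treated with caution.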

Table 9. Results of internal tourism objective groups of visitors using the enhanced k-means algorithm.

Group 1
POI/Objective	P1	P2	P3	P4	P5	P6
EN	138	113	101	113	122	131
AA	125	150	95	187	102	151
CH	122	120	135	121	78	88
SI	135	125	131	99	90	135
TD	80	89	106	87	91	78
TS	74	123	105	173	85	190
QS	78	111	135	109	106	136
TI	92	109	105	135	120	198
BI	145	189	143	143	153	110
Group 2
POI/Objective	P1	P2	P3	P4	P5	P6
EN	102	123	132	124	164	122
AA	168	108	132	120	160	199
CH	139	121	144	108	130	151
SI	109	125	154	90	170	176
TD	175	131	131	124	119	190
TS	123	187	89	138	118	120
QS	95	108	156	149	67	99
TI	84	125	139	120	77	98
BI	101	106	76	132	90	121
Group 3
POI/Objective	P1	P2	P3	P4	P5	P6
EN	134	153	90	129	93	106
AA	110	109	77	109	113	77
CH	167	137	87	111	156	127
SI	140	171	100	109	128	118
TD	160	126	178	160	143	98
TS	155	103	90	131	210	76
QS	123	161	124	129	186	67
TI	93	150	159	140	89	127
BI	90	135	87	98	77	145
Group 4
POI/Objective	P1	P2	P3	P4	P5	P6
EN	111	86	137	97	107	109
AA	97	86	98	96	107	86
CH	99	148	155	167	116	150
SI	104	90	114	119	98	115
TD	104	110	82	143	163	109
TS	107	117	98	77	109	62
QS	113	150	93	87	111	120
TI	190	160	107	120	109	121
BI	99	78	130	118	160	129
Group 5
POI/Objective	P1	P2	P3	P4	P5	P6
EN	115	125	140	137	114	132
AA	100	147	198	88	118	87
CH	73	74	79	93	120	84
SI	112	89	101	183	114	56
TD	81	84	103	86	84	125
TS	141	70	218	81	78	152
QS	191	70	92	126	130	178
TI	141	56	90	85	205	56
BI	165	92	164	109	120	95

Table 10. EN objective matrix.

POI	P1	P2	P3	P4	P5	P6
P1	0	25	37	25	16	7
P2	25	0	55	37	48	1
P3	37	55	0	14	57	47
P4	25	37	14	0	9	36
P5	16	48	57	9	0	13
P6	7	1	47	36	13	0

Table 11. Comparisons of experimental results.

Algorithm	Optimization Time (s)	Number of Iterations	Number of Clusters
Enhanced k-means algorithm	0.01	5	5
Traditional k-means algorithm	0.2	8	5
EM algorithm	0.27	22	5
Hierarchical cluster algorithm	0.70	9	5
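
The timing column of Table 11 can be restated as ratios relative to the GA-enhanced k-means row. The short sketch below does only that arithmetic; the dictionary keys are shorthand labels, and because the times are single values reported to two decimals, the ratios should be read as rough indications rather than precise speedups.

```python
# Optimization times from Table 11, in seconds.
TIMES_S = {
    "GA-enhanced k-means": 0.01,
    "Traditional k-means": 0.2,
    "EM": 0.27,
    "Hierarchical clustering": 0.70,
}

baseline = TIMES_S["GA-enhanced k-means"]
# Rounded to one decimal to suppress floating-point noise in the division.
ratios = {name: round(t / baseline, 1) for name, t in TIMES_S.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio}x the baseline time")
```

On these figures, the traditional k-means, EM, and hierarchical runs took about 20, 27, and 70 times as long as the GA-enhanced run, respectively.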

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
