Article

Clustering Performance Analysis Using Chaotic and Lévy Flight-Enhanced Black-Winged Kite Algorithms

Department of Computer Engineering, Faculty of Technology, Selcuk University, Konya 42130, Türkiye
* Author to whom correspondence should be addressed.
Biomimetics 2026, 11(3), 200; https://doi.org/10.3390/biomimetics11030200
Submission received: 7 January 2026 / Revised: 21 February 2026 / Accepted: 26 February 2026 / Published: 9 March 2026

Abstract

Clustering is a fundamental unsupervised learning technique used to uncover hidden patterns in unlabeled data. Although metaheuristic algorithms have demonstrated effectiveness in clustering, many suffer from premature convergence and limited population diversity. This study employs the Black-Winged Kite Algorithm (BKA) and its enhanced variants, Chaotic BKA (CBKA), Lévy Flight-based BKA (LBKA), and Chaotic Lévy BKA (CLBKA), to address these limitations in centroid-based clustering formulated as a Sum of Squared Errors (SSE) minimization problem. Chaotic logistic mapping improves search diversity and adaptability, while Lévy flight introduces long-range exploration. In addition, Cauchy-based perturbations are incorporated to enhance convergence stability. The algorithms are evaluated on sixteen UCI benchmark datasets, with 30 independent runs conducted under different population and iteration settings. Experimental results show that CLBKA consistently achieves superior clustering performance in terms of accuracy and stability. Statistical validation using the Friedman and Wilcoxon tests confirms significant performance differences, with CLBKA obtaining the lowest mean rank across configurations. The findings indicate that integrating chaotic dynamics and Lévy flight mechanisms enhances clustering robustness and optimization efficiency.

1. Introduction

Clustering has evolved from early conceptual investigations of grouping phenomena in the social and behavioral sciences into a formalized statistical and computational framework. Initially concerned with identifying latent structures within qualitative observations, clustering gradually became a central analytical tool in quantitative data analysis. This evolution led to the development of classical algorithmic paradigms such as centroid-based, hierarchical, and density-based approaches. In recent years, the focus has expanded toward density-aware and dynamic clustering models capable of handling complex, high-dimensional, and irregular data structures [1,2]. Clustering is a core unsupervised learning approach that aims to organize data samples into meaningful groups based on similarity, while simultaneously enhancing within-cluster compactness and between-cluster separability. Owing to its ability to reveal hidden structures in unlabeled data, clustering has become an essential component in various fields such as pattern recognition, machine learning, data mining, and signal processing [3,4,5]. Since clustering problems can be naturally expressed as continuous optimization tasks, metaheuristic optimization techniques have attracted increasing interest due to their effectiveness in navigating complex, nonlinear, and multimodal search spaces and avoiding poor local solutions [6]. Accordingly, a wide range of nature-inspired metaheuristic algorithms, including Genetic Algorithms (GA), Particle Swarm Optimization (PSO), Grey Wolf Optimizer (GWO), and Ant Lion Optimizer (ALO), have been successfully adapted to clustering by iteratively refining cluster centroids using population-based search strategies [7]. 
Nevertheless, despite their promising performance, many metaheuristic clustering algorithms are still prone to premature convergence, reduced population diversity, and stagnation in local optima, particularly when applied to high-dimensional or noise-contaminated datasets [8].
The Black-Winged Kite Algorithm (BKA) is a metaheuristic algorithm proposed in 2024 [9], inspired by the hovering, predatory, and migratory behaviors of the black-winged kite bird. The algorithm incorporates key behavioral mechanisms, including attack-based movement patterns, Cauchy-driven migration, and adaptive leader selection, to regulate the balance between exploration and exploitation during the search process. Although BKA has shown competitive results on standard benchmark optimization problems, it may still encounter challenges such as early convergence and limited ability to escape locally optimal regions [10]. To alleviate such shortcomings, chaotic mapping strategies have been widely employed in metaheuristic algorithms, as they exhibit sensitivity to initial conditions, ergodic behavior, and the capability to generate non-periodic and diverse search trajectories [11]. Among various chaotic systems, the logistic map has been frequently reported as an effective mechanism for maintaining population diversity and mitigating stagnation phenomena in evolutionary search processes [12]. In addition, Lévy flight-based search strategies have been increasingly integrated into metaheuristic frameworks due to their heavy-tailed step-length distributions, which enable occasional long-distance movements and significantly enhance global exploration, thereby improving the ability of optimization algorithms to escape local optima and explore broader regions of the search space [13].
Although the Black-Winged Kite Algorithm exhibits an effective balance between exploration and exploitation, its performance may deteriorate in later iterations due to premature convergence and a gradual loss of population diversity. To address these limitations, three enhanced variants of BKA have been developed. The Chaotic Black-Winged Kite Algorithm (CBKA) incorporates chaotic logistic mapping to increase search diversity, while the Lévy Flight-based Black-Winged Kite Algorithm (LBKA) introduces Lévy flight-driven transitions to strengthen global exploration. Furthermore, the hybrid Chaotic Lévy Black-Winged Kite Algorithm (CLBKA) combines both chaotic mapping and Lévy flight mechanisms to achieve a more effective and balanced exploration strategy. These enhanced BKA variants have demonstrated strong performance on standard benchmark functions as well as real-world engineering optimization problems [10]. However, despite their promising optimization capabilities, their applicability to data clustering problems has not yet been comprehensively investigated.
In this study, the BKA, CBKA, LBKA, and CLBKA are employed to address data clustering by formulating it as a global optimization problem. Unlike conventional clustering techniques, the proposed approach does not rely on prior assumptions regarding data distribution or cluster shape. Instead, cluster formation is achieved through a metaheuristic-driven search process that aims to identify optimal cluster centroids by minimizing a predefined clustering objective function. The goal of this study is to address premature convergence and limited exploration in metaheuristic-based clustering by developing chaotic and Lévy flight-enhanced variants of the Black-Winged Kite Algorithm. Unlike existing chaotic or Lévy flight-enhanced clustering algorithms, this study introduces a unified and phase-aware integration of chaotic control, Lévy flight exploration, and Cauchy mutation within the Black-Winged Kite framework, specifically tailored for centroid-based clustering.
The main contribution of this study is the adaptation of the original Black-Winged Kite Algorithm and its enhanced variants (CBKA, LBKA, and CLBKA) to the clustering domain through a centroid-based optimization framework. The remainder of this paper is organized as follows. Section 2 presents a comprehensive review of related work in metaheuristic-based clustering. Section 3 describes the methodological framework, including the original Black-Winged Kite Algorithm and its enhanced variants, CBKA, LBKA, and CLBKA, together with their mathematical foundations and integration mechanisms. The clustering formulation, benchmark datasets, statistical evaluation methods, and time complexity analysis are also detailed in this section. Section 4 provides the experimental results and discussion, including comparative performance analysis, literature-based evaluations, and statistical significance tests such as the Friedman, Nemenyi, and Wilcoxon procedures, as well as sensitivity analysis of key parameters. Finally, Section 5 concludes the paper with key findings and directions for future research.

2. Related Work

Prior to the emergence of metaheuristic clustering methods, classical algorithms such as K-means [4,14], hierarchical clustering [3], and DBSCAN [15] served as foundational tools in unsupervised learning. These approaches remain widely used due to their simplicity, computational efficiency, and interpretability, and they have played a pivotal role in numerous early clustering applications. However, despite their practical advantages, classical methods often rely on strong assumptions about data distribution, such as spherical clusters in K-means or fixed density thresholds in DBSCAN. As a result, they can face challenges when applied to noisy, non-convex, or high-dimensional datasets [16]. For example, K-means is sensitive to initial centroid selection and may converge to local minima, while DBSCAN may struggle with clusters of varying densities. These limitations have motivated the exploration of alternative clustering strategies that can provide more flexible, global search capabilities. In this context, metaheuristic optimization algorithms have gained increasing attention for their ability to overcome local traps, maintain diversity, and adapt to complex search spaces without strong assumptions about data geometry [17,18].
Building upon this motivation, metaheuristic algorithms have become a dominant paradigm in data clustering by reformulating it as a global optimization problem that minimizes intra-cluster distance while maximizing inter-cluster separation. Unlike classical clustering techniques, these approaches provide flexible search mechanisms capable of escaping local optima and reducing sensitivity to initialization. Survey studies consistently report that swarm intelligence and evolutionary algorithms such as Genetic Algorithms (GA), Particle Swarm Optimization (PSO), and Ant Colony Optimization (ACO) outperform traditional clustering approaches across diverse datasets [19,20]. Overall, recent research trends indicate a shift from classical deterministic clustering toward adaptive population-based optimization frameworks that emphasize exploration–exploitation balance and robustness against complex search landscapes.
Several metaheuristic algorithms have been successfully adapted to clustering tasks. For example, Grey Wolf Optimizer (GWO)-based clustering demonstrates competitive performance compared with k-means and fuzzy c-means [21], while history-driven Artificial Bee Colony (Hd-ABC) algorithms enhance convergence stability [22]. Similarly, Whale Optimization Algorithm (WOA), Chimp Optimization Algorithm (ChOA), Tree Seed Algorithm (TSA), and Artificial Algae Algorithm (AAA) have achieved promising clustering results in terms of accuracy and robustness [23,24,25,26,27]. Despite these advancements, many studies reveal recurring methodological limitations, including premature convergence, loss of population diversity, and performance degradation in high-dimensional or noisy datasets. These limitations have motivated the development of hybrid and enhanced metaheuristic frameworks.
Hybridization has emerged as a major research direction, where complementary search strategies are combined to improve convergence and stability. The Genetic Black Hole (GBH) algorithm, for instance, integrates global exploration with intensive local exploitation to achieve faster convergence and higher clustering accuracy [28]. Early foundational works on ACO- and ABC-based clustering established the effectiveness of cooperative swarm behaviors for centroid optimization and laid the groundwork for subsequent hybrid metaheuristic designs [29,30]. A key overarching trend in recent literature is the increasing reliance on hybrid mechanisms to address exploration–exploitation imbalance, suggesting that single-strategy algorithms may be insufficient for complex clustering landscapes.
Chaotic maps have gained significant attention as diversity-enhancing mechanisms in metaheuristic clustering. Logistic and other chaotic mappings introduce ergodic, non-periodic search behaviors that improve exploration capability and mitigate stagnation [31,32]. Studies integrating chaotic dynamics into algorithms such as BKA, PSO, Fox Optimization, Bee Colony Optimization, and ACO consistently report improvements in convergence speed and solution quality [33,34,35,36,37,38]. However, existing chaotic approaches often focus primarily on parameter perturbation rather than structural search adaptation, leaving open challenges regarding scalability and adaptive control in dynamic or high-dimensional clustering scenarios.
Another prominent enhancement strategy involves Lévy flight, which introduces long-range stochastic movements to improve global exploration. Originating from Cuckoo Search, Lévy-based search dynamics have demonstrated superior performance in escaping local optima and exploring large search spaces [39,40]. Several studies have directly applied Lévy flight-enhanced metaheuristics to data clustering: Lévy-enhanced WOA, hybrid PSO–K-means frameworks, and Black Hole-based clustering algorithms have consistently shown improved clustering accuracy, stability, and robustness compared to classical methods and their non-enhanced counterparts, with statistical validation confirming the effectiveness of Lévy flight strategies [41,42,43,44]. Applications in real-world domains further confirm the effectiveness of Lévy-based exploration strategies [45]. Nevertheless, many Lévy-based methods exhibit sensitivity to parameter settings and may introduce excessive randomness if not carefully balanced with local exploitation mechanisms.
More recently, hybrid strategies combining chaotic dynamics with Lévy flight have attracted increasing interest. Such approaches aim to integrate nonlinear adaptive control with long-range exploration, yielding improved convergence behavior in optimization and clustering tasks [46,47,48,49]. These studies suggest that combining complementary enhancement strategies can address the limitations of single-mechanism metaheuristics. Despite these advances, a systematic integration of chaotic control and Lévy-driven exploration within a unified clustering framework remains relatively underexplored, particularly for newly developed algorithms such as the Black-Winged Kite Algorithm.
Overall, the literature indicates a clear transition toward hybrid and adaptive metaheuristic clustering frameworks, accompanied by the widespread integration of chaotic and Lévy-based mechanisms to enhance exploration and population diversity. Despite these advances, persistent challenges such as premature convergence, scalability limitations, and stability issues across heterogeneous datasets remain unresolved. Motivated by these observations, this study introduces a hybrid Chaotic Lévy Black-Winged Kite Algorithm (CLBKA) to achieve a more balanced and robust clustering optimization strategy.

3. Materials and Methods

The methodological design of this study is structured to comprehensively evaluate the clustering performance of the Black-Winged Kite Algorithm (BKA) and its enhanced variants, CBKA, LBKA, and CLBKA. The overall process encompasses a series of systematic stages, including dataset acquisition, preprocessing, initial population generation, fitness evaluation via the sum of squared errors (SSE), and iterative optimization through chaos-induced and Lévy flight-based mechanisms. Each solution undergoes dynamic updates via attack and migration behaviors, with periodic leader selection to guide convergence. Finally, clustering quality is assessed using both internal (SSE) and external (Rand Index) metrics. To facilitate clarity and reproducibility, the entire methodological pipeline is schematically illustrated in Figure 1, which outlines the sequential flow of operations from data input to result visualization.
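The SSE fitness used throughout this pipeline can be sketched as follows. Each individual encodes k centroids as one flat vector, matching the centroid-based formulation described later in this section; the function name and signature are illustrative, not the paper's actual implementation:

```python
import numpy as np

def sse_fitness(candidate, data, k):
    """Decode a flat candidate vector into k centroids and compute the
    Sum of Squared Errors (SSE) between each point and its nearest centroid."""
    d = data.shape[1]
    centroids = candidate.reshape(k, d)
    # Pairwise squared Euclidean distances, shape (n_points, k)
    dists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    # Each point contributes its distance to the closest centroid
    return dists.min(axis=1).sum()
```

Minimizing this value simultaneously assigns points to their nearest centroid and tightens within-cluster compactness, which is what turns clustering into a continuous optimization task.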

3.1. Black-Winged Kite (BKA) Algorithm

The Black-Winged Kite Algorithm (BKA) is a nature-inspired metaheuristic whose design is motivated by the distinctive hovering capability and efficient hunting strategies of the black-winged kite. These biological characteristics enable the bird to effectively scan large areas, precisely identify targets, and rapidly converge toward prey, which are analogous to global exploration and convergence behaviors in optimization processes. By abstracting the kite’s movement patterns and hunting dynamics into computational rules, BKA provides an efficient mechanism for guiding candidate solutions toward optimal regions within the search space [9].

3.1.1. Attack Behavior

The black-winged kite employs a highly specialized hunting strategy characterized by stable hovering, continuous observation, and rapid target-oriented descent [50]. As illustrated in Figure 2, the kite remains suspended in midair to assess potential prey and determine the most suitable attack trajectory. Once a target is identified, the bird initiates a swift and controlled descent, capturing its prey with high precision and minimal energy expenditure. This behavior inspires the attack phase of the Black-Winged Kite Algorithm, which emphasizes accuracy and efficient movement toward promising solutions. Mathematically, the attack mechanism of BKA is modeled using the position update rule given in Equation (1), where the control parameter defined in Equation (2) regulates the step size and search intensity during the attack phase:
x_{i,j}^{t+1} = \begin{cases} x_{i,j}^{t} + n\left(1 + \sin r\right) x_{i,j}^{t}, & p < r \\ x_{i,j}^{t} + n\left(2r - 1\right) x_{i,j}^{t}, & \text{otherwise} \end{cases} \quad (1)
n = 0.05 \, e^{-2\left(t/T\right)^{2}} \quad (2)
In these equations, x_{i,j}^{t} and x_{i,j}^{t+1} denote the position of the i-th black-winged kite in the j-th dimension at iterations t and t + 1, respectively. The variable r represents a uniformly distributed random number within the interval [0, 1], while p is a predefined constant set to 0.9. The parameter T indicates the maximum number of iterations, and t corresponds to the current iteration count.
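The attack-phase update of Equations (1) and (2) can be sketched in vectorized form, assuming the population is stored as a NumPy matrix of shape (pop, dim); the function name and defaults are illustrative:

```python
import numpy as np

def attack_update(x, t, T, p=0.9, rng=np.random.default_rng()):
    """Attack-phase position update of BKA (Eqs. (1)-(2)).
    x: (pop, dim) positions; t: current iteration; T: max iterations."""
    n = 0.05 * np.exp(-2.0 * (t / T) ** 2)    # Eq. (2): step factor shrinking over iterations
    r = rng.random(x.shape)                    # uniform random numbers in [0, 1)
    hover = x + n * (1.0 + np.sin(r)) * x      # hovering branch, taken when p < r
    dive = x + n * (2.0 * r - 1.0) * x         # diving branch, otherwise
    return np.where(p < r, hover, dive)
```

Because both branches scale the current position multiplicatively, the step size contracts as t approaches T, shifting the search from exploration toward exploitation.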

3.1.2. Migration Behavior

The migration mechanism of the BKA is inspired by the collective movement behavior of black-winged kites, which enhances the algorithm’s exploration capability. In natural environments, bird migration is influenced by external factors such as climate conditions and food availability, while dominant individuals play a key role in directing the flock. As illustrated in Figure 3, these leadership-driven dynamics result in adaptive movement patterns during migration. In the BKA framework, this behavior is modeled by associating leadership with solution fitness: when a candidate solution exhibits inferior fitness compared to a randomly selected counterpart, it follows that solution; otherwise, it assumes the leadership role and guides the search process [51].
This strategy enables the dynamic selection of effective leaders, which plays a crucial role in maintaining a successful migration process. The mathematical formulation describing the migration behavior of the black-winged kite within the BKA framework is given in Equation (3), where the control parameter m is defined in Equation (4).
In this formulation, L_{j}^{t} denotes the value of the best solution in the j-th dimension at iteration t, while x_{i,j}^{t} and x_{i,j}^{t+1} represent the current and updated positions of the i-th candidate solution, respectively. The fitness values of the current candidate and a randomly selected solution are represented by F_i and F_{ri}. The term C(0,1) introduces a perturbation based on the Cauchy distribution, which enhances population diversity and prevents premature convergence, as defined in Equation (5) and simplified in Equation (6).
The parameter m , calculated using a sinusoidal function as shown in Equation (4), controls the step size of the migration movement and introduces nonlinearity into the update process. Through the combined effect of fitness-based leadership selection and Cauchy-distributed perturbations, the migration mechanism improves exploration efficiency while preserving the algorithm’s ability to converge toward promising regions of the search space.
x_{i,j}^{t+1} = \begin{cases} x_{i,j}^{t} + C(0,1)\left(x_{i,j}^{t} - L_{j}^{t}\right), & F_i < F_{ri} \\ x_{i,j}^{t} + C(0,1)\left(L_{j}^{t} - m \, x_{i,j}^{t}\right), & \text{otherwise} \end{cases} \quad (3)
m = 2 \sin\left(r + \frac{\pi}{2}\right) \quad (4)
f(x, \delta, \mu) = \frac{1}{\pi} \cdot \frac{\delta}{\delta^{2} + \left(x - \mu\right)^{2}}, \quad -\infty < x < \infty \quad (5)
When δ = 1 and μ = 0 , the expression for the Cauchy mutation is:
f(x, \delta, \mu) = \frac{1}{\pi} \cdot \frac{1}{x^{2} + 1}, \quad -\infty < x < \infty \quad (6)
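The migration update of Equations (3)-(6) can likewise be sketched in NumPy. Pairing each individual with one randomly chosen peer is one plausible reading of the F_i < F_ri comparison, and NumPy's `standard_cauchy` supplies the C(0,1) samples of Equation (6):

```python
import numpy as np

def migration_update(x, fitness, leader, rng=np.random.default_rng()):
    """Migration-phase update of BKA (Eqs. (3)-(4)) with standard
    Cauchy perturbations C(0,1) (Eqs. (5)-(6)).
    x: (pop, dim) positions; fitness: (pop,) values; leader: (dim,) best solution."""
    pop, dim = x.shape
    m = 2.0 * np.sin(rng.random((pop, 1)) + np.pi / 2.0)   # Eq. (4): step control
    cauchy = rng.standard_cauchy((pop, dim))               # heavy-tailed C(0,1) draws
    peers = rng.integers(pop, size=pop)                    # random counterpart per individual
    follow = fitness < fitness[peers]                      # condition F_i < F_ri
    x_follow = x + cauchy * (x - leader)                   # first branch of Eq. (3)
    x_lead = x + cauchy * (leader - m * x)                 # second branch of Eq. (3)
    return np.where(follow[:, None], x_follow, x_lead)
```

The heavy tails of the Cauchy draws occasionally produce large displacements, which is exactly the diversity-preserving effect the text attributes to this mechanism.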
Algorithm 1 presents the pseudocode of BKA, outlining its main steps and core operations in the optimization process [9].
Algorithm 1. Pseudo-code of BKA.
Algorithm: Black-winged kite algorithm
Input: The population size pop, maximum number of iterations T, and variable dimension dim.
Output: The best quasi-optimal solution obtained by BKA for a given optimization problem.
1. Initialization phase
2. Initialize the positions of the black-winged kites and evaluate the objective function
3. Calculate the fitness value of each black-winged kite
4. while (t < T) do
5.    Attacking behavior
6.    if p < r then
7.      x_{i,j}^{t+1} = x_{i,j}^{t} + n(1 + sin(r)) × x_{i,j}^{t}
8.    else
9.      x_{i,j}^{t+1} = x_{i,j}^{t} + n(2r − 1) × x_{i,j}^{t}
10.   end if
      Migration behavior
11.   if F_i < F_{ri} then
12.     x_{i,j}^{t+1} = x_{i,j}^{t} + C(0,1) × (x_{i,j}^{t} − L_{j}^{t})
13.   else
14.     x_{i,j}^{t+1} = x_{i,j}^{t} + C(0,1) × (L_{j}^{t} − m × x_{i,j}^{t})
15.   end if
      Select the best individual
16.   if f(x_{i}^{t+1}) < f(L^{t}) then
17.     X_best = x_{i}^{t+1}, F_best = f(x_{i}^{t+1})
18.   else
19.     X_best = L^{t}, F_best = f(L^{t})
20.   end if
21. end while
22. Return X_best and F_best
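Putting the two phases together, Algorithm 1 can be sketched as a minimal NumPy implementation. The search bounds, the greedy replacement rule, and the seed are illustrative assumptions made for a self-contained example rather than details fixed by the pseudocode:

```python
import numpy as np

def bka(objective, dim, pop=20, T=200, lb=-5.0, ub=5.0, p=0.9, seed=0):
    """Minimal BKA loop: attack phase, migration phase with Cauchy
    perturbations, and tracking of the best individual."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lb, ub, (pop, dim))
    fit = np.apply_along_axis(objective, 1, x)
    best_i = fit.argmin()
    xbest, fbest = x[best_i].copy(), fit[best_i]
    for t in range(T):
        n = 0.05 * np.exp(-2.0 * (t / T) ** 2)
        r = rng.random((pop, dim))
        # Attack behavior (Eq. (1))
        x_new = np.where(p < r,
                         x + n * (1.0 + np.sin(r)) * x,
                         x + n * (2.0 * r - 1.0) * x)
        # Migration behavior (Eq. (3)) with Cauchy perturbations
        m = 2.0 * np.sin(rng.random((pop, 1)) + np.pi / 2.0)
        c = rng.standard_cauchy((pop, dim))
        peers = rng.integers(pop, size=pop)
        follow = (fit < fit[peers])[:, None]
        x_new = np.where(follow,
                         x_new + c * (x_new - xbest),
                         x_new + c * (xbest - m * x_new))
        x_new = np.clip(x_new, lb, ub)
        fit_new = np.apply_along_axis(objective, 1, x_new)
        improved = fit_new < fit                # greedy acceptance (an assumption)
        x[improved], fit[improved] = x_new[improved], fit_new[improved]
        if fit.min() < fbest:
            best_i = fit.argmin()
            xbest, fbest = x[best_i].copy(), fit[best_i]
    return xbest, fbest
```

Running this sketch on the sphere function drives the best fitness toward zero, which is the qualitative behavior the benchmark studies in [9] report.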

3.2. Logistic Map

The logistic map is a widely studied one-dimensional discrete-time dynamical system known for exhibiting chaotic behavior, as defined in Equation (7). Although it was originally proposed to describe population growth dynamics, its simple mathematical formulation has led to extensive applications in various domains, including encryption, data security, and nonlinear system modeling [52].
x_{k+1} = a \, x_{k} \left(1 - x_{k}\right) \quad (7)
where x ∈ [0, 1] denotes the system state at iteration k, while a represents the control parameter that governs the system’s behavior. Depending on the value of a, the logistic map can exhibit a wide range of dynamical regimes, transitioning from stable fixed points to periodic oscillations and ultimately to fully chaotic behavior.
A key characteristic of the logistic map is its strong sensitivity to initial conditions, which is a fundamental property of chaotic systems. This sensitivity makes the logistic map particularly suitable for applications that require high diversity and unpredictability, such as stochastic optimization and cryptographic processes. In the context of metaheuristic optimization, logistic chaotic sequences are frequently employed to enhance exploration and exploitation by generating non-repetitive pseudo-random patterns that guide the search process. By embedding the logistic map into algorithmic parameters, the search dynamics can be adaptively adjusted, effectively reducing premature convergence and improving overall optimization performance [53].
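Equation (7) is straightforward to iterate. The sketch below generates a chaotic sequence using a = 4, the commonly used fully chaotic setting; the function name is illustrative:

```python
def logistic_sequence(x0, a=4.0, n=10):
    """Iterate the logistic map x_{k+1} = a*x_k*(1 - x_k) (Eq. (7)).
    With a = 4 and a generic x0 in (0, 1), the orbit is chaotic and
    remains confined to [0, 1]."""
    seq = []
    x = x0
    for _ in range(n):
        x = a * x * (1.0 - x)
        seq.append(x)
    return seq
```

Feeding such a sequence into an algorithm's control parameters replaces uniform random draws with deterministic but non-repetitive values, which is how the chaotic variants below modulate their search dynamics.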

3.3. Lévy Flight

Lévy flight, originating from the studies of mathematician Paul Pierre Lévy, describes a stochastic movement pattern characterized by frequent short steps interrupted by occasional long-distance jumps, governed by a heavy-tailed probability distribution [39]. This type of movement has been empirically observed in various organisms, including birds and fruit flies, whose foraging behaviors exhibit statistical characteristics consistent with Lévy flights. Such naturally occurring patterns have served as inspiration for the design of novel optimization algorithms. In general, Lévy flight is defined by step lengths drawn from a Lévy distribution, resulting in a mixture of localized movements and sporadic long-range transitions. Similar behavioral dynamics have been documented across numerous animal and insect species [54,55]. The mathematical foundations of Lévy flight, established in the early twentieth century, have enabled its later adoption as an effective exploration mechanism within modern metaheuristic optimization frameworks [56]. Lévy flights are broadly recognized as an effective mathematical model for representing the search and movement behaviors of animals and insects. By integrating long-range exploratory steps with short-range exploitative movements, this approach provides a balanced trade-off between exploration and exploitation, which is highly desirable in global optimization algorithms.
A survey of existing studies indicates that Lévy flight has been applied both in its original formulation and through various modified versions. Several adaptations, such as trimmed Lévy flight, smoothed Lévy approaches, and segmented Lévy motion, have been introduced to improve algorithmic efficiency in specific optimization problems [57]. In this study, however, the classical Lévy flight formulation is employed without modification in order to retain the inherent dynamics of the original model.
Lévy flight is classified as a non-Gaussian stochastic process exhibiting heavy-tailed properties, in which movement behavior follows a Lévy stable distribution [58]. This distribution demonstrates power-law characteristics, permitting the occurrence of infrequent yet significant jumps in step length [48]. A simplified expression of the Lévy distribution is presented in Equations (8) and (9).
S = \frac{\mu}{\left|\nu\right|^{1/k}} \quad (8)
for 1 < k ≤ 3, where μ ∼ N(0, σ_μ²) and ν ∼ N(0, σ_ν²) are random numbers obeying Gaussian distributions, and σ_μ and σ_ν satisfy the following equation:
\sigma_{\mu} = \left( \frac{\Gamma\left(1 + k\right) \sin\left(0.5 k \pi\right)}{\Gamma\left[0.5\left(1 + k\right)\right] \, 2^{0.5\left(k - 1\right)}} \right)^{1/k}, \quad \sigma_{\nu} = 1 \quad (9)
where Γ(·) denotes the Gamma function [59].
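Equations (8) and (9) translate directly into code. This sketch follows the formulas as stated here (some formulations of this Mantegna-style sampler include an additional factor of k in the denominator of σ_μ); the function name and exponent k = 1.5 are illustrative:

```python
import math
import numpy as np

def levy_step(dim, k=1.5, rng=np.random.default_rng()):
    """Draw a Lévy-flight step S = mu / |nu|^(1/k) (Eq. (8)),
    with mu ~ N(0, sigma_mu^2) and nu ~ N(0, 1) per Eq. (9)."""
    sigma_mu = (math.gamma(1.0 + k) * math.sin(0.5 * k * math.pi)
                / (math.gamma(0.5 * (1.0 + k)) * 2.0 ** (0.5 * (k - 1.0)))) ** (1.0 / k)
    mu = rng.normal(0.0, sigma_mu, dim)       # numerator samples
    nu = rng.normal(0.0, 1.0, dim)            # denominator samples (sigma_nu = 1)
    return mu / np.abs(nu) ** (1.0 / k)
```

Most draws are small, but the heavy tail of the ratio occasionally yields very large steps, producing the mix of local moves and long-range jumps described above.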

3.4. Description of the Proposed CBKA

In the proposed CBKA framework, the logistic chaotic map is embedded into the original BKA structure to enhance population diversity throughout the search process. As a representative chaotic system, the logistic map produces highly irregular and non-periodic sequences that are extremely sensitive to initial conditions. This inherent unpredictability introduces controlled variations into the search dynamics, thereby increasing diversity among candidate solutions and strengthening the algorithm’s exploratory behavior, particularly in complex and high-dimensional search spaces. In the clustering context, CBKA is adapted to optimize cluster centroids by minimizing the Sum of Squared Errors (SSE) between data points and their nearest centroids. Each individual in the population represents a set of k centroids in d-dimensional space. During each iteration, centroids are updated using chaos-modulated control parameters, while data points are assigned to the closest centroid based on Euclidean distance. This formulation transforms clustering into a continuous optimization problem suitable for metaheuristic search.
By dynamically modulating key control parameters through chaotic sequences, CBKA achieves a more adaptive balance between exploration and exploitation. This mechanism reduces the likelihood of premature convergence by continuously perturbing the search trajectory, allowing the algorithm to escape local optima while maintaining steady convergence toward promising regions. Consequently, the integration of chaotic dynamics enables CBKA to preserve convergence efficiency while improving its ability to explore the solution space effectively. Comparative evaluations on benchmark optimization problems indicate that CBKA consistently outperforms the standard BKA, confirming the positive impact of chaotic mapping on metaheuristic optimization performance [10]. Figure 4 presents the workflow of the Chaotic Black-Winged Kite Algorithm.

3.5. Description of the Proposed LBKA

The proposed Lévy Flight-based Black-Winged Kite Algorithm (LBKA) incorporates Lévy flight mechanisms into the BKA framework to reinforce global exploration during the optimization process. Lévy flight is characterized by step-length distributions that generate infrequent but significant long-distance movements interspersed with shorter steps, enabling the algorithm to traverse distant regions of the search space efficiently. This property enhances the algorithm’s ability to explore complex landscapes and reduces the risk of stagnation in locally optimal regions. Through the integration of Lévy flight dynamics, LBKA achieves a more effective trade-off between global exploration and local exploitation. The stochastic yet structured movement patterns introduced by Lévy flight guide the search toward unexplored and potentially promising areas while preventing excessive confinement around suboptimal solutions. In the clustering context, LBKA is adapted to minimize the Sum of Squared Errors (SSE) between data points and their closest centroids. Each individual encodes a set of k centroids in a d-dimensional space. At each iteration, data samples are assigned to the nearest centroid using Euclidean distance, and centroids are updated via Lévy flight-driven steps. This allows the algorithm to explore diverse clustering configurations while improving convergence stability. As a result, LBKA demonstrates improved convergence behavior, higher solution accuracy, and increased robustness compared to the standard BKA across various benchmark optimization problems. The LBKA procedure consists of population initialization, fitness evaluation, leader selection, adaptive parameter adjustment, Lévy flight-based position updates, boundary handling, fitness comparison, and diversity-preserving migration, with these steps iteratively executed until the predefined stopping criterion is satisfied [10]. The workflow of the Lévy Black-Winged Kite Algorithm (LBKA) is illustrated in Figure 5.

3.6. Description of the Proposed CLBKA

The proposed Chaotic Lévy-based Black-Winged Kite Algorithm (CLBKA) integrates Lévy flight and logistic chaotic mapping into the BKA framework to jointly enhance global exploration and local exploitation. Lévy flight enables occasional long-distance moves that facilitate escape from local optima, while the logistic chaotic map adaptively regulates control parameters through non-periodic dynamics, promoting diverse and non-repetitive search trajectories. Together, these mechanisms establish a synergistic search strategy in which Lévy flight supports wide-range exploration and chaotic control guides convergence. In addition, Cauchy-distributed perturbations applied during the migration phase further improve population diversity and mitigate premature convergence. Consequently, CLBKA demonstrates superior performance compared to the standard BKA and its single-enhanced variants, CBKA and LBKA [10]. While chaotic maps and Lévy flight have individually been employed in prior metaheuristics, CLBKA introduces a distinct hybrid integration strategy that tightly couples dynamic parameter control with conditionally triggered exploration. Unlike conventional approaches where chaotic maps are restricted to population initialization or random number generation, logistic chaos in CLBKA is applied at every iteration to modulate both attack and migration behaviors. Lévy flight is selectively activated with a fixed probability during the attack phase and conditionally re-invoked during migration when stagnation is detected relative to the population mean, thereby preserving long-range search capability without inducing excessive randomness. Together with Cauchy-based diversity enhancement, these components form a non-trivial extension of the BKA framework tailored for centroid-based clustering.
In CLBKA, each individual represents a set of k cluster centroids in a d-dimensional space. The algorithm is adapted for clustering by minimizing the Sum of Squared Errors (SSE) between data points and their nearest centroids. At each iteration, data points are assigned to the closest centroid using Euclidean distance, and centroid positions are updated to reduce intra-cluster distances. This formulation transforms clustering into a continuous optimization task, enabling CLBKA to efficiently search for optimal clustering configurations.
To clarify the joint integration of chaotic mapping, Lévy flight, and Cauchy mutation, their phase-specific roles within the optimization process are summarized as follows:
Chaotic Logistic Maps are used at each iteration to modulate control parameters such as r, which affect both attack and migration behaviors. Their deterministic yet sensitive nature ensures dynamic variation, preventing stagnation and cyclic search patterns.
Lévy Flight is selectively employed during the attack phase with a fixed probability and conditionally re-triggered during migration under stagnation, facilitating escape from local optima through long-range exploration.
Cauchy Mutation operates in the migration phase, introducing localized high-kurtosis perturbations that enhance population diversity while maintaining convergence stability.
These mechanisms are mathematically complementary: chaos provides adaptive non-repetitive control, Lévy flight supports probabilistic global exploration, and Cauchy mutation ensures localized stochastic diversity. Their coordinated integration yields a tiered stochastic architecture that balances exploration, exploitation, and diversity, resulting in improved convergence stability and superior clustering performance across diverse datasets. The pseudocode of the proposed CLBKA is provided in Algorithm 2.
Algorithm 2. Pseudo-code of CLBKA [10].
Algorithm: BKA with Lévy Flight and Chaotic Map

1. Initialize positions X and evaluate fitness
2. Set Lévy parameter β and chaos parameter r
3. For t = 1 to T do
   - Update r using the chaotic logistic map
   - For each kite i:
     - Compute noise n
     - Attack phase:
       If p < r:
         If rand < 0.5: apply a Lévy flight step
         Else: X_i = X_i + n × (1 + sin(r)) × X_i
       Else:
         X_i = X_i + n × (2 × rand - 1) × X_i
     - Apply bounds and update X_i if the fitness improves
     - Migration phase:
       If mod(t, 20) == 0 and X_i is worse than the population mean: apply a Lévy jump
       Else: move X_i using Cauchy noise
     - Apply bounds and update X_i if the fitness improves
   - Update the global best
4. Return the best position and fitness
Figure 6 presents the flowchart of the proposed Chaotic Lévy-based Black-Winged Kite Algorithm (CLBKA), outlining its main stages. The diagram depicts the overall algorithmic structure, covering parameter initialization, population creation, fitness assessment, chaotic updates, migration and attack phases, Lévy flight application, leader updating, and the iterative optimization procedure.
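The two stochastic ingredients beyond Lévy flight, logistic chaos and Cauchy perturbation, can be sketched compactly. The logistic coefficient μ = 4 and the perturbation scale below are illustrative assumptions, while the initial chaos value x0 = 0.7 follows the hyperparameter settings in Table 2:

```python
import math
import random

def logistic_map(x, mu=4.0):
    """One logistic-map step; mu = 4 gives fully chaotic dynamics on (0, 1)."""
    return mu * x * (1 - x)

def cauchy_perturb(position, scale=0.1):
    """Heavy-tailed Cauchy perturbation per coordinate (inverse-CDF sampling)."""
    return [p + scale * math.tan(math.pi * (random.random() - 0.5)) for p in position]

random.seed(1)
r = 0.7  # initial chaos value x0 (Table 2)
trajectory = []
for _ in range(100):
    r = logistic_map(r)  # non-periodic control value modulating attack/migration
    trajectory.append(r)

# Migration-phase diversity: jitter a candidate centroid with Cauchy noise
perturbed = cauchy_perturb([0.5, 0.5, 0.5])
```

The deterministic but non-periodic trajectory of `r` is what replaces a plain uniform random draw in the control parameters, while the Cauchy perturbation injects occasional larger moves during migration without the extreme tail of a full Lévy jump.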

3.7. Clustering Problem

Clustering methods aim to divide datasets with unknown class labels into meaningful subgroups by grouping data samples that share similar characteristics [42]. The fundamental objective of clustering is to form clusters in which data points within the same group exhibit high similarity, while data points belonging to different groups are well separated. Accordingly, an effective clustering solution seeks to minimize intra-cluster distances while simultaneously maximizing inter-cluster separation [60,61].
Clustering is typically applied in scenarios where prior information about the underlying structure of the dataset is unavailable. Given a set of n observations drawn from a population, each observation is treated as a data instance characterized by a set of variables. During the clustering process, data instances with comparable properties are assigned to the same cluster, enabling the aggregation of observations while preserving essential information content. This process facilitates the discovery of inherent patterns within the data with minimal information loss [62].
In this study, the clustering task is formulated as a centroid-based optimization problem, where the objective is to minimize the Sum of Squared Errors (SSE), a widely adopted criterion in unsupervised learning. The objective function is defined as:
$$\mathrm{SSE} = \sum_{i=1}^{k} \sum_{x_j \in C_i} \left\lVert x_j - \mu_i \right\rVert^{2}$$
where C_i represents the i-th cluster, μ_i is the centroid of cluster C_i, and x_j is a data point assigned to that cluster. The norm ‖·‖ denotes the Euclidean distance, selected for its simplicity, interpretability, and widespread use in the clustering literature. While squared distances are accumulated in the algorithm, the implementation also applies a square root at the final step to improve numerical stability during optimization. No feature standardization or normalization was applied prior to the optimization process; the clustering algorithm operates directly on the original feature values of each dataset. Distance calculations therefore reflect the inherent scale and distribution characteristics of the datasets used in the experiments.
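Each candidate solution is a flat vector of k·d centroid coordinates, so the objective reduces to decoding that vector and summing nearest-centroid squared distances. A minimal plain-Python sketch of this fitness function (illustrative, not the authors' code):

```python
def sse(flat_centroids, data, k):
    """Sum of squared Euclidean errors for a candidate encoding k centroids.

    `flat_centroids` is a length k*d vector, as used by population-based search.
    """
    d = len(data[0])
    centroids = [flat_centroids[i * d:(i + 1) * d] for i in range(k)]
    total = 0.0
    for x in data:
        # assign the point to its nearest centroid (squared Euclidean distance)
        total += min(sum((xj - cj) ** 2 for xj, cj in zip(x, c)) for c in centroids)
    return total

# Two well-separated toy clusters; the candidate places both centroids
# exactly on the cluster means, so the SSE is small.
data = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
candidate = [0.05, 0.0, 5.05, 5.0]
fitness = sse(candidate, data, 2)  # 0.01
```

Any BKA variant then simply treats `sse` as the black-box fitness to minimize over the 2k-dimensional (here 4-dimensional) search space.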

3.8. Dataset

In this study, sixteen benchmark datasets for classification and clustering were selected from the UCI (University of California, Irvine) Machine Learning Repository [63]. These datasets were chosen to evaluate the performance of the proposed methods under varying conditions, including different numbers of features, cluster centers, and data instances. For each dataset, the number of clusters (k) was set equal to the number of distinct class labels, as commonly adopted in benchmarking studies using labeled UCI datasets. While this setting enables a direct comparison with ground-truth labels, it is acknowledged that in real-world unsupervised scenarios the true number of clusters is typically unknown. Estimating k therefore constitutes an important extension of clustering algorithms. Future research may explore incorporating automatic k estimation into the CLBKA framework, either through internal validation indices or by treating k as an optimization variable within the metaheuristic search process. Prior to clustering, a uniform preprocessing procedure was applied across all datasets. Missing values, when present, were handled using mean imputation based on the corresponding feature column. No additional normalization or standardization was applied, and all datasets were processed in their original numerical form. The same preprocessing protocol was consistently used for all datasets to ensure methodological fairness and reproducibility. An overview of the datasets and their key characteristics is provided in Table 1.

3.9. Friedman Test

The Friedman test is a non-parametric statistical method commonly employed to compare multiple related samples, especially when parametric assumptions such as normality are violated [64]. In optimization research, it is frequently used to assess the comparative performance of several algorithms over multiple benchmark problems by ranking their results. Based on these rankings, the test determines whether statistically significant performance differences exist among the algorithms. Owing to its suitability for dependent samples and its distribution-free nature, the Friedman test has become a standard tool for algorithm evaluation in benchmark-based studies [65]. To statistically validate the comparative performance of the algorithms across multiple datasets, the Friedman test was employed. The null hypothesis (H0) states that all algorithms perform equivalently and therefore have equal median ranks across the considered benchmark datasets. The alternative hypothesis (H1) states that at least one algorithm performs significantly differently. For each dataset, the algorithms were ranked according to their performance, measured in terms of Sum of Squared Errors (SSE) and Rand Index (RI), where rank 1 was assigned to the best-performing algorithm and higher ranks to inferior ones. In cases of ties, average ranks were assigned. The mean rank of each algorithm across all datasets was then computed and used in the Friedman test statistic. The significance level was set to α = 0.05. If the Friedman test indicated statistically significant differences, post hoc pairwise comparisons were conducted using the Nemenyi test to identify which algorithms differed significantly.
The Friedman test was preferred over parametric alternatives because the distributional assumptions of normality and homoscedasticity cannot be guaranteed for performance metrics across heterogeneous benchmark datasets. Compared to other non-parametric alternatives such as the Quade test, the Friedman test is widely adopted in multi-dataset algorithm comparison studies due to its robustness and interpretability.
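The ranking procedure and test statistic described above can be made explicit in a few lines. In practice a library routine such as `scipy.stats.friedmanchisquare` would be used; the plain-Python version below just exposes the computation (average ranks on ties, then the Friedman chi-square):

```python
def mean_ranks(results):
    """results[i][j]: score of algorithm j on dataset i (lower is better).

    Returns each algorithm's mean rank, assigning average ranks on ties."""
    n, k = len(results), len(results[0])
    totals = [0.0] * k
    for row in results:
        order = sorted(range(k), key=lambda j: row[j])
        ranks = [0.0] * k
        j = 0
        while j < k:
            # group tied values (contiguous after sorting) -> average rank
            tied = [m for m in range(k) if row[m] == row[order[j]]]
            avg = sum(range(j + 1, j + 1 + len(tied))) / len(tied)
            for m in tied:
                ranks[m] = avg
            j += len(tied)
        for m in range(k):
            totals[m] += ranks[m]
    return [t / n for t in totals]

def friedman_statistic(results):
    """Friedman chi-square over n datasets and k algorithms."""
    n, k = len(results), len(results[0])
    r = mean_ranks(results)
    return 12 * n / (k * (k + 1)) * sum(x * x for x in r) - 3 * n * (k + 1)

# Toy example: 6 datasets, 4 algorithms; the last column always wins
results = [[4.0, 3.0, 2.0, 1.0]] * 6
ranks = mean_ranks(results)         # [4.0, 3.0, 2.0, 1.0]
chi2 = friedman_statistic(results)  # 18.0; compared against chi-square with k-1 df
```

The resulting statistic is referred to a chi-square distribution with k − 1 degrees of freedom to obtain the p-value compared against α = 0.05.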

3.10. Wilcoxon Signed-Rank Test

The Wilcoxon signed-rank test is a well-established non-parametric method used to assess differences between two dependent samples, such as paired observations or repeated measurements taken from the same population. In contrast to the paired t-test, this approach does not rely on the assumption of normally distributed data, which makes it particularly appropriate when the underlying distribution is unknown or deviates from normality. For this reason, it is frequently employed in optimization and metaheuristic algorithm studies to perform pairwise comparisons of algorithmic performance across multiple benchmark functions, where outcomes are typically obtained from several independent executions [66,67].
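The test statistic itself is straightforward to compute from paired results; a library call such as `scipy.stats.wilcoxon` additionally supplies the p-value. The sketch below computes only the signed-rank statistic W (minimum of the positive and negative rank sums, zero differences discarded), as one illustrative convention:

```python
def wilcoxon_w(a, b):
    """Wilcoxon signed-rank statistic for paired samples a, b.

    Returns min(W+, W-); zero differences are discarded, ties get average ranks."""
    diffs = [x - y for x, y in zip(a, b) if x != y]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(diffs):
        # group equal absolute differences and assign their average rank
        tied = [j for j in range(len(diffs)) if abs(diffs[j]) == abs(diffs[order[i]])]
        avg = sum(range(i + 1, i + 1 + len(tied))) / len(tied)
        for j in tied:
            ranks[j] = avg
        i += len(tied)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus)

# Toy paired comparison (in the study, the pairs would be per-run SSE
# values of two algorithms on the same dataset)
w = wilcoxon_w([12, 10, 15, 14, 9], [10, 9, 11, 10, 13])  # 4.0
```

A small W relative to its critical value (or a p-value below α = 0.05) indicates that the paired performance difference between two algorithms is statistically significant.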

3.11. Time Complexity Analysis

This section presents a theoretical analysis of the computational complexity of the proposed Black-Winged Kite Algorithm (BKA) and its enhanced variants (CBKA, LBKA, and CLBKA), in order to substantiate the claims regarding computational efficiency. Let the following notations be defined:
n: Population size (number of candidate solutions);
d: Dimensionality of the dataset (number of features);
k: Number of clusters;
T: Maximum number of iterations.
In centroid-based metaheuristic clustering frameworks, the dominant computational cost arises from the fitness evaluation process, which requires assigning data samples to cluster centroids and computing the clustering objective function, typically the sum of squared errors (SSE). For each candidate solution, the assignment of data points to the nearest cluster centroid involves distance calculations with a time complexity proportional to O(kd). Consequently, the evaluation of the objective function for a single candidate solution also scales as O(kd). During each iteration, every search agent undergoes a sequence of operations including position updates (attack and migration behaviors), fitness evaluation, and leader comparison. Therefore, the overall computational cost per iteration across the entire population can be expressed as:
O(nkd)
Over T iterations, the total time complexity of the algorithm becomes:
O(nkdT)
The proposed enhancements introduced in CBKA, LBKA, and CLBKA, namely chaotic logistic mapping and Lévy flight mechanisms, primarily consist of stochastic perturbations, random number generation, and elementary mathematical operations. These operations are executed in constant time and do not introduce additional nested loops or population-level evaluations. As a result, they do not alter the asymptotic order of the computational complexity. Accordingly, despite incorporating additional exploration mechanisms, the CLBKA preserves the same theoretical time complexity as the standard BKA. The improvements in clustering performance achieved by CLBKA are therefore obtained without increasing the algorithm’s asymptotic computational burden, indicating a favorable trade-off between solution quality and computational cost.

4. Result and Discussion

4.1. Comparative Analysis of BKA, CBKA, LBKA, and CLBKA

In this study, the clustering performance of four BKA-based approaches (BKA, LBKA, CBKA, and CLBKA) was compared under different combinations of population size (P = 30, P = 50) and iteration number (T = 500, T = 1000). An examination of the average objective function values and rankings reported in the tables shows that CLBKA achieves the best or equally good performance across all datasets. This finding indicates that CLBKA establishes a more effective balance between exploration and exploitation during the search process, thereby reducing the likelihood of becoming trapped in local minima. In contrast, the classical BKA tends to remain at higher objective function values for most datasets, while LBKA and CBKA provide a noticeable improvement over BKA but still lag behind CLBKA.
This superiority of CLBKA is most evident under limited resource settings (P = 30, T = 500), where both computational budget and population diversity are constrained. The integration of Lévy flight and chaotic logistic mapping equips CLBKA with two complementary mechanisms: Lévy-based jumps allow the algorithm to break free from local optima through long-distance exploration, while chaos-driven updates introduce controlled diversity, preventing premature convergence. These characteristics are especially critical in complex datasets such as Glass and Btissue, where high-dimensional features and class overlap commonly hinder traditional metaheuristics. Indeed, CLBKA not only yields the lowest average SSE on these datasets but also maintains significantly lower standard deviations, confirming its stability across runs. Moreover, the performance gap between CBKA/LBKA and CLBKA highlights the nonlinear synergy achieved by combining chaos and Lévy dynamics within a single framework. CBKA benefits from added diversity but suffers from limited reach, while LBKA explores widely but lacks adaptive control. CLBKA successfully unifies these strengths, allowing it both to explore the global search space and to converge efficiently once promising regions are identified. The results from the Diabetes and Parkinson datasets, where overlapping clusters pose difficulty, further support this, as CLBKA remains robust where the other algorithms degrade.
Overall, CLBKA’s performance is not only numerically superior but also structurally justified, demonstrating that thoughtful hybridization of metaheuristic strategies can yield both accuracy and consistency, especially in low-budget clustering scenarios.
To ensure reproducibility and enable robust statistical evaluation, all algorithms were independently executed 30 times on each dataset. For each algorithm–dataset pair, the best, worst, mean, and standard deviation values of the clustering objective function were computed. The hyperparameter settings used in all experiments are provided in Table 2, and the results are summarized in Table 3, Table 4, Table 5, Table 6.
Table 2. Hyperparameter Settings Used in the Experiments of proposed methods.
Parameter | Value(s) Used
Population Size (P) | 30, 50
Iteration Number (T) | 500, 1000
Independent Runs | 30
Lévy Flight Parameter (β) | 1.5
Logistic Map Type | Logistic Map
Initial Chaos Value (x0) | 0.7
Probability Parameter (p) | 0.9
Migration Coefficient (m) | 2 × sin(r + π/2)
Table 3. Comparative Clustering Results: BKA, LBKA, CBKA, and CLBKA (P = 30, T = 500).
P = 30, T = 500 (B = best, W = worst, A = average, S = standard deviation over 30 runs)
Dataset | Metric | BKA | LBKA | CBKA | CLBKA
Balance | B | 1434.10 | 1423.86 | 1423.85 | 1423.84
| W | 1456.28 | 1423.96 | 1423.97 | 1423.88
| A | 1447.66 | 1423.89 | 1423.88 | 1423.87
| S | 5.77632 | 0.02325 | 0.02367 | 0.01189
| Rank | 4 | 3 | 2 | 1
Credit | B | 563695 | 556747 | 556753 | 556746
| W | 606112 | 557209 | 557211 | 557159
| A | 583138 | 556953 | 556982 | 556834
| S | 11354.3 | 197.967 | 214.150 | 130.690
| Rank | 4 | 2 | 3 | 1
Dermatology | B | 3088.18 | 2812.45 | 2793.26 | 2773.17
| W | 3243.33 | 2912.23 | 2934.34 | 2860.30
| A | 3182.72 | 2861.85 | 2860.77 | 2838.48
| S | 33.0327 | 26.9815 | 31.8901 | 22.9783
| Rank | 4 | 3 | 2 | 1
E. coli | B | 104.354 | 68.9967 | 68.5131 | 69.8495
| W | 123.874 | 74.7430 | 74.8069 | 73.9756
| A | 115.568 | 72.2632 | 72.5915 | 72.1794
| S | 5.30365 | 1.76415 | 1.60913 | 1.44045
| Rank | 4 | 2 | 3 | 1
Glass | B | 418.437 | 294.017 | 283.512 | 281.92
| W | 491.441 | 332.000 | 326.243 | 308.887
| A | 457.13 | 308.743 | 305.671 | 299.149
| S | 19.528 | 9.03923 | 11.5815 | 6.8524
| Rank | 4 | 3 | 2 | 1
Iris | B | 129.504 | 96.694 | 96.7011 | 96.7043
| W | 170.6 | 97.164 | 97.5998 | 96.8414
| A | 148.19 | 96.8274 | 96.8728 | 96.7764
| S | 8.9912 | 0.116003 | 0.226534 | 0.042248
| Rank | 4 | 2 | 3 | 1
Thyroid | B | 2493.49 | 1874.54 | 1878.42 | 1875.11
| W | 2878.4 | 1967.05 | 2000.6 | 1922.1
| A | 2691.15 | 1912.56 | 1908.09 | 1899.24
| S | 105.013 | 26.6018 | 23.8593 | 12.5892
| Rank | 4 | 3 | 2 | 1
Wine | B | 16484 | 16315.3 | 16310.6 | 16316
| W | 17214.8 | 16343.6 | 16344.5 | 16329.7
| A | 16811.5 | 16326 | 16326.2 | 16324.5
| S | 184.079 | 6.36058 | 7.39038 | 3.86568
| Rank | 4 | 2 | 3 | 1
Heart | B | 9819.98 | 9442.35 | 9442.45 | 9441.81
| W | 10791.4 | 9446.31 | 9445.42 | 9443.94
| A | 10314.4 | 9443.84 | 9443.74 | 9443.25
| S | 263.779 | 0.894661 | 0.739588 | 0.466652
| Rank | 4 | 3 | 2 | 1
Spect | B | 593.746 | 555.296 | 554.817 | 555.056
| W | 636.271 | 561.334 | 561.966 | 557.343
| A | 620.493 | 557.417 | 558.056 | 556.296
| S | 11.5109 | 1.3916 | 1.61934 | 0.70751
| Rank | 4 | 2 | 3 | 1
Diabets | B | 73433.6 | 72107.2 | 72107.2 | 72107.2
| W | 81335.4 | 72186.1 | 74100.5 | 72107.3
| A | 75681.3 | 72109.9 | 72173.7 | 72107.2
| S | 1879.75 | 14.4001 | 363.91 | 0.01464
| Rank | 4 | 2 | 3 | 1
Hepatit | B | 9792.76 | 9442.77 | 9442.2 | 9441.99
| W | 11014.9 | 9446.02 | 9445.64 | 9443.95
| A | 10384.3 | 9443.82 | 9443.8 | 9443.31
| S | 314.844 | 0.79544 | 0.76853 | 0.46028
| Rank | 4 | 3 | 2 | 1
Btissue | B | 198575 | 130817 | 128806 | 129383
| W | 274666 | 151215 | 152176 | 140714
| A | 239229 | 140130 | 137016 | 135735
| S | 17800.9 | 5799.35 | 5802.75 | 3439.14
| Rank | 4 | 3 | 2 | 1
Parkinson | B | 16466.5 | 16463 | 16463 | 16462.9
| W | 16732.8 | 16463.1 | 16463.2 | 16463
| A | 16547.5 | 16463 | 16463 | 16463
| S | 65.1892 | 0.04143 | 0.04846 | 0.02922
| Rank | 2 | 1 | 1 | 1
Somerville | B | 302.765 | 280.534 | 280.529 | 280.528
| W | 327.903 | 280.642 | 280.669 | 280.569
| A | 318.476 | 280.567 | 280.579 | 280.553
| S | 5.72583 | 0.02164 | 0.03257 | 0.01052
| Rank | 4 | 2 | 3 | 1
User Modeling | B | 108.83 | 97.4787 | 97.581 | 97.4115
| W | 118.593 | 99.3549 | 100.63 | 98.9282
| A | 113.159 | 98.2043 | 99.0395 | 98.1158
| S | 2.35814 | 0.44138 | 0.93239 | 0.48966
| Rank | 4 | 2 | 3 | 1
As shown in Table 4, increasing the number of iterations to T = 1000 while keeping P = 30 improves the clustering performance of all methods; however, this improvement is markedly less pronounced for CLBKA. This indicates that CLBKA is able to reach high-quality solutions at earlier stages of the search, whereas BKA, LBKA, and CBKA require extended iterations to compensate for slower convergence. The early saturation behavior of CLBKA can be attributed to the combined effect of chaotic parameter control and Lévy flight exploration, which enables efficient global search in the early iterations while rapidly refining promising regions. The reduced performance gap at higher iteration budgets suggests that prolonged search primarily benefits algorithms lacking strong diversification mechanisms. In contrast, CLBKA maintains both low SSE values and low variance across datasets with different structural characteristics, such as Glass and User Modeling, indicating stable convergence behavior. These results confirm that the hybrid design of CLBKA not only enhances exploration but also reduces sensitivity to iteration count, making it effective under both limited and extended computational budgets.
Table 4. Comparative Clustering Results: BKA, LBKA, CBKA, and CLBKA (P = 30, T = 1000).
P = 30, T = 1000 (B = best, W = worst, A = average, S = standard deviation over 30 runs)
Dataset | Metric | BKA | LBKA | CBKA | CLBKA
Balance | B | 1434.38 | 1423.84 | 1423.84 | 1423.84
| W | 1454.41 | 1423.88 | 1423.88 | 1423.86
| A | 1443.92 | 1423.86 | 1423.85 | 1423.84
| S | 4.45921 | 0.01023 | 0.00822 | 0.00544
| Rank | 4 | 3 | 2 | 1
Credit | B | 566003 | 556740 | 556747 | 556742
| W | 599243 | 557158 | 594453 | 556977
| A | 579439 | 556835 | 558203 | 556781
| S | 9678.94 | 149.216 | 849.515 | 6.1696
| Rank | 4 | 2 | 3 | 1
Dermatology | B | 3059.21 | 2660.7 | 2638.85 | 2625.25
| W | 3239.09 | 2733.97 | 2746.20 | 2704.36
| A | 3158.23 | 2694.52 | 2703.09 | 2671.65
| S | 36.1588 | 18.6159 | 21.5782 | 20.2732
| Rank | 4 | 2 | 3 | 1
E. coli | B | 101.197 | 68.609 | 65.9677 | 67.2796
| W | 122.595 | 73.1475 | 72.9009 | 70.6152
| A | 112.162 | 70.1578 | 70.2365 | 69.6348
| S | 5.06733 | 1.10130 | 1.48482 | 0.69459
| Rank | 4 | 2 | 3 | 1
Glass | B | 366.561 | 275.199 | 270.708 | 276.05
| W | 481.605 | 309.463 | 322.238 | 290.102
| A | 440.211 | 289.142 | 293.247 | 283.884
| S | 26.6856 | 9.33814 | 13.1209 | 4.05847
| Rank | 4 | 2 | 3 | 1
Iris | B | 120.712 | 96.6762 | 96.6927 | 96.6689
| W | 158.881 | 96.6762 | 96.7691 | 96.7143
| A | 143.953 | 96.7076 | 96.7177 | 96.6996
| S | 7.54585 | 0.01884 | 0.01996 | 0.00935
| Rank | 4 | 2 | 3 | 1
Thyroid | B | 2197.45 | 1868.05 | 1868.88 | 1868.35
| W | 2864.43 | 1902.06 | 1912.14 | 1890.24
| A | 2625.51 | 1881.37 | 1882.25 | 1873.42
| S | 127.693 | 11.036 | 13.6823 | 4.44119
| Rank | 4 | 2 | 3 | 1
Wine | B | 16533.7 | 16311.3 | 16309.9 | 16307.8
| W | 16971.8 | 16336.4 | 16328.4 | 16318.7
| A | 16705.4 | 16318.1 | 16316.8 | 16314.6
| S | 104.974 | 6.3339 | 5.20692 | 3.02425
| Rank | 4 | 3 | 2 | 1
Heart | B | 9765.08 | 9441.12 | 9441.36 | 9441.24
| W | 10898.4 | 9443.55 | 9443.79 | 9442.23
| A | 10306.5 | 9442.18 | 9442.22 | 9441.82
| S | 307.755 | 0.51988 | 0.55174 | 0.23585
| Rank | 4 | 2 | 3 | 1
Spect | B | 601.502 | 554.546 | 554.546 | 554.546
| W | 632.728 | 554.553 | 554.565 | 554.548
| A | 612.34 | 554.549 | 554.549 | 554.547
| S | 7.55607 | 0.00160 | 0.003451 | 0.000612
| Rank | 3 | 2 | 2 | 1
Diabets | B | 72696.8 | 72107.2 | 72107.2 | 72107.2
| W | 79053.1 | 72107.2 | 72186.1 | 72107.2
| A | 74839.6 | 72107.2 | 72109.9 | 72107.2
| S | 1536 | 0.000428 | 14.4065 | 5.05836 × 10−5
| Rank | 3 | 1 | 2 | 1
Hepatit | B | 9822.05 | 9441.4 | 9441.6 | 9440.88
| W | 10823.9 | 9442.98 | 9443.43 | 9442.23
| A | 10204.5 | 9442.18 | 9442.29 | 9441.85
| S | 251.655 | 0.41334 | 0.46952 | 0.32672
| Rank | 4 | 2 | 3 | 1
Btissue | B | 197898 | 127761 | 126541 | 127535
| W | 263980 | 140316 | 149688 | 132089
| A | 228515 | 131256 | 132632 | 129099
| S | 14765.6 | 3028.01 | 5987.26 | 1253.76
| Rank | 4 | 2 | 3 | 1
Parkinson | B | 16480.7 | 16462.9 | 16462.9 | 16462.9
| W | 16693.3 | 16463.1 | 16463.1 | 16463
| A | 16530.4 | 16463 | 16463 | 16463
| S | 45.3018 | 0.0469 | 0.0495 | 0.0232
| Rank | 2 | 1 | 1 | 1
Somerville | B | 290.487 | 280.526 | 280.52 | 280.516
| W | 325.498 | 280.555 | 280.574 | 280.537
| A | 313.688 | 280.537 | 280.542 | 280.53
| S | 8.51044 | 0.008492 | 0.011889 | 0.00506
| Rank | 4 | 2 | 3 | 1
User Modeling | B | 106.622 | 97.3557 | 97.3582 | 97.3519
| W | 115.721 | 97.5257 | 98.3507 | 97.3872
| A | 111.921 | 97.3761 | 97.4746 | 97.3697
| S | 2.13493 | 0.030232 | 0.23340 | 0.00866
| Rank | 4 | 2 | 3 | 1
As seen in Table 5, increasing the population size to P = 50 while keeping T = 500 improves the performance of all methods due to greater solution diversity and wider search coverage. However, CLBKA maintains its leading position, delivering the lowest average SSE values and smallest standard deviations across most datasets. This indicates that CLBKA’s hybrid design enables it to leverage a smaller population more effectively, making additional population size less critical for its convergence behavior. In contrast, BKA, LBKA, and CBKA benefit more from population growth, as they rely on a larger swarm to escape local minima and enhance search stability. Yet even with this advantage, they fall short of CLBKA’s results, highlighting that the integration of chaotic perturbations and Lévy-based jumps provides a more powerful mechanism for maintaining exploration without relying solely on population size. Notably, CLBKA performs best even in challenging datasets like Btissue and Glass, where high dimensionality and complex cluster structures typically degrade algorithm performance.
Table 5. Comparative Clustering Results: BKA, LBKA, CBKA, and CLBKA (P = 50, T = 500).
P = 50, T = 500 (B = best, W = worst, A = average, S = standard deviation over 30 runs)
Dataset | Metric | BKA | LBKA | CBKA | CLBKA
Balance | B | 1436.41 | 1423.84 | 1423.85 | 1423.84
| W | 1454.58 | 1423.91 | 1423.91 | 1423.88
| A | 1444.36 | 1423.88 | 1423.88 | 1423.86
| S | 4.7041 | 0.0157778 | 0.0135112 | 0.00911
| Rank | 3 | 2 | 2 | 1
Credit | B | 565283 | 556745 | 556748 | 556745
| W | 597568 | 557209 | 557210 | 556806
| A | 581213 | 556945 | 556979 | 556778
| S | 9879.75 | 198.885 | 204.014 | 25.6746
| Rank | 4 | 2 | 3 | 1
Dermatology | B | 3027.8 | 2769.86 | 2790.57 | 2759.27
| W | 3210.58 | 2905.45 | 2900.54 | 2856.32
| A | 3157.56 | 2852.73 | 2851 | 2829.63
| S | 43.4983 | 28.4162 | 25.1494 | 25.4440
| Rank | 4 | 3 | 2 | 1
E. coli | B | 100.886 | 68.2585 | 69.2266 | 68.6975
| W | 117.858 | 74.2742 | 74.8621 | 72.6633
| A | 111.807 | 71.8115 | 72.1771 | 71.2404
| S | 5.03679 | 1.51358 | 1.68655 | 0.961163
| Rank | 4 | 2 | 3 | 1
Glass | B | 358.529 | 275.855 | 280.364 | 271.63
| W | 502.566 | 327.764 | 340.983 | 298.615
| A | 436.177 | 301.786 | 301.395 | 290.056
| S | 35.6932 | 12.7288 | 12.3373 | 6.50043
| Rank | 4 | 3 | 2 | 1
Iris | B | 125.972 | 96.7071 | 96.7029 | 96.7166
| W | 164.667 | 96.87 | 96.96 | 96.7779
| A | 143.97 | 96.7676 | 96.7794 | 96.7458
| S | 8.37963 | 0.03711 | 0.06418 | 0.01758
| Rank | 4 | 2 | 3 | 1
Thyroid | B | 2450.23 | 1871.36 | 1874.99 | 1871.41
| W | 2931.19 | 1931.84 | 1956.71 | 1891.98
| A | 2637.05 | 1895.34 | 1901.33 | 1880
| S | 129.631 | 7.9185 | 23.0753 | 4.4781
| Rank | 4 | 2 | 3 | 1
Wine | B | 16532.7 | 16315.9 | 16305.7 | 16310.3
| W | 17288.9 | 16341.9 | 16344.7 | 16328.2
| A | 16758.3 | 16326.3 | 16324.8 | 16321.5
| S | 184.535 | 6.52263 | 7.85019 | 4.38218
| Rank | 4 | 3 | 2 | 1
Heart | B | 9833.54 | 9441.8 | 9441.9 | 9441.91
| W | 10952.1 | 9445.4 | 9444.22 | 9443.31
| A | 10218.6 | 9443.23 | 9443.06 | 9442.72
| S | 258.891 | 0.74242 | 0.52740 | 0.37627
| Rank | 4 | 3 | 2 | 1
Spect | B | 588.584 | 555.479 | 555.047 | 554.921
| W | 641.295 | 559.913 | 559.849 | 557.19
| A | 616.065 | 557.228 | 556.872 | 556.293
| S | 10.55 | 1.09841 | 1.16431 | 0.71150
| Rank | 4 | 3 | 2 | 1
Diabets | B | 73044.3 | 72107.2 | 72107.2 | 72107.2
| W | 78234.3 | 72107.3 | 72107.3 | 72107.2
| A | 75028 | 72107.2 | 72107.2 | 72107.2
| S | 1338.63 | 0.01689 | 0.01421 | 0.00451
| Rank | 2 | 1 | 1 | 1
Hepatit | B | 9680.26 | 9441.86 | 9442.35 | 9442.3
| W | 10815.6 | 9444.76 | 9444.61 | 9443.2
| A | 10227.6 | 9443.18 | 9443.41 | 9442.81
| S | 275.863 | 0.67621 | 0.61209 | 0.26600
| Rank | 4 | 2 | 3 | 1
Btissue | B | 207483 | 130477 | 129814 | 129538
| W | 261335 | 146255 | 150189 | 136673
| A | 230022 | 136942 | 134851 | 132930
| S | 14587.6 | 4049.21 | 4628.85 | 1856.65
| Rank | 4 | 3 | 2 | 1
Parkinson | B | 16475.1 | 16463 | 16462.9 | 16462.9
| W | 16652.7 | 16463.1 | 16463 | 16463
| A | 16551.8 | 16463 | 16463 | 16463
| S | 42.1915 | 0.02986 | 0.03141 | 0.01874
| Rank | 2 | 1 | 1 | 1
Somerville | B | 306.112 | 280.534 | 280.537 | 280.525
| W | 331.67 | 280.58 | 280.882 | 280.561
| A | 315.78 | 280.556 | 280.578 | 280.545
| S | 6.83961 | 0.01063 | 0.06132 | 0.00959
| Rank | 4 | 2 | 3 | 1
User Modeling | B | 108.741 | 97.4006 | 97.3914 | 97.4178
| W | 115.514 | 99.0021 | 100.41 | 98.5497
| A | 111.938 | 98.07 | 98.4547 | 97.9132
| S | 1.65219 | 0.51040 | 0.77553 | 0.35806
| Rank | 4 | 2 | 3 | 1
Table 6 shows that increasing both the population size and the number of iterations leads to general performance gains across all methods. However, CLBKA consistently outperforms the others by achieving the lowest objective function values and standard deviations on nearly all datasets. This result confirms that its hybrid structure is effective not only in complex datasets like Btissue and Parkinson, but also in structured datasets such as Iris and Wine, and in overlapping-class datasets like Credit and Dermatology. The method’s ability to maintain high accuracy and stability under varying data characteristics highlights its robustness and adaptability, regardless of problem complexity.
Table 6. Comparative Clustering Results: BKA, LBKA, CBKA, and CLBKA (P = 50, T = 1000).
P = 50, T = 1000 (B = best, W = worst, A = average, S = standard deviation over 30 runs)
Dataset | Metric | BKA | LBKA | CBKA | CLBKA
Balance | B | 1436.04 | 1423.83 | 1423.84 | 1423.83
| W | 1449.26 | 1423.86 | 1423.86 | 1423.85
| A | 1441.67 | 1423.85 | 1423.85 | 1423.85
| S | 3.60507 | 0.00631 | 0.00550 | 0.00411
| Rank | 2 | 1 | 1 | 1
Credit | B | 564389 | 556740 | 556742 | 556742
| W | 595991 | 557156 | 557210 | 556802
| A | 573943 | 556788 | 556897 | 556764
| S | 6897.1 | 101.561 | 195.646 | 24.2983
| Rank | 4 | 2 | 3 | 1
Dermatology | B | 3060.72 | 2635.59 | 2634.06 | 2631.57
| W | 3201.69 | 2711.24 | 2742.23 | 2691.70
| A | 3144.38 | 2682.16 | 2687.34 | 2675.54
| S | 32.1685 | 17.19 | 27.3887 | 15.4241
| Rank | 4 | 2 | 3 | 1
E. coli | B | 102.277 | 65.8747 | 65.7057 | 67.9426
| W | 117.586 | 71.9892 | 72.9776 | 70.1752
| A | 110.625 | 69.2800 | 70.0311 | 69.3270
| S | 3.78799 | 1.29255 | 1.36244 | 0.67581
| Rank | 4 | 1 | 3 | 2
Glass | B | 392.891 | 267.009 | 271.683 | 267.761
| W | 460.379 | 306.057 | 305.949 | 285.459
| A | 429.551 | 283.274 | 283.976 | 279.932
| S | 17.3801 | 8.72428 | 9.98877 | 4.67789
| Rank | 4 | 2 | 3 | 1
Iris | B | 111.691 | 96.6753 | 96.6712 | 96.6783
| W | 154.024 | 96.7284 | 96.7322 | 96.7016
| A | 137.735 | 96.7017 | 96.7035 | 96.6920
| S | 9.24844 | 0.01326 | 0.01348 | 0.00602
| Rank | 4 | 2 | 3 | 1
Thyroid | B | 2147.9 | 1869.31 | 1868.4 | 1868.06
| W | 2739.08 | 1901.6 | 1899.67 | 1872.65
| A | 2577.73 | 1879.56 | 1877.15 | 1870.12
| S | 146.457 | 11.6675 | 11.2886 | 1.30922
| Rank | 4 | 3 | 2 | 1
Wine | B | 16398.2 | 16306.9 | 16307.3 | 16307.9
| W | 16973.7 | 16320.4 | 16326.3 | 16317.1
| A | 16646.7 | 16314.2 | 16316 | 16313.6
| S | 122.127 | 3.14338 | 4.3693 | 2.90356
| Rank | 4 | 2 | 3 | 1
Heart | B | 9677.21 | 9441.14 | 9441.13 | 9441.12
| W | 10511 | 9442.12 | 9442.4 | 9441.85
| A | 10034.5 | 9441.69 | 9441.83 | 9441.61
| S | 207.995 | 0.26810 | 0.34925 | 0.16645
| Rank | 4 | 2 | 3 | 1
Spect | B | 594.06 | 554.546 | 554.546 | 554.545
| W | 621.359 | 554.55 | 554.555 | 554.548
| A | 609.532 | 554.548 | 554.548 | 554.547
| S | 7.63409 | 0.00098 | 0.00164 | 0.00064
| Rank | 3 | 2 | 2 | 1
Diabets | B | 72639.5 | 72107.2 | 72107.2 | 72107.2
| W | 75818 | 72107.2 | 72107.2 | 72107.2
| A | 73965.8 | 72107.2 | 72107.2 | 72107.2
| S | 783.249 | 0.000104 | 2.39579 × 10−5 | 4.11171 × 10−6
| Rank | 2 | 1 | 1 | 1
Hepatit | B | 9804.35 | 9441.46 | 9441.24 | 9440.95
| W | 10750.8 | 9443.03 | 9442.92 | 9441.87
| A | 10144.1 | 9441.92 | 9441.88 | 9441.54
| S | 201.581 | 0.37796 | 0.45188 | 0.22902
| Rank | 4 | 3 | 2 | 1
Btissue | B | 196346 | 126316 | 126719 | 126236
| W | 237662 | 143134 | 137434 | 130039
| A | 218282 | 132009 | 129590 | 128231
| S | 13020.3 | 4507.11 | 2474.58 | 1129.44
| Rank | 4 | 3 | 2 | 1
Parkinson | B | 16478.2 | 16462.9 | 16462.9 | 16462.9
| W | 16574.1 | 16463.1 | 16463.1 | 16463
| A | 16510 | 16463 | 16463 | 16463
| S | 25.9548 | 0.02985 | 0.04561 | 0.01986
| Rank | 2 | 1 | 1 | 1
Somerville | B | 298.905 | 280.519 | 280.518 | 280.522
| W | 322.133 | 280.552 | 280.555 | 280.538
| A | 311.296 | 280.534 | 280.536 | 280.532
| S | 5.23741 | 0.00741 | 0.00799 | 0.00358
| Rank | 4 | 2 | 3 | 1
User Modeling | B | 105.134 | 97.3535 | 97.3575 | 97.3576
| W | 113.24 | 97.3950 | 98.5556 | 97.3766
| A | 110.37 | 97.3653 | 97.4594 | 97.3649
| S | 1.65405 | 0.00818 | 0.27228 | 0.00475
| Rank | 4 | 2 | 3 | 1
Figure 7, Figure 8, Figure 9, Figure 10, Figure 11, Figure 12, Figure 13, Figure 14, Figure 15, Figure 16, Figure 17, Figure 18, Figure 19, Figure 20 and Figure 21 present the convergence behavior of all four algorithms across the evaluated datasets. Overall, CLBKA achieves faster and more stable convergence, typically reaching near-optimal solutions within the first few hundred iterations. While LBKA and CBKA show improved performance compared to BKA, their convergence is generally slower and less stable. Although CLBKA exhibits early stabilization in the convergence curves, this does not necessarily indicate harmful over-convergence. Rather, it reflects efficient descent toward high-quality solutions. In CLBKA, search diversity is preserved even in later iterations through chaotic parameter control, conditionally activated Lévy flights, and Cauchy-based migration perturbations. These mechanisms generate non-periodic variations and intermittent long-range steps, which prevent population stagnation and sustain adaptive search dynamics throughout the optimization process. The consistent performance across datasets of varying complexity confirms the robustness of this hybrid strategy.

4.2. Performance Evaluation of the Proposed Methods Against Literature-Reported Algorithms

Table 7 compares the proposed BKA variants with literature-reported algorithms under identical evaluation settings (population size = 30, iterations = 500). CLBKA consistently achieves top rankings on the majority of datasets, such as Balance, Credit, E. coli, Glass, Iris, Thyroid, Wine, Heart, Somerville, and User Modeling, demonstrating its robust and generalizable performance. However, CLBKA does not always outperform all competing methods. In the Spect dataset, PSO achieves the best SSE (537.339), suggesting that its strong local search capability better suits binary and overlapping data. In Dermatology, WOA performs best (2670.14), likely benefiting from balanced exploration–exploitation in high-dimensional spaces. The Thyroid dataset sees GWO as the top performer (1933.91), while in Diabetes, PSO again leads (49,269.24), possibly due to its convergence speed on noisy data. Additionally, GWO ranks first on B. Tissue (129,653), and PSO outperforms the others on Parkinson (12,363). These cases highlight that specific data characteristics, such as noise, feature distribution, or cluster shape, can affect algorithm suitability. Despite these instances, CLBKA remains the most consistently high-ranking approach across diverse benchmarks.

4.3. Statistical Evaluation via the Friedman Test

The Friedman test was employed to statistically compare the proposed methods across different population sizes and iteration budgets. As reported in Table 8, CLBKA consistently achieves the lowest Friedman mean rank in all parameter settings, indicating the best overall performance among the proposed approaches. LBKA and CBKA exhibit intermediate ranks that vary slightly across configurations, while BKA remains ranked last in every case, reflecting comparatively weaker clustering performance. The small variation in CLBKA’s mean rank values across settings further suggests that its superiority is robust and only weakly affected by changes in population size or iteration number. The Friedman test indicated statistically significant differences among the compared algorithms (p < 0.05).
Table 9 summarizes the Friedman mean-rank comparison between the literature-reported algorithms and the proposed methods under the benchmark setting with population size 30 and 500 iterations. CLBKA achieves the best overall rank under the same evaluation budget, followed by LBKA and CBKA as competitive alternatives. In contrast, BKA and several literature baselines yield higher mean ranks, indicating inferior overall performance. The distribution shown in Figure 22 further highlights the consistent advantage of CLBKA under matched experimental conditions.

4.4. Post Hoc Statistical Analysis Using the Nemenyi Test

Although the Friedman test provides a global indication of statistically significant differences among multiple algorithms, it does not identify which specific algorithm pairs differ significantly. Therefore, to strengthen the statistical analysis and comply with best practices in algorithm comparison, a Nemenyi post hoc test was conducted following the Friedman test.
The Nemenyi test compares all algorithm pairs based on their average ranks obtained from the Friedman test. The critical difference (CD) is calculated as:
$CD = q_{\alpha}\sqrt{\dfrac{k(k+1)}{6N}}$
where $k$ is the number of algorithms, $N$ is the number of datasets, and $q_{\alpha}$ is the critical value of the Studentized range distribution. In this study, $k = 4$ (BKA, LBKA, CBKA, and CLBKA), $N = 16$ datasets, and $q_{0.05} = 2.569$ for a significance level of $\alpha = 0.05$. Substituting these values yields:
$CD = 2.569\sqrt{\dfrac{4(4+1)}{6 \times 16}} \approx 1.17$
Accordingly, any difference in average ranks greater than 1.17 is considered statistically significant. Using the Friedman mean ranks reported in Table 8, pairwise comparisons between CLBKA and the other algorithms were performed under all parameter settings; the results are summarized in Table 10. CLBKA significantly outperforms BKA and CBKA across all parameter settings. Although CLBKA consistently ranks better than LBKA, the rank difference falls below the critical threshold in two configurations (P = 30, T = 1000 and P = 50, T = 1000) and is therefore not statistically significant there. These results reinforce the superior performance of CLBKA and validate the statistical significance of its advantage under multiple experimental conditions.
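The CD computation and the pairwise screening described above can be sketched as follows (the mean ranks are the P = 30, T = 500 values from Table 8; the helper name `nemenyi_cd` is illustrative):

```python
import math

def nemenyi_cd(q_alpha, k, n):
    """Critical difference for the Nemenyi post hoc test."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))

cd = nemenyi_cd(2.569, k=4, n=16)
print(round(cd, 2))  # 1.17

# Pairwise check against CLBKA using the P = 30, T = 500 mean ranks (Table 8).
mean_ranks = {"BKA": 4.0000, "LBKA": 2.4375, "CBKA": 2.5000, "CLBKA": 1.0625}
for name in ("BKA", "LBKA", "CBKA"):
    diff = mean_ranks[name] - mean_ranks["CLBKA"]
    print(name, diff, diff > cd)  # all three differences exceed the CD here
```

For this configuration all three differences exceed 1.17, matching the first row of Table 10.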

4.5. Statistical Analysis Using the Wilcoxon Signed-Rank Test

The results of the Wilcoxon signed-rank test, presented in Table 11, Table 12, Table 13 and Table 14, evaluate the statistical significance of the performance differences between the proposed algorithms under various parameter settings. These pairwise comparisons provide deeper insight into whether the observed performance improvements of CLBKA are statistically meaningful or merely incidental.
As shown in Table 11, under the configuration P = 30 and T = 500, CLBKA significantly outperforms both CBKA and LBKA with very small p-values (p = 0.00006) and large negative effect sizes (r = −1.002), indicating strong and consistent superiority. Furthermore, BKA performs significantly worse than all three enhanced variants (p = 0.00044), confirming its relatively weaker optimization capability. On the other hand, the comparison between CBKA and LBKA yields a non-significant result (p = 0.89038), suggesting similar behavior between these two methods.
As shown in Table 12, under the setting P = 30 and T = 1000, the pattern remains consistent. CLBKA continues to significantly outperform CBKA (p = 0.00006) and LBKA (p = 0.00012), indicating that the hybrid strategy maintains its advantage even at higher iteration counts. The difference between CBKA and LBKA remains statistically non-significant (p = 0.09058), reinforcing the notion that these two algorithms are performance-wise comparable.
As reported in Table 13, when the population size increases to P = 50 while keeping T = 500, CLBKA retains its statistically significant superiority over both CBKA and LBKA (p = 0.00012 for both comparisons). BKA once again shows the weakest performance, significantly lagging behind all enhanced variants (p = 0.00044). These findings indicate that CLBKA remains effective even with a larger population, likely due to its balanced exploration–exploitation dynamics.
Finally, in Table 14 under the setting P = 50 and T = 1000, although p-values slightly increase, CLBKA still demonstrates significant performance advantages over CBKA (p = 0.00012, r = −0.960) and LBKA (p = 0.00159, r = −0.790). The performance gap between CBKA and LBKA remains statistically non-significant (p = 0.46484), which is consistent with earlier observations.
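The z statistic and the effect size r = z/√n reported in these tables can be approximated with the normal-approximation form of the signed-rank test, sketched below (hypothetical paired scores; ties and zero differences are assumed absent). Notably, when one method wins on all 16 datasets this approximation gives |z| = 3.516 and |r| = 0.879, the values recurring in the BKA rows of Table 11:

```python
import math
import numpy as np

def wilcoxon_z_and_r(x, y):
    """Normal-approximation z for the Wilcoxon signed-rank test and the
    effect size r = z / sqrt(n). Sketch: assumes no ties or zero differences."""
    d = np.asarray(x, float) - np.asarray(y, float)
    n = len(d)
    # Ranks of |d| (1 = smallest), via the double-argsort trick.
    ranks = np.argsort(np.argsort(np.abs(d))) + 1.0
    w_pos = ranks[d > 0].sum()                       # sum of positive ranks
    mu = n * (n + 1) / 4.0                           # mean of W under H0
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w_pos - mu) / sigma
    return z, z / math.sqrt(n)

# Hypothetical paired SSE values: 'b' beats 'a' on every one of 16 datasets.
rng = np.random.default_rng(1)
a = rng.uniform(100, 200, 16)
b = a - rng.uniform(0.5, 5.0, 16)
z, r = wilcoxon_z_and_r(b, a)
print(round(z, 3), round(r, 3))  # -3.516 -0.879
```

A negative z favors the first argument, which is why the CLBKA-first comparisons in the tables carry negative z and r values.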

4.6. Sensitivity Analysis of the Lévy Exponent β

To examine the robustness of CLBKA with respect to its internal parameters, a sensitivity analysis was conducted on the Lévy flight exponent β, which governs the step-size distribution in the global exploration phase. The β parameter was varied across two values (1.3 and 1.7), while all other parameters were kept constant. The analysis was carried out on the 16 UCI benchmark datasets, and the results are summarized in Table 15. CLBKA demonstrates highly stable clustering performance under both β settings: the average SSE and Rand Index (RI) values show only minor fluctuations, indicating that the algorithm is relatively insensitive to moderate changes in β. This stability reinforces the practical reliability of CLBKA, especially in applications where parameter tuning is limited or computationally expensive.
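The role of β can be illustrated with Mantegna's algorithm [59] for generating Lévy-distributed step lengths (a sketch, not the authors' exact implementation): smaller β produces heavier tails and hence more frequent long-range jumps.

```python
import math
import numpy as np

def levy_steps(beta, size, rng):
    """Mantegna's algorithm for Levy-stable step lengths with index beta.
    Illustrative sketch of the exploration mechanism varied in Table 15."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, size)  # numerator: wider normal
    v = rng.normal(0.0, 1.0, size)    # denominator: standard normal
    return u / np.abs(v) ** (1 / beta)

# Compare tail heaviness for the two beta values used in the sensitivity analysis.
rng = np.random.default_rng(42)
fracs = {}
for beta in (1.3, 1.7):
    steps = levy_steps(beta, 100_000, rng)
    fracs[beta] = float(np.mean(np.abs(steps) > 10))  # fraction of long jumps
print(fracs[1.3] > fracs[1.7])  # True: smaller beta yields heavier tails
```

Because both tested values lie in the heavy-tailed regime (1 < β < 2), the similar results in Table 15 are consistent with the claim that CLBKA is not overly sensitive to this choice.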

5. Conclusions

This study examined the clustering performance of the Black-Winged Kite Algorithm and its enhanced variants, CBKA, LBKA, and CLBKA, developed to mitigate premature convergence and limited exploration capability observed in the standard BKA. Chaotic logistic mapping was employed to enhance population diversity and adaptive parameter regulation, while Lévy flight mechanisms improved long-range exploration. The hybrid CLBKA framework integrated these strategies with Cauchy-based perturbations to achieve a more balanced transition between exploration and exploitation during centroid optimization. The algorithms were evaluated on sixteen UCI benchmark datasets under different population sizes and iteration settings. Across all experimental configurations, CLBKA consistently achieved lower SSE values and improved convergence stability compared to the standard BKA and its single-enhanced variants. Statistical analyses using the Friedman and Wilcoxon tests confirmed significant performance differences, with CLBKA attaining the lowest mean rank across test conditions. Comparative evaluations against established metaheuristic clustering algorithms, including PSO, GWO, WOA, and ChOA, further demonstrated competitive and frequently superior performance across diverse datasets.
Despite these findings, several limitations should be acknowledged. The datasets considered are primarily small- to medium-scale, and the number of clusters was assumed to be known in advance. In addition, clustering performance was evaluated using the SSE objective function within a centroid-based framework relying on Euclidean distance, which may not be suitable for all data structures. Future research may extend this framework by incorporating alternative clustering objectives such as density-based, graph-based, or validity-index-driven optimization criteria, rather than relying solely on SSE minimization. Furthermore, large-scale implementations using parallel or GPU-based computation, automatic cluster number estimation, alternative distance metrics, and robustness improvements for noisy or imbalanced datasets represent promising research directions.

Author Contributions

Conceptualization, T.A. and S.S.; methodology, T.A. and S.S.; software, T.A. and S.S.; validation, T.A. and S.S.; formal analysis, T.A. and S.S.; investigation, T.A. and S.S.; resources, T.A. and S.S.; data curation, T.A. and S.S.; writing—original draft preparation, T.A. and S.S.; writing—review and editing, T.A. and S.S.; visualization, T.A. and S.S.; supervision, T.A. and S.S.; project administration, T.A. and S.S. All authors have read and agreed to the published version of the manuscript.

Funding

Part of the APC of this article was supported by Selcuk University Scientific Research Coordinatorship.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets used in this study were obtained from the UCI Machine Learning Repository and are publicly accessible. The relevant dataset links are cited in the References section. No new data were generated in this study.

Acknowledgments

This work is derived from Taybe Alabed’s master’s thesis at Selçuk University and was partially supported by the Selçuk University Scientific Research Projects (BAP) Coordination Unit. Additionally, ChatGPT-4o was used to improve the clarity of some sentences and enhance the English grammar of the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Cambe, J.; Grauwin, S.; Flandrin, P.; Jensen, P. A new clustering method to explore the dynamics of research communities. Scientometrics 2022, 127, 4459–4482. [Google Scholar] [CrossRef]
  2. Lukauskas, M.; Ruzgas, T. Reduced clustering method based on the inversion formula density estimation. Mathematics 2023, 11, 661. [Google Scholar] [CrossRef]
  3. Xu, D.; Tian, Y. A comprehensive survey of clustering algorithms. Ann. Data Sci. 2015, 2, 165–193. [Google Scholar] [CrossRef]
  4. Jain, A.K. Data clustering: 50 years beyond K-means. Pattern Recognit. Lett. 2010, 31, 651–666. [Google Scholar] [CrossRef]
  5. Pektaş, A.; Hacıbeyoğlu, M.; İnan, O. Hybridization of the Snake Optimizer and Particle Swarm Optimization for continuous optimization problems. Eng. Sci. Technol. Int. J. 2025, 67, 102077. [Google Scholar] [CrossRef]
  6. Kumar, A.; Kumar, D.; Jarial, S. A review on artificial bee colony algorithms and their applications to data clustering. Cybern. Inf. Technol. 2017, 17, 3–28. [Google Scholar] [CrossRef]
  7. Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey wolf optimizer. Adv. Eng. Softw. 2014, 69, 46–61. [Google Scholar] [CrossRef]
  8. Jahwar, A.F.; Abdulazeez, A.M. Meta-heuristic algorithms for K-means clustering: A review. PalArch’s J. Archaeol. Egypt/Egyptol. 2020, 17, 12002–12020. [Google Scholar]
  9. Wang, J.; Wang, W.-C.; Hu, X.-X.; Qiu, L.; Zang, H.-F. Black-winged kite algorithm: A nature-inspired meta-heuristic for solving benchmark functions and engineering problems. Artif. Intell. Rev. 2024, 57, 98. [Google Scholar] [CrossRef]
  10. Alabed, T.; Servi, S. A Lévy flight based chaotic black winged kite algorithm for solving optimization problems. Sci. Rep. 2025, 15, 34608. [Google Scholar] [CrossRef]
  11. Algelany, A.M.; El-Shorbagy, M.A. Chaotic enhanced genetic algorithm for solving the nonlinear system of equations. Comput. Intell. Neurosci. 2022, 2022, 1376479. [Google Scholar] [CrossRef] [PubMed]
  12. Zhang, Y.; Lu, J.; Zhao, C.; Li, Z.; Yan, J. Chaos optimization algorithms: A survey. Int. J. Bifurc. Chaos 2024, 34, 2450205. [Google Scholar] [CrossRef]
  13. Kamaruzaman, A.F.; Zain, A.M.; Yusuf, S.M.; Udin, A. Levy flight algorithm for optimization problems—A literature review. Appl. Mech. Mater. 2013, 421, 496–501. [Google Scholar] [CrossRef]
  14. Akar, A.U.; Uymaz, S.A. Clustering neighborhoods according to urban functions and development levels by different clustering algorithms: A case in Konya. Konya J. Eng. Sci. 2022, 10, 889–902. [Google Scholar] [CrossRef]
  15. Kazemi, U.; Soleimani, S. A new approach data processing: Density-based spatial clustering of applications with noise (DBSCAN) clustering using game-theory. Soft Comput. 2025, 29, 1331–1346. [Google Scholar] [CrossRef]
  16. Celebi, M.E.; Kingravi, H.A.; Vela, P.A. A comparative study of efficient initialization methods for the k-means clustering algorithm. Expert Syst. Appl. 2013, 40, 200–210. [Google Scholar] [CrossRef]
  17. Maulik, U.; Bandyopadhyay, S. Genetic algorithm-based clustering technique. Pattern Recognit. 2000, 33, 1455–1465. [Google Scholar] [CrossRef]
  18. Schubert, E.; Sander, J.; Ester, M.; Kriegel, H.P.; Xu, X. DBSCAN revisited, revisited: Why and how you should (still) use DBSCAN. ACM Trans. Database Syst. (TODS) 2017, 42, 1–21. [Google Scholar]
  19. Agarwal, S.; Kumar, S. Survey on Clustering Problems Using Metaheuristic Algorithms. In 2024 11th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), Noida, India, 14–15 March 2024; IEEE: New York, NY, USA, 2024. [Google Scholar]
  20. Ikotun, A.M.; Almutari, M.S.; Ezugwu, A.E. K-means-based nature-inspired metaheuristic algorithms for automatic data clustering problems: Recent advances and future directions. Appl. Sci. 2021, 11, 11246. [Google Scholar] [CrossRef]
  21. Akto, İ.; İnan, O.; Karakoyun, M. Grey Wolf Optimizer (GWO) algorithm to solve the partitional clustering problem. Int. J. Intell. Syst. Appl. Eng. 2019, 7, 201–206. [Google Scholar] [CrossRef]
  22. Zabihi, F.; Nasiri, B. A novel history-driven artificial bee colony algorithm for data clustering. Appl. Soft Comput. 2018, 71, 226–241. [Google Scholar] [CrossRef]
  23. Boroujeni, S.P.H.; Pashaei, E. Data clustering using chimp optimization algorithm. In 2021 11th International Conference on Computer Engineering and Knowledge (ICCKE), Mashhad, Iran, 28–29 October 2021; IEEE: New York, NY, USA, 2021. [Google Scholar]
  24. Pektaş, A.; İnan, O. Ağaç tohum algoritmasının kümeleme problemlerine uygulanması. Necmettin Erbakan Üniversitesi Fen Mühendislik Bilim. Derg. 2022, 4, 1–10. [Google Scholar]
  25. Nasiri, J.; Khiyabani, F.M. A whale optimization algorithm (WOA) approach for clustering. Cogent Math. Stat. 2018, 5, 1483565. [Google Scholar] [CrossRef]
  26. Anwer, K.I.; Servi, S. Clustering method based on artificial algae algorithm. Int. J. Intell. Syst. Appl. Eng. 2021, 9, 136–151. [Google Scholar] [CrossRef]
  27. Turkoglu, B.; Uymaz, S.A.; Kaya, E. Clustering analysis through artificial algae algorithm. Int. J. Mach. Learn. Cybern. 2022, 13, 1179–1196. [Google Scholar] [CrossRef]
  28. Taloba, A.I.; Mohammed, O.S.; Sewisy, A.A. A Hybrid Black Hole Algorithm with Genetic Algorithm for Solving Data Clustering Problems. Turk. J. Comput. Math. Educ. 2021, 12, 1067–1079. [Google Scholar]
  29. Shelokar, P.; Jayaraman, V.K.; Kulkarni, B.D. An ant colony approach for clustering. Anal. Chim. Acta 2004, 509, 187–195. [Google Scholar] [CrossRef]
  30. Karaboga, D.; Ozturk, C. A novel clustering approach: Artificial Bee Colony (ABC) algorithm. Appl. Soft Comput. 2011, 11, 652–657. [Google Scholar] [CrossRef]
  31. Limane, A.; Zitouni, F.; Harous, S.; Lakbichi, R.; Ferhat, A.; Almazyad, A.S.; Jangir, P.; Mohamed, A.W. Chaos-enhanced metaheuristics: Classification, comparison, and convergence analysis. Complex Intell. Syst. 2025, 11, 177. [Google Scholar] [CrossRef]
  32. Turkoglu, B.; Uymaz, S.A.; Kaya, E. Chaos theory in metaheuristics. In Comprehensive Metaheuristics; Elsevier: Amsterdam, The Netherlands, 2023; pp. 1–20. [Google Scholar]
  33. Zhao, M.; Su, Z.; Zhao, C.; Hua, Z. Improved black-winged kite algorithm based on chaotic mapping and adversarial learning. In Journal of Physics: Conference Series; IOP Publishing: Bristol, UK, 2024. [Google Scholar]
  34. Wang, J. CPSO: Chaotic particle swarm optimization for cluster analysis. J. Artif. Intell. Technol. 2023, 3, 46–52. [Google Scholar] [CrossRef]
  35. Chuang, L.-Y.; Hsiao, C.-J.; Yang, C.-H. Chaotic particle swarm optimization for data clustering. Expert Syst. Appl. 2011, 38, 14555–14563. [Google Scholar] [CrossRef]
  36. Dağlı, İ.; İnan, O.; Başçiftçi, F. A hybrid Fox optimization algorithm with chaotic maps and polynomial mutation for clustering applications. Evol. Syst. 2025, 16, 122. [Google Scholar] [CrossRef]
  37. Sahoo, S.K.; Pattanaik, P.; Das, D.K. Modified Chaotic Bee Colony Optimization (MCBCO) algorithm for data clustering. In 2021 1st Odisha International Conference on Electrical Power Engineering, Communication and Computing Technology (ODICON), Bhubaneswar, India, 8–9 January 2021; IEEE: New York, NY, USA, 2021. [Google Scholar]
  38. Yang, L.; Hu, X.; Wang, H.; Zhang, W.; Huang, K.; Wang, D. An ACO-based clustering algorithm with chaotic function mapping. Int. J. Cogn. Informatics Nat. Intell. 2021, 15, 1–21. [Google Scholar] [CrossRef]
  39. Roy, S.; Chaudhuri, S.S. Cuckoo search algorithm using Lévy flight: A review. Int. J. Mod. Educ. Comput. Sci. 2013, 5, 10. [Google Scholar]
  40. Yang, X.-S.; Deb, S. Cuckoo search via Lévy flights. In 2009 World congress on nature & biologically inspired computing (NaBIC), Coimbatore, India, 9–11 December 2009; IEEE: New York, NY, USA, 2009. [Google Scholar]
  41. Ling, Y.; Zhou, Y.; Luo, Q. Lévy flight trajectory-based whale optimization algorithm for global optimization. IEEE Access 2017, 5, 6168–6186. [Google Scholar] [CrossRef]
  42. Mat, A.N.; İnan, O.; Karakoyun, M. An application of the whale optimization algorithm with Levy flight strategy for clustering of medical datasets. Int. J. Optim. Control. Theor. Appl. (IJOCTA) 2021, 11, 216–226. [Google Scholar] [CrossRef]
  43. Gao, H.; Li, Y.; Kabalyants, P.; Xu, H.; Martinez-Bejar, R. A novel hybrid PSO-K-means clustering algorithm using Gaussian estimation of distribution method and Lévy flight. IEEE Access 2020, 8, 122848–122863. [Google Scholar]
  44. Abdulwahab, H.A.; Noraziah, A.; Alsewari, A.A.; Salih, S.Q. An enhanced version of black hole algorithm via levy flight for optimization and data clustering problems. IEEE Access 2019, 7, 142085–142096. [Google Scholar] [CrossRef]
  45. He, J.; Xiao, X. Student’s Information Management System using Levy Flight with K-Means Clustering Algorithm. In International Conference on Data Science and Network Security (ICDSNS), Tiptur, India, 26–27 July 2024; IEEE: New York, NY, USA, 2024. [Google Scholar]
  46. Rather, S.A.; Das, S. Levy flight and chaos theory-based gravitational search algorithm for image segmentation. Mathematics 2023, 11, 3913. [Google Scholar] [CrossRef]
  47. Hossain, M.A.A.; Sağ, T. Chaotic Levy-Flight-Driven Siberian Tiger Optimization for Enhanced Data Clustering. Cybern. Syst. 2026, 57, 44–83. [Google Scholar] [CrossRef]
  48. Devarakonda, N.; Saidala, R.K.; Kamarajugadda, R. A Hybrid Between TOA and Lévy Flight Trajectory for Solving Different Cluster Problems. Int. J. Cogn. Inform. Nat. Intell. (IJCINI) 2021, 15, 1–25. [Google Scholar] [CrossRef]
  49. Yıldız, B.S.; Kumar, S.; Pholdee, N.; Bureerat, S.; Sait, S.M.; Yildiz, A.R. A new chaotic Lévy flight distribution optimization algorithm for solving constrained engineering problems. Expert Syst. 2022, 39, e12992. [Google Scholar] [CrossRef]
  50. Zhou, Y.; Wu, X.; Liu, Y.; Jiang, X. BKA optimization algorithm based on sine-cosine guidelines. In 2024 4th International Symposium on Computer Technology and Information Science (ISCTIS), Xi’an, China, 12–14 July 2024; IEEE: New York, NY, USA, 2024. [Google Scholar]
  51. Almutairi, S.Z.; Shaheen, A.M. Improved Black-Winged Kite Algorithm for Sustainable Photovoltaic Energy Modeling and Accurate Parameter Estimation. Sustainability 2026, 18, 731. [Google Scholar] [CrossRef]
  52. Han, C. An image encryption algorithm based on modified logistic chaotic map. Optik 2019, 181, 779–785. [Google Scholar] [CrossRef]
  53. Çelik, H.; Doğan, N. A hybrid color image encryption method based on extended logistic map. Multimed. Tools Appl. 2024, 83, 12627–12650. [Google Scholar] [CrossRef]
  54. Brown, C.T.; Liebovitch, L.S.; Glendon, R. Lévy flights in Dobe Ju/’hoansi foraging patterns. Hum. Ecol. 2007, 35, 129–138. [Google Scholar] [CrossRef]
  55. Pavlyukevich, I. Lévy flights, non-local search and simulated annealing. J. Comput. Phys. 2007, 226, 1830–1844. [Google Scholar] [CrossRef]
  56. Shlesinger, M.F. Search research. Nature 2006, 443, 281–282. [Google Scholar] [CrossRef]
  57. Barshandeh, S.; Haghzadeh, M. A new hybrid chaotic atom search optimization based on tree-seed algorithm and Levy flight for solving optimization problems. Eng. Comput. 2021, 37, 3079–3122. [Google Scholar] [CrossRef]
  58. Chechkin, A.V.; Metzler, R.; Klafter, J.; Gonchar, V.Y. Introduction to the theory of Lévy flights. In Anomalous Transport: Foundations and Applications; Wiley-VCH Verlag: Berlin, Germany, 2008; pp. 129–162. [Google Scholar]
  59. Mantegna, R.N. Fast, accurate algorithm for numerical simulation of Levy stable stochastic processes. Phys. Rev. E 1994, 49, 4677–4683. [Google Scholar] [CrossRef]
  60. Hair, J.F.; Anderson, R.E.; Tatham, R.L.; Black, W.C. Multivariate Data Analysis; Prentice Hall: Upper Saddle River, NJ, USA, 1998; Volume 730. [Google Scholar]
  61. Alabed, T.; Servi, S. Multilayer analysis of nicotine-induced gene expression alterations in breast cancer cells using clustering and supervised learning methods. Kahramanmaraş Sütçü İmam Üniversitesi Mühendislik Bilim. Derg. 2025, 28, 1558–1573. [Google Scholar] [CrossRef]
  62. Lorr, M. Cluster Analysis for Social Scientists; Jossey-Bass: San Francisco, CA, USA, 1983. [Google Scholar]
  63. Dua, D.; Casey, G. UCI Machine Learning Repository. 2017. Available online: http://archive.ics.uci.edu/ml (accessed on 20 November 2025).
  64. Niu, Y.; Yin, J. PA-Net: Trustworthy weakly supervised point cloud semantic segmentation with primary–auxiliary structure. Comput. Electr. Eng. 2024, 119, 109555. [Google Scholar] [CrossRef]
  65. Arslan, S. Güncel Metasezgisel Algoritmalarının Performansları Üzerine Karşılaştırılmalı Bir Çalışma. Düzce Üniversitesi Bilim Ve Teknol. Derg. 2023, 11, 1861–1884. [Google Scholar] [CrossRef]
  66. Isaksson, L.J.; Pepa, M.; Summers, P.; Zaffaroni, M.; Vincini, M.G.; Corrao, G.; Mazzola, G.C.; Rotondi, M.; Presti, G.L.; Raimondi, S.; et al. Comparison of automated segmentation techniques for magnetic resonance images of the prostate. BMC Med. Imaging 2023, 23, 32. [Google Scholar] [CrossRef]
  67. Brest, J.; Maučec, M.S. Comparative Study of Modern Differential Evolution Algorithms: Perspectives on Mechanisms and Performance. Mathematics 2025, 13, 1556. [Google Scholar] [CrossRef]
  68. Rasooli, A.Q.; Inan, O.; Servi, S. Clustering with the Blackwinged Kite Algorithm. Int. J. Comput. Sci. Commun. (IJCSC) 2024, 9, 22–33. [Google Scholar] [CrossRef]
Figure 1. Overview of the clustering procedure.
Figure 2. Two hunting behaviors of the black-winged kite: (a) pre-attack hovering and (b) hovering during prey search.
Figure 3. Strategic migration behavior of Black-winged kites in BKA.
Figure 4. Workflow of the Chaotic Black-Winged Kite Algorithm (CBKA).
Figure 5. Workflow of the Lévy Black-Winged Kite Algorithm (LBKA).
Figure 6. Schematic representation of the proposed CLBKA’s workflow.
Figure 7. Convergence of the total square distance for Balance data set.
Figure 8. Convergence of the total square distance for Credit data set.
Figure 9. Convergence of the total square distance for Dermatology data set.
Figure 10. Convergence of the total square distance for E. coli data set.
Figure 11. Convergence of the total square distance for Glass data set.
Figure 12. Convergence of the total square distance for Iris data set.
Figure 13. Convergence of the total square distance for Thyroid data set.
Figure 14. Convergence of the total square distance for Wine data set.
Figure 15. Convergence of the total square distance for Heart data set.
Figure 16. Convergence of the total square distance for Spect data set.
Figure 17. Convergence of the total square distance for Diabets data set.
Figure 18. Convergence of the total square distance for Hepatit data set.
Figure 19. Convergence of the total square distance for Btissue data set.
Figure 20. Convergence of the total square distance for Parkinson data set.
Figure 21. Convergence of the total square distance for Somerville data set.
Figure 22. Friedman Mean Ranks for Literature and Proposed Algorithms.
Table 1. Summary of the characteristics of the datasets used in this study.

Dataset | Number of Cluster Centers | Number of Features | Number of Instances
Balance | 3 | 4 | 625
Credit | 2 | 14 | 690
Dermatology | 6 | 34 | 366
E. coli | 5 | 7 | 327
Glass | 6 | 9 | 214
Iris | 3 | 4 | 150
Thyroid | 3 | 5 | 215
Wine | 3 | 13 | 178
Heart | 2 | 13 | 270
Spect | 2 | 22 | 267
Diabets | 2 | 8 | 768
Hepatit | 2 | 21 | 155
Btissue | 6 | 9 | 106
Parkinson | 2 | 22 | 195
Somerville | 2 | 6 | 143
User Modeling | 4 | 5 | 258
Table 7. Performance comparison of the proposed methods with literature-reported clustering algorithms (Avg = average SSE over 30 runs).

Dataset | Metric | K-MED [24] | TSA [68] | ChOA [36] | WOA [36] | PSO [36] | GWO [36] | BKA | LBKA | CBKA | CLBKA
Balance | Avg | 1686.47 | 1502.1398 | 1452.44 | 1433.567 | 1426.63 | 1423.91 | 1447.66 | 1423.89 | 1423.88 | 1423.87
Balance | Rank | 10 | 9 | 8 | 6 | 5 | 4 | 7 | 3 | 2 | 1
Credit | Avg | 562284 | 1653922 | 558675 | 566192 | 589789 | 556944 | 583138 | 556953 | 556982 | 556834
Credit | Rank | 6 | 10 | 5 | 7 | 9 | 2 | 8 | 3 | 4 | 1
Dermatology | Avg | 2864.96 | 3410.7612 | 2842.06 | 2670.143 | 2708.78 | 2459.79 | 3182.72 | 2861.85 | 2860.77 | 2838.48
Dermatology | Rank | 8 | 10 | 5 | 2 | 3 | 1 | 9 | 7 | 6 | 4
E. coli | Avg | 145.9961 | 138.9561 | 128.2704 | 84.104 | 80.478 | 74.878 | 115.568 | 72.2632 | 72.5915 | 72.1794
E. coli | Rank | 10 | 9 | 8 | 6 | 5 | 4 | 7 | 2 | 3 | 1
Glass | Avg | 311.0533 | 687.2979 | 494.8411 | 406.691 | 311.348 | 319.34 | 457.13 | 308.743 | 305.671 | 299.149
Glass | Rank | 4 | 10 | 9 | 7 | 5 | 6 | 8 | 3 | 2 | 1
Iris | Avg | 183.6139 | 213.3086 | 147.6409 | 97.1151 | 110.990 | 99.265 | 148.97 | 96.8274 | 96.8728 | 96.7764
Iris | Rank | 9 | 10 | 7 | 4 | 6 | 5 | 8 | 2 | 3 | 1
Thyroid | Avg | 2097.681 | 3929.0082 | 2490.993 | 2125.200 | 2178.346 | 1933.91 | 2691.15 | 1912.56 | 1908.09 | 1899.24
Thyroid | Rank | 5 | 10 | 8 | 6 | 7 | 4 | 9 | 3 | 2 | 1
Wine | Avg | 17656.67 | 19588.19 | 16893.2 | 16450 | 16336 | 16328.8 | 16811.5 | 16326 | 16326.2 | 16324.5
Wine | Rank | 9 | 10 | 8 | 6 | 5 | 4 | 7 | 2 | 3 | 1
Heart | Avg | 11814.20 | 12290.65 | 10514.37 | 9659.57 | 9497.74 | 9448.40 | 10314.4 | 9443.84 | 9443.74 | 9443.25
Heart | Rank | 9 | 10 | 8 | 6 | 5 | 4 | 7 | 3 | 2 | 1
Spect | Avg | 633.544 | 659.8949 | 572.2234 | 562.972 | 537.339 | 556.563 | 620.493 | 557.417 | 558.056 | 556.296
Spect | Rank | 9 | 10 | 7 | 6 | 1 | 3 | 8 | 4 | 5 | 2
Diabetes | Avg | 73172.07 | 93733.439 | 80797.25 | 72933.43 | 49269.24 | 72204.7 | 75681.3 | 72109.9 | 72173.7 | 72107.2
Diabetes | Rank | 7 | 10 | 9 | 6 | 1 | 5 | 8 | 3 | 4 | 2
Hepatit | Avg | 10376.55 | 12471.312 | 10416.52 | 9571.687 | 9600.54 | 9447.17 | 10384.3 | 9443.82 | 9443.80 | 9443.31
Hepatit | Rank | 7 | 10 | 9 | 5 | 6 | 4 | 8 | 3 | 2 | 1
B. Tissue | Avg | 143417 | 405517.18 | 153597 | 139204.3 | 186916 | 129653 | 239229 | 140130 | 137016 | 135735
B. Tissue | Rank | 6 | 10 | 7 | 4 | 8 | 1 | 9 | 5 | 3 | 2
Parkinson | Avg | 17120 | 18140 | 16527 | 16465 | 12363 | 16464 | 16547 | 16463 | 16463 | 16463
Parkinson | Rank | 7 | 8 | 5 | 4 | 1 | 3 | 6 | 2 | 2 | 2
Somerville | Avg | 327.701 | 372.8451 | 316.421 | 283.599 | 287.471 | 280.654 | 318.476 | 280.567 | 280.579 | 280.553
Somerville | Rank | 9 | 10 | 7 | 5 | 6 | 4 | 8 | 2 | 3 | 1
User Modeling | Avg | 152.5859 | 122.9621 | 144.9154 | 104.874 | 99.988 | 98.844 | 113.159 | 98.2043 | 99.0395 | 98.1158
User Modeling | Rank | 10 | 8 | 9 | 6 | 5 | 3 | 7 | 2 | 4 | 1
Table 8. Friedman Mean Ranks of the Proposed Methods across Parameter Settings (ordinal rank in parentheses).

Parameter Setting | BKA | LBKA | CBKA | CLBKA
P = 30, T = 500 | 4.0000 (4) | 2.4375 (2) | 2.5000 (3) | 1.0625 (1)
P = 30, T = 1000 | 4.0000 (4) | 2.1250 (2) | 2.7812 (3) | 1.0938 (1)
P = 50, T = 500 | 4.0000 (4) | 2.4062 (2) | 2.4688 (3) | 1.1250 (1)
P = 50, T = 1000 | 4.0000 (4) | 2.1562 (2) | 2.5938 (3) | 1.2500 (1)
Table 9. Friedman Mean Ranks for Literature and Proposed Algorithms at P = 30, T = 500.

Algorithm | K-MED | ChOA | WOA | PSO | GWO | BKA | LBKA | CBKA | CLBKA
Mean rank | 7.7500 | 7.5000 | 5.5000 | 4.8750 | 3.6875 | 7.8750 | 3.1250 | 3.1875 | 1.5000
Ordinal rank | 8 | 7 | 6 | 5 | 4 | 9 | 2 | 3 | 1
Table 10. Nemenyi post hoc comparisons using CLBKA as the reference algorithm.

Parameter Setting | BKA-CLBKA | LBKA-CLBKA | CBKA-CLBKA | Sig. Diff.
P = 30, T = 500 | 2.9375 | 1.3750 | 1.4375 | BKA, LBKA, CBKA
P = 30, T = 1000 | 2.9062 | 1.0312 | 1.6874 | BKA, CBKA
P = 50, T = 500 | 2.8750 | 1.2812 | 1.3438 | BKA, LBKA, CBKA
P = 50, T = 1000 | 2.7500 | 0.9062 | 1.3438 | BKA, CBKA
Table 11. Wilcoxon signed-rank test results for P = 30 and T = 500.

Comparison | p-Value | z | r
BKA vs. CBKA | 0.00044 | 3.516 | 0.879
BKA vs. LBKA | 0.00044 | 3.516 | 0.879
BKA vs. CLBKA | 0.00044 | 3.516 | 0.879
CBKA vs. LBKA | 0.89038 | −0.138 | −0.034
CLBKA vs. CBKA | 0.00006 | −4.009 | −1.002
CLBKA vs. LBKA | 0.00006 | −4.009 | −1.002
Table 12. Wilcoxon signed-rank test results for P = 30 and T = 1000.

Comparison | p-Value | z | r
BKA vs. CBKA | 0.00044 | 3.516 | 0.879
BKA vs. LBKA | 0.00044 | 3.516 | 0.879
BKA vs. CLBKA | 0.00044 | 3.516 | 0.879
CBKA vs. LBKA | 0.09058 | −1.692 | −0.423
CLBKA vs. CBKA | 0.00006 | −4.009 | −1.002
CLBKA vs. LBKA | 0.00012 | −3.842 | −0.960
Table 13. Wilcoxon signed-rank test results for P = 50 and T = 500.

Comparison | p-Value | z | r
BKA vs. CBKA | 0.00044 | 3.516 | 0.879
BKA vs. LBKA | 0.00044 | 3.516 | 0.879
BKA vs. CLBKA | 0.00044 | 3.516 | 0.879
CBKA vs. LBKA | 0.89258 | −0.135 | −0.034
CLBKA vs. CBKA | 0.00012 | −3.842 | −0.960
CLBKA vs. LBKA | 0.00012 | −3.842 | −0.960
Table 14. Wilcoxon signed-rank test results for P = 50 and T = 1000.

Comparison | p-Value | z | r
BKA vs. CBKA | 0.00523 | 2.792 | 0.698
BKA vs. LBKA | 0.00523 | 2.792 | 0.698
BKA vs. CLBKA | 0.00523 | 2.792 | 0.698
CBKA vs. LBKA | 0.46484 | −0.731 | −0.183
CLBKA vs. CBKA | 0.00012 | −3.842 | −0.960
CLBKA vs. LBKA | 0.00159 | −3.158 | −0.790
Table 15. Sensitivity analysis of CLBKA with respect to the Lévy exponent β (P = 30, T = 500).

Dataset | β | Avg SSE | Std SSE | Avg RI | Std RI
Balance | 1.3 | 1423.8716 | 0.013334 | 0.587727 | 0.007425
Balance | 1.7 | 1423.8721 | 0.011012 | 0.586405 | 0.007357
Credit | 1.3 | 556793 | 71.543694 | 0.524318 | 0
Credit | 1.7 | 556836 | 120.51253 | 0.524318 | 0
Dermatology | 1.3 | 2840.290 | 22.934419 | 0.693953 | 0.007704
Dermatology | 1.7 | 2864.921 | 27.135864 | 0.693186 | 0.008821
E. coli | 1.3 | 71.66269 | 1.273079 | 0.858257 | 0.019743
E. coli | 1.7 | 71.45306 | 1.296743 | 0.852987 | 0.025958
Glass | 1.3 | 299.4440 | 6.579818 | 0.561745 | 0.018459
Glass | 1.7 | 297.5927 | 5.202003 | 0.563213 | 0.016911
Iris | 1.3 | 96.76454 | 0.036045 | 0.886198 | 0.002478
Iris | 1.7 | 96.76530 | 0.036521 | 0.885183 | 0.002986
Thyroid | 1.3 | 1897.919 | 12.66968 | 0.586284 | 0.012291
Thyroid | 1.7 | 1902.059 | 12.02565 | 0.593960 | 0.018926
Wine | 1.3 | 16324.09 | 4.840715 | 0.725539 | 0.003500
Wine | 1.7 | 16324.46 | 3.876592 | 0.726945 | 0.003347
Heart | 1.3 | 9443.20 | 0.437336 | 0.570388 | 2.26 × 10^−16
Heart | 1.7 | 9443.32 | 0.521106 | 0.570388 | 2.25 × 10^−16
Spect | 1.3 | 556.721 | 0.856305 | 0.674101 | 3.9 × 10^−16
Spect | 1.7 | 556.781 | 0.716108 | 0.674101 | 3.3 × 10^−16
Diabetes | 1.3 | 72107.24 | 0.013673 | 0.546218 | 0
Diabetes | 1.7 | 72107.25 | 0.018186 | 0.546218 | 0
Hepatit | 1.3 | 9443.48 | 0.486373 | 0.570388 | 2.26 × 10^−16
Hepatit | 1.7 | 9443.36 | 0.564837 | 0.570388 | 2.25 × 10^−16
B. Tissue | 1.3 | 136510.8 | 3632.838 | 0.694662 | 0.023726
B. Tissue | 1.7 | 136198.0 | 3065.630 | 0.693646 | 0.020872
Parkinson | 1.3 | 16463.01 | 0.028410 | 0.630769 | 1.13 × 10^−16
Parkinson | 1.7 | 16463.01 | 0.024071 | 0.630769 | 1.12 × 10^−16
Somerville | 1.3 | 280.5583 | 0.011305 | 0.529597 | 0.005498
Somerville | 1.7 | 280.5567 | 0.011742 | 0.531164 | 0.004092
User Modeling | 1.3 | 98.15417 | 0.444100 | 0.674856 | 0.009166
User Modeling | 1.7 | 98.16058 | 0.455953 | 0.673770 | 0.010470
