49,643 Results Found

  • Article
  • Open Access
1 Citation
1,360 Views
29 Pages

Automated Exploratory Clustering to Democratize Clustering Analysis

  • Georg Stefan Schlake,
  • Max Pernklau and
  • Christian Beecks

18 June 2025

AutoML is enabling many practitioners to use sophisticated Machine Learning pipelines even without being experienced in building application-specific solutions. Adapting AutoML to the field of unsupervised learning, particularly to the task of cluste...

  • Article
  • Open Access
93 Citations
11,574 Views
14 Pages

Comparison of Internal Clustering Validation Indices for Prototype-Based Clustering

  • Joonas Hämäläinen,
  • Susanne Jauhiainen and
  • Tommi Kärkkäinen

6 September 2017

Clustering is an unsupervised machine learning and pattern recognition method. In general, in addition to revealing hidden groups of similar observations and clusters, their number needs to be determined. Internal clustering validation indices estima...

  • Article
  • Open Access
9 Citations
4,092 Views
13 Pages

15 January 2024

In the era of big data, unsupervised learning algorithms such as clustering are particularly prominent. In recent years, there have been significant advancements in clustering algorithm research. The Clustering by Density Peaks algorithm is known as...

  • Article
  • Open Access
1,620 Views
15 Pages

Entropy-Randomized Clustering

  • Yuri S. Popkov,
  • Yuri A. Dubnov and
  • Alexey Yu. Popkov

10 October 2022

This paper proposes a clustering method based on a randomized representation of an ensemble of possible clusters with a probability distribution. The concept of a cluster indicator is introduced as the average distance between the objects included in...

  • Feature Paper
  • Article
  • Open Access
17 Citations
4,134 Views
24 Pages

14 July 2020

Density peak clustering (DPC) is a density-based clustering method that has attracted much attention in the academic community. DPC works by first searching density peaks in the dataset, and then assigning each data point to the same cluster as its n...

  • Article
  • Open Access
3 Citations
1,309 Views
27 Pages

Graph images, which represent data structures through nodes and edges, present significant challenges for clustering due to their intricate topological properties. Traditional clustering algorithms, such as K-means and Density-Based Spatial Clustering...

  • Article
  • Open Access
3 Citations
2,643 Views
15 Pages

30 August 2021

In an era of big data, face images captured in social media and forensic investigations, etc., generally lack labels, while the number of identities (clusters) may range from a few dozen to thousands. Therefore, it is of practical importance to clust...

  • Article
  • Open Access
24 Citations
6,529 Views
17 Pages

Semantic Clustering of Functional Requirements Using Agglomerative Hierarchical Clustering

  • Hamzeh Eyal Salman,
  • Mustafa Hammad,
  • Abdelhak-Djamel Seriai and
  • Ahed Al-Sbou

3 September 2018

Software applications have become a fundamental part in the daily work of modern society as they meet different needs of users in different domains. Such needs are known as software requirements (SRs) which are separated into functional (software ser...

  • Article
  • Open Access
9 Citations
3,835 Views
31 Pages

Positive and Negative Evidence Accumulation Clustering for Sensor Fusion: An Application to Heartbeat Clustering

  • David G. Márquez,
  • Paulo Félix,
  • Constantino A. García,
  • Javier Tejedor,
  • Ana L.N. Fred and
  • Abraham Otero

24 October 2019

In this work, a new clustering algorithm especially geared towards merging data arising from multiple sensors is presented. The algorithm, called PN-EAC, is based on the ensemble clustering paradigm and it introduces the novel concept of negative evi...

  • Feature Paper
  • Article
  • Open Access
1 Citation
1,631 Views
31 Pages

Clustering Validation Inference

  • Pau Figuera,
  • Alfredo Cuzzocrea and
  • Pablo García Bringas

27 July 2024

Clustering validation is applied to evaluate the quality of classifications. This step is crucial for unsupervised machine learning. A plethora of methods exist for this purpose; however, a common drawback is that statistical inference is not possibl...

  • Article
  • Open Access
30 Citations
11,115 Views
15 Pages

Incremental Clustering of News Reports

  • Joel Azzopardi and
  • Christopher Staff

24 August 2012

When an event occurs in the real world, numerous news reports describing this event start to appear on different news sites within a few minutes of the event occurrence. This may result in a huge amount of information for users, and automated process...

  • Article
  • Open Access
28 Citations
4,528 Views
16 Pages

How the Outliers Influence the Quality of Clustering?

  • Agnieszka Nowak-Brzezińska and
  • Igor Gaibei

30 June 2022

In this article, we evaluate the efficiency and performance of two clustering algorithms: AHC (Agglomerative Hierarchical Clustering) and KMeans. We are aware that there are various linkage options and distance measures that influence the clus...

  • Article
  • Open Access
2 Citations
2,311 Views
20 Pages

The α-Groups under Condorcet Clustering

  • Tarik Faouzi,
  • Luis Firinguetti-Limone,
  • José Miguel Avilez-Bozo and
  • Rubén Carvajal-Schiaffino

24 February 2022

We introduce a new approach to clustering categorical data: Condorcet clustering with a fixed number of groups, denoted α-Condorcet. Like k-modes, this approach is essentially based on similarity and dissimilarity measures. The paper is divided i...

  • Article
  • Open Access
1 Citation
4,147 Views
28 Pages

Insurance Analytics with Clustering Techniques

  • Charlotte Jamotton,
  • Donatien Hainaut and
  • Thomas Hames

5 September 2024

The K-means algorithm and its variants are well-known clustering techniques. In actuarial applications, these partitioning methods can identify clusters of policies with similar attributes. The resulting partitions provide an actuarial framework for...

  • Article
  • Open Access
12 Citations
3,536 Views
23 Pages

28 January 2024

Fuzzy clustering, as a powerful method for pattern recognition and data analysis, often produces complex results that require careful examination of individual clusters. In this paper, advanced visualization techniques are presented that aim to facil...

  • Article
  • Open Access
3 Citations
2,103 Views
18 Pages

A chemical reaction and its reaction environment are intrinsically linked, especially within the confines of narrow cellular spaces. Traditional models of chemical reactions often use differential equations with concentration as the primary variable,...

  • Article
  • Open Access
18 Citations
3,434 Views
20 Pages

5 September 2022

As an important part of intelligent transportation systems, traffic state classification plays a vital role for traffic managers when formulating measures to alleviate traffic congestion. The proliferation of traffic data brings new opportunities for...

  • Article
  • Open Access
5 Citations
4,305 Views
15 Pages

15 September 2020

The goal of partitioning clustering analysis is to divide a dataset into a predetermined number of homogeneous clusters. The quality of final clusters from a prototype-based partitioning algorithm is highly affected by the initially chosen centroids....

  • Article
  • Open Access
1 Citation
1,022 Views
13 Pages

A Validity Index for Clustering Evaluation by Grid Structures

  • Jiachen Wang,
  • Zuojing Zhang and
  • Shihong Yue

20 March 2025

The evaluation of clustering results plays an important role in clustering analysis. Most existing indexes are designed for the evaluation of results from the most-used K-means clustering algorithm, which can identify only spherical clusters rather than...

  • Article
  • Open Access
3 Citations
2,075 Views
11 Pages

Double-Constraint Fuzzy Clustering Algorithm

  • Shiyuan Zhu,
  • Yuwei Zhao and
  • Shihong Yue

18 February 2024

Given a set of data objects, the fuzzy c-means (FCM) partitional clustering algorithm is favored due to easy implementation, rapid response, and feasible optimization. However, FCM fails to reflect either the importance degree of the individual data...

  • Article
  • Open Access
2,898 Views
16 Pages

29 June 2021

In the conventional k-means framework, seeding is the first step toward optimization before the objects are clustered. In random seeding, two main issues arise: the clustering results may be less than optimal and different clustering results may be o...

  • Article
  • Open Access
2 Citations
2,911 Views
17 Pages

12 December 2018

A scholarly article often discusses multiple research issues. The clustering of scholarly articles based on research issues can facilitate analyses of related articles on specific issues in scientific literature. It is a task of overlapping clusterin...

  • Article
  • Open Access
2,149 Views
17 Pages

30 July 2024

Structural graph clustering is a data analysis technique that groups nodes within a graph based on their connectivity and structural similarity. The Structural graph clustering SCAN algorithm, a density-based clustering method, effectively identifies...

  • Article
  • Open Access
3 Citations
1,518 Views
15 Pages

Research on Automatic Alignment for Corn Harvesting Based on Euclidean Clustering and K-Means Clustering

  • Bin Zhang,
  • Hao Xu,
  • Kunpeng Tian,
  • Jicheng Huang,
  • Fanting Kong,
  • Senlin Mu,
  • Teng Wu,
  • Zhongqiu Mu,
  • Xingsong Wang and
  • Deqiang Zhou

18 November 2024

Aiming to meet the growing need for automated harvesting, an automatic alignment method based on Euclidean clustering and K-means clustering is proposed to address issues of driver fatigue and inaccurate driving in manually operated corn harvesters....

  • Article
  • Open Access
5 Citations
2,335 Views
20 Pages

A Novel Dynamic Transmission Power of Cluster Heads Based Clustering Scheme

  • Mengchu Nie,
  • Pingmu Huang,
  • Jie Zeng,
  • Yueming Lu,
  • Tao Zhang and
  • Tiejun Lv

Clustering methods are promising tools for ensuring the network scalability and maintainability of large-scale flying ad hoc networks (FANETs). However, due to the high mobility and limited energy resources of unmanned aerial vehicles (UAVs), it is d...

  • Article
  • Open Access
23 Citations
4,753 Views
26 Pages

6 December 2020

The clustering analysis algorithm is used to reveal the internal relationships among the data without prior knowledge and to further gather some data with common attributes into a group. In order to solve the problem that the existing algorithms alwa...

  • Article
  • Open Access
20 Citations
5,317 Views
14 Pages

4 August 2020

The aim of the study was to group the lactation curve (LC) of Holstein cows in several clusters based on their milking characteristics and to investigate physiological differences among the clusters. Milking data of 330 lactations which have a milk y...

  • Article
  • Open Access
1 Citation
2,210 Views
39 Pages

Directions for the Sustainability of Innovative Clustering in a Country

  • Vito Bobek,
  • Vladislav Streltsov and
  • Tatjana Horvat

15 February 2023

This paper presents potential improvements through utilizing the cyclical nature of clusters by human capital, technology, policies, and management. A historical review of the formation and sustainable development of clusters in the US, Europe, Japan...

  • Article
  • Open Access
5 Citations
3,374 Views
18 Pages

Improved Selective Deep-Learning-Based Clustering Ensemble

  • Yue Qian,
  • Shixin Yao,
  • Tianjun Wu,
  • You Huang and
  • Lingbin Zeng

15 January 2024

Clustering ensemble integrates multiple base clustering results to improve the stability and robustness of the single clustering method. It consists of two principal steps: a generation step, which is about the creation of base clusterings, and a con...

  • Article
  • Open Access
5 Citations
2,964 Views
22 Pages

Ising-Based Kernel Clustering

  • Masahito Kumagai,
  • Kazuhiko Komatsu,
  • Masayuki Sato and
  • Hiroaki Kobayashi

19 April 2023

Combinatorial clustering based on the Ising model is drawing attention as a high-quality clustering method. However, conventional Ising-based clustering methods using the Euclidean distance cannot handle irregular data. To overcome this problem,...

  • Article
  • Open Access
12 Citations
3,729 Views
27 Pages

30 November 2022

Metaheuristic algorithms have been hybridized with the standard K-means to address the latter’s challenges in finding a solution to automatic clustering problems. However, the distance calculations required in the standard K-means phase of the...

  • Article
  • Open Access
7 Citations
2,754 Views
33 Pages

24 March 2025

Efficient energy management relies on uncovering meaningful consumption patterns from large-scale electricity load demand profiles. With the widespread adoption of sensor technologies such as smart meters and IoT-based monitoring systems, granular an...

  • Article
  • Open Access
12 Citations
9,399 Views
16 Pages

Information Theoretic Hierarchical Clustering

  • Mehdi Aghagolzadeh,
  • Hamid Soltanian-Zadeh and
  • Babak Nadjar Araabi

10 February 2011

Hierarchical clustering has been extensively used in practice, where clusters can be assigned and analyzed simultaneously, especially when estimating the number of clusters is challenging. However, due to the conventional proximity measures recruited...

  • Article
  • Open Access
116 Views
21 Pages

17 March 2026

Deep clustering aims to boost clustering performance by learning powerful representations via deep learning. Despite their superiority over conventional shallow algorithms, autoencoder-based methods are typically hindered by heavy dependencies on lar...

  • Article
  • Open Access
12 Citations
3,558 Views
15 Pages

A Novel Neighborhood Granular Meanshift Clustering Algorithm

  • Qiangqiang Chen,
  • Linjie He,
  • Yanan Diao,
  • Kunbin Zhang,
  • Guoru Zhao and
  • Yumin Chen

31 December 2022

The most popular algorithms used in unsupervised learning are clustering algorithms. Clustering algorithms are used to group samples into a number of classes or clusters based on the distances of the given sample features. Therefore, how to define th...

  • Article
  • Open Access
10 Citations
3,153 Views
19 Pages

9 December 2022

Block modeling is an effective way to understand Earth’s crustal deformation. However, the choice of block boundaries and the number of blocks affect the model results. Therefore, the subjectivity of this analysis should be avoided. Clustering...

  • Article
  • Open Access
1 Citation
4,779 Views
28 Pages

28 September 2017

Closeness measures are crucial to clustering methods. In most traditional clustering methods, the closeness between data points or clusters is measured by the geometric distance alone. These metrics quantify the closeness only based on the concerned...

  • Article
  • Open Access
8 Citations
4,176 Views
21 Pages

Scalable Clustering of Complex ECG Health Data: Big Data Clustering Analysis with UMAP and HDBSCAN

  • Vladislav Kaverinskiy,
  • Illya Chaikovsky,
  • Anton Mnevets,
  • Tatiana Ryzhenko,
  • Mykhailo Bocharov and
  • Kyrylo Malakhov

This study explores the potential of unsupervised machine learning algorithms to identify latent cardiac risk profiles by analyzing ECG-derived parameters from two general groups: clinically healthy individuals (Norm dataset, n = 14,863) and patients...

  • Article
  • Open Access
1 Citation
1,266 Views
17 Pages

Fair Spectral Clustering Based on Coordinate Descent

  • Ruixin Feng,
  • Caiming Zhong and
  • Tiejun Pan

25 December 2024

Research on the fairness of spectral clustering has gradually attracted increasing attention. Normally, existing methods of fair spectral clustering add a fairness constraint to the original objective function so that fairness is guaranteed. However, similar to...

  • Article
  • Open Access
5 Citations
4,794 Views
16 Pages

Spectral Embedded Deep Clustering

  • Yuichiro Wada,
  • Shugo Miyamoto,
  • Takumi Nakagama,
  • Léo Andéol,
  • Wataru Kumagai and
  • Takafumi Kanamori

15 August 2019

We propose a new clustering method based on a deep neural network. Given an unlabeled dataset and the number of clusters, our method directly groups the dataset into the given number of clusters in the original space. We use a conditional discrete pr...

  • Article
  • Open Access
10 Citations
4,149 Views
17 Pages

Intra-Storm Pattern Recognition through Fuzzy Clustering

  • Konstantinos Vantas and
  • Epaminondas Sidiropoulos

The identification and recognition of temporal rainfall patterns is important and useful not only for climatological studies, but mainly for supporting rainfall–runoff modeling and water resources management. Clustering techniques applied to rainfall...

  • Article
  • Open Access
610 Views
20 Pages

24 April 2025

The performance prediction of the Five-hundred-meter Aperture Spherical radio Telescope (FAST) project represents one of the primary challenges faced by the system. To address the performance prediction issues of the FAST hydraulic actuator cluster s...

  • Article
  • Open Access
532 Views
27 Pages

26 November 2025

Deep clustering aims to discover meaningful data groups by jointly learning representations and cluster probability distributions. Yet existing methods rarely consider the underlying information characteristics of these distributions, causing ambigui...

  • Article
  • Open Access
44 Citations
7,838 Views
26 Pages

MorphoCluster: Efficient Annotation of Plankton Images by Clustering

  • Simon-Martin Schröder,
  • Rainer Kiko and
  • Reinhard Koch

28 May 2020

In this work, we present MorphoCluster, a software tool for data-driven, fast, and accurate annotation of large image data sets. While already having surpassed the annotation rate of human experts, volume and complexity of marine data will continue t...

  • Article
  • Open Access
2 Citations
2,158 Views
17 Pages

12 May 2021

In this paper, a novel algorithm (IBC1) for graph clustering with no prior assumption of the number of clusters is introduced. Furthermore, an additional algorithm (IBC2) for graph clustering when the number of clusters is given beforehand is present...

  • Article
  • Open Access
6 Citations
2,466 Views
17 Pages

20 July 2023

This paper proposes two algorithms for clustering data, which are variable-sized sets of elementary items. An example of such data occurs in the analysis of a medical diagnosis, where the goal is to detect human subjects who share common diseases to...
