
565 Results Found

  • Article
  • Open Access
5 Citations
3,570 Views
18 Pages

Consensus Big Data Clustering for Bayesian Mixture Models

  • Christos Karras,
  • Aristeidis Karras,
  • Konstantinos C. Giotopoulos,
  • Markos Avlonitis and
  • Spyros Sioutas

9 May 2023

In the context of big-data analysis, the clustering technique holds significant importance for the effective categorization and organization of extensive datasets. However, pinpointing the ideal number of clusters and handling high-dimensional data c...

  • Article
  • Open Access
11 Citations
2,855 Views
14 Pages

3 April 2020

It is necessary to optimize clustering processing of communication big data numerical attribute feature information in order to improve the ability of numerical attribute mining of communication big data, and thus a big data clustering algorithm base...

  • Article
  • Open Access
2 Citations
1,316 Views
19 Pages

21 January 2025

K-means clustering is a fundamental tool in data mining, yet its scalability and efficacy decline when faced with massive datasets. In this work, we introduce BiModalClust, a novel clustering algorithm that leverages a bimodal optimization paradigm t...

  • Article
  • Open Access
48 Citations
8,109 Views
20 Pages

Clustering is one of the most significant applications in the big data field. However, using the clustering technique with big data requires an ample amount of processing power and resources due to the complexity and resulting increment in the cluste...

  • Article
  • Open Access
7 Citations
2,844 Views
27 Pages

3 September 2022

Clustering of multi-source geospatial big data provides opportunities to comprehensively describe urban structures. Most existing studies focus only on the clustering of a single type of geospatial big data, which leads to biased results. Although mu...

  • Feature Paper
  • Article
  • Open Access
15 Citations
5,524 Views
23 Pages

15 August 2018

Unsupervised machine learning and knowledge discovery from large-scale datasets have recently attracted a lot of research interest. The present paper proposes a distributed big data clustering approach based on adaptive density estimation. The propos...

  • Article
  • Open Access
11 Citations
4,799 Views
19 Pages

PARSUC: A Parallel Subsampling-Based Method for Clustering Remote Sensing Big Data

  • Huiyu Xia,
  • Wei Huang,
  • Ning Li,
  • Jianzhong Zhou and
  • Dongying Zhang

5 August 2019

Remote sensing big data (RSBD) is generally characterized by huge volumes, diversity, and high dimensionality. Mining hidden information from RSBD for different applications imposes significant computational challenges. Clustering is an important dat...

  • Article
  • Open Access
5 Citations
2,214 Views
17 Pages

12 June 2023

Complex networks in reality are not just single-layer networks. The connection of nodes in an urban metro network includes two kinds of connections: line and passenger flow. In fact, it is a multilayer network. The line network constructed by the Spa...

  • Article
  • Open Access
1,981 Views
20 Pages

26 September 2023

The proliferation of the Internet and the widespread adoption of mobile devices have given rise to an immense volume of real-time trajectory big data. However, a single computer and conventional databases with limited scalability struggle to manage t...

  • Article
  • Open Access
2,383 Views
18 Pages

An Optimal-Transport-Based Multimodal Big Data Clustering

  • Zheng Yang,
  • Chongyang Shi and
  • Ying Guan

Multimodal clustering achieves outstanding performance in various applications by aggregating information from heterogeneous devices. However, previous methods rely on strong-notion distances to fuse crossmodal complementary knowledge, established on...

  • Article
  • Open Access
119 Citations
11,250 Views
29 Pages

Advances in Meta-Heuristic Optimization Algorithms in Big Data Text Clustering

  • Laith Abualigah,
  • Amir H. Gandomi,
  • Mohamed Abd Elaziz,
  • Husam Al Hamad,
  • Mahmoud Omari,
  • Mohammad Alshinwan and
  • Ahmad M. Khasawneh

This paper presents a comprehensive survey of the meta-heuristic optimization algorithms on the text clustering applications and highlights its main procedures. These Artificial Intelligence (AI) algorithms are recognized as promising swarm intellige...

  • Article
  • Open Access
19 Citations
6,287 Views
16 Pages

Big Data Usage in European Countries: Cluster Analysis Approach

  • Mirjana Pejić Bach,
  • Tine Bertoncel,
  • Maja Meško,
  • Dalia Suša Vugec and
  • Lucija Ivančić

12 March 2020

The goal of this research was to investigate the level of digital divide among selected European countries according to the big data usage among their enterprises. For that purpose, we apply the K-means clustering methodology on the Eurostat data abo...

  • Proceeding Paper
  • Open Access
3,218 Views
10 Pages

A new Big Data cluster method was developed to forecast the hotel accommodation market. The simulation and training of time series data are from January 2008 to December 2019 for the Spanish case. Applying the Hierarchical and Sequential Clustering A...

  • Article
  • Open Access
25 Citations
5,332 Views
15 Pages

Detecting and Evaluating Urban Clusters with Spatiotemporal Big Data

  • Luliang Tang,
  • Jie Gao,
  • Chang Ren,
  • Xia Zhang,
  • Xue Yang and
  • Zihan Kan

23 January 2019

The design of urban clusters has played an important role in urban planning, but realizing the construction of these urban plans is quite a long process. Hence, how the progress is evaluated is significant for urban managers in the process of urban c...

  • Article
  • Open Access
5 Citations
6,884 Views
21 Pages

Heterogeneous Distributed Big Data Clustering on Sparse Grids

  • David Pfander,
  • Gregor Daiß and
  • Dirk Pflüger

7 March 2019

Clustering is an important task in data mining that has become more challenging due to the ever-increasing size of available datasets. To cope with these big data scenarios, a high-performance clustering approach is required. Sparse grid clustering i...

  • Article
  • Open Access
68 Citations
8,648 Views
20 Pages

Kernel Spectral Clustering for Big Data Networks

  • Raghvendra Mall,
  • Rocco Langone and
  • Johan A.K. Suykens

3 May 2013

This paper shows the feasibility of utilizing the Kernel Spectral Clustering (KSC) method for the purpose of community detection in big data networks. KSC employs a primal-dual framework to construct a model. It results in a powerful property of effe...

  • Article
  • Open Access
5 Citations
5,988 Views
31 Pages

Multi-Dimensional Validation of the Integration of Syntactic and Semantic Distance Measures for Clustering Fibromyalgia Patients in the Rheumatic Monitor Big Data Study

  • Ayelet Goldstein,
  • Yuval Shahar,
  • Michal Weisman Raymond,
  • Hagit Peleg,
  • Eldad Ben-Chetrit,
  • Arie Ben-Yehuda,
  • Erez Shalom,
  • Chen Goldstein,
  • Shmuel Shay Shiloh and
  • Galit Almoznino

This study primarily aimed at developing a novel multi-dimensional methodology to discover and validate the optimal number of clusters. The secondary objective was to deploy it for the task of clustering fibromyalgia patients. We present a comprehens...

  • Article
  • Open Access
9 Citations
2,236 Views
17 Pages

15 October 2021

In modern systems, there is a tendency to model issues more accurately with low computational cost and considering multiscale decision-making which increases the complexity of the optimization. Therefore, it is necessary to develop tools to cope with...

  • Article
  • Open Access
5,418 Views
15 Pages

5 April 2016

A multi-variable visualization technique on a 2D bitmap for big data is introduced. If A and B are two data points that are represented using two similar shapes with m pixels, where each shape is colored with RGB color of (0, 0, k), when A ∩ B ≠ ∅, a...

  • Article
  • Open Access
1,501 Views
35 Pages

Detecting Cyber Threats in UWF-ZeekDataFall22 Using K-Means Clustering in the Big Data Environment

  • Sikha S. Bagui,
  • Germano Correa Silva De Carvalho,
  • Asmi Mishra,
  • Dustin Mink,
  • Subhash C. Bagui and
  • Stephanie Eager

In an era marked by the rapid growth of the Internet of Things (IoT), network security has become increasingly critical. Traditional Intrusion Detection Systems, particularly signature-based methods, struggle to identify evolving cyber threats such a...

  • Article
  • Open Access
10 Citations
5,271 Views
21 Pages

Big Data Clustering via Community Detection and Hyperbolic Network Embedding in IoT Applications

  • Vasileios Karyotis,
  • Konstantinos Tsitseklis,
  • Konstantinos Sotiropoulos and
  • Symeon Papavassiliou

15 April 2018

In this paper, we present a novel data clustering framework for big sensory data produced by IoT applications. Based on a network representation of the relations among multi-dimensional data, data clustering is mapped to node clustering over the prod...

  • Article
  • Open Access
1 Citation
3,042 Views
20 Pages

30 August 2023

With the development of internet technology, the number of illicit websites such as gambling and pornography has dramatically increased, posing serious threats to people’s physical and mental health, as well as their financial security. Current...

  • Article
  • Open Access
21 Citations
5,603 Views
22 Pages

Recent technological advancements in geomatics and mobile sensing have led to various urban big data, such as Tencent street view (TSV) photographs; yet, the urban objects in the big dataset have hitherto been inadequately exploited. This paper aims...

  • Article
  • Open Access
57 Citations
7,483 Views
34 Pages

3 March 2023

Big-medical-data classification and image detection are crucial tasks in the field of healthcare, as they can assist with diagnosis, treatment planning, and disease monitoring. Logistic regression and YOLOv4 are popular algorithms that can be used fo...

  • Article
  • Open Access
106 Citations
16,752 Views
12 Pages

A New K-Nearest Neighbors Classifier for Big Data Based on Efficient Data Pruning

  • Hamid Saadatfar,
  • Samiyeh Khosravi,
  • Javad Hassannataj Joloudari,
  • Amir Mosavi and
  • Shahaboddin Shamshirband

20 February 2020

The K-nearest neighbors (KNN) machine learning algorithm is a well-known non-parametric classification method. However, like other traditional data mining methods, applying it on big data comes with computational challenges. Indeed, KNN determines th...

  • Article
  • Open Access
77 Citations
12,247 Views
26 Pages

28 January 2020

Recently, the popularity of big data as a research field has shown continuous and wide-scale growth. This study aims to capture the scientific structure and topic evolution of big data research using bibliometrics and text mining-based analysis metho...

  • Article
  • Open Access
1 Citation
2,572 Views
16 Pages

13 December 2022

The emergence of geospatial big data has opened up new avenues for identifying urban environments. Although both geographic information systems (GIS) and expert systems (ES) have been useful in resolving geographical decision issues, they are not wit...

  • Article
  • Open Access
27 Citations
3,943 Views
12 Pages

Clustering Neutrosophic Data Sets and Neutrosophic Valued Metric Spaces

  • Ferhat Taş,
  • Selçuk Topal and
  • Florentin Smarandache

24 September 2018

In this paper, we define the neutrosophic valued (and generalized or G) metric spaces for the first time. Besides, we newly determine a mathematical model for clustering the neutrosophic big data sets using G-metric. Furthermore, relative weighted ne...

  • Article
  • Open Access
9 Citations
4,116 Views
14 Pages

17 April 2019

Given the issues relating to big data and privacy-preserving challenges, distributed data mining (DDM) has received much attention recently. Here, we focus on the clustering problem of distributed environments. Several distributed clustering algorith...

  • Article
  • Open Access
15 Citations
3,277 Views
21 Pages

Achieving Differential Privacy Publishing of Location-Based Statistical Data Using Grid Clustering

  • Yan Yan,
  • Zichao Sun,
  • Adnan Mahmood,
  • Fei Xu,
  • Zhuoyue Dong and
  • Quan Z. Sheng

Statistical partitioning and publishing is commonly used in location-based big data services to address queries such as the number of points of interest, available vehicles, traffic flows, infected patients, etc., within a certain range. Adding noise...

  • Article
  • Open Access
6 Citations
4,325 Views
16 Pages

Parallelism Strategies for Big Data Delayed Transfer Entropy Evaluation

  • Jonas R. Dourado,
  • Jordão Natal de Oliveira Júnior and
  • Carlos D. Maciel

9 September 2019

The volume of generated and collected data has been rising with the popularization of technologies such as the Internet of Things, social media, and smartphones, giving rise to the term big data. One class of hidden information in big data is causality. Among the tools to...

  • Article
  • Open Access
3 Citations
2,907 Views
26 Pages

31 August 2023

Relying on user-generated content narrating individual experiences and personalized contextualization of location-specific realities, this study introduced a novel methodological approach and analysis tool that can aid health informatics in understan...

  • Article
  • Open Access
92 Citations
12,633 Views
19 Pages

Big Data Analytics for Discovering Electricity Consumption Patterns in Smart Cities

  • Rubén Pérez-Chacón,
  • José M. Luna-Romera,
  • Alicia Troncoso,
  • Francisco Martínez-Álvarez and
  • José C. Riquelme

18 March 2018

New technologies such as sensor networks have been incorporated into the management of buildings for organizations and cities. Sensor networks have led to an exponential increase in the volume of data available in recent years, which can be used to e...

  • Article
  • Open Access
17 Citations
4,099 Views
30 Pages

23 October 2021

Enormous heterogeneous sensory data are generated in the Internet of Things (IoT) for various applications. These big data are characterized by additional features related to IoT, including trustworthiness, timing and spatial features. This reveals m...

  • Article
  • Open Access
42 Citations
9,386 Views
18 Pages

24 January 2022

Since the turn of the millennium, the volume of data has increased significantly in both industries and scientific institutions. The processing of these volumes and variety of data we are dealing with are unlikely to be accomplished with conventional...

  • Article
  • Open Access
4 Citations
2,850 Views
19 Pages

A PID-Based kNN Query Processing Algorithm for Spatial Data

  • Baiyou Qiao,
  • Ling Ma,
  • Linlin Chen and
  • Bing Hu

9 October 2022

As a popular spatial operation, the k-Nearest Neighbors (kNN) query is widely used in various spatial application systems. How to efficiently process a kNN query on spatial big data has always been an important research topic in the field of spatial...

  • Article
  • Open Access
11 Citations
3,851 Views
23 Pages

A Framework of Modeling Large-Scale Wireless Sensor Networks for Big Data Collection

  • Asside Christian Djedouboum,
  • Ado Adamou Abba Ari,
  • Abdelhak Mourad Gueroui,
  • Alidou Mohamadou,
  • Ousmane Thiare and
  • Zibouda Aliouat

3 July 2020

Large Scale Wireless Sensor Networks (LS-WSNs) are Wireless Sensor Networks (WSNs) composed of an impressive number of sensors, with inherent detection and processing capabilities, to be deployed over large areas of interest. The deployment of a very...

  • Article
  • Open Access
2 Citations
5,461 Views
17 Pages

Continuous Learning Graphical Knowledge Unit for Cluster Identification in High Density Data Sets

  • K.K.L.B. Adikaram,
  • Mohamed A. Hussein,
  • Mathias Effenberger and
  • Thomas Becker

14 December 2016

Big data are visually cluttered by overlapping data points. Rather than removing, reducing or reformulating overlap, we propose a simple, effective and powerful technique for density cluster generation and visualization, where point marker (graphical...

  • Article
  • Open Access
8 Citations
4,176 Views
21 Pages

Scalable Clustering of Complex ECG Health Data: Big Data Clustering Analysis with UMAP and HDBSCAN

  • Vladislav Kaverinskiy,
  • Illya Chaikovsky,
  • Anton Mnevets,
  • Tatiana Ryzhenko,
  • Mykhailo Bocharov and
  • Kyrylo Malakhov

This study explores the potential of unsupervised machine learning algorithms to identify latent cardiac risk profiles by analyzing ECG-derived parameters from two general groups: clinically healthy individuals (Norm dataset, n = 14,863) and patients...

  • Feature Paper
  • Article
  • Open Access
1 Citation
2,899 Views
11 Pages

26 November 2022

Many parts of big data, such as web documents, online posts, papers, patents, and articles, are in text form. So, the analysis of text data in the big data domain is an important task. Many methods based on statistics or machine learning algorithms h...

  • Article
  • Open Access
4 Citations
2,333 Views
24 Pages

Applying Parallel and Distributed Models on Bio-Inspired Algorithms via a Clustering Method

  • Álvaro Gómez-Rubio,
  • Ricardo Soto,
  • Broderick Crawford,
  • Adrián Jaramillo,
  • David Mancilla,
  • Carlos Castro and
  • Rodrigo Olivares

16 January 2022

In the world of optimization, especially concerning metaheuristics, solving complex problems represented by applying big data and constraint instances can be difficult. This is mainly due to the difficulty of implementing efficient solutions that can...

  • Article
  • Open Access
5 Citations
1,906 Views
27 Pages

Algebraic Multi-Layer Network: Key Concepts

  • Igor Khanykov,
  • Vadim Nenashev and
  • Mikhail Kharinov

The paper refers to interdisciplinary research in the areas of hierarchical cluster analysis of big data and ordering of primary data to detect objects in a color or in a grayscale image. To perform this on a limited domain of multidimensional data,...

  • Article
  • Open Access
3 Citations
2,618 Views
15 Pages

12 August 2021

In view of the practical application requirements for the rapid expansion of electric taxis (ETs) and the reasonable planning of charging stations, this paper presents a method for mining latent semantic correlation of large data by the trajectory of...

  • Article
  • Open Access
7 Citations
3,311 Views
10 Pages

16 January 2023

Due to the dwindling maintenance budget and lack of qualified bridge inspectors, bridge-management agencies in Taiwan need to develop cost-effective maintenance and inspection strategies to preserve the safety and functionality of their aging, natura...

  • Review
  • Open Access
30 Citations
10,491 Views
16 Pages

We are now generating exponentially more data from more sources than a few years ago. Big data, an already familiar term, has been generally defined as a massive volume of structured, semi-structured, and/or unstructured data, which may not be effect...

  • Article
  • Open Access
298 Views
18 Pages

14 January 2026

Detecting fraudulent and anomalous transactions in large-scale digital payment systems is significantly challenging due to severe class imbalance and the fact that transactional risk is tightly coupled to the historical interactions and behaviors of...

  • Article
  • Open Access
16 Citations
5,259 Views
21 Pages

19 October 2019

Although industrial agglomeration and specialization have been studied for more than 100 years, it is still a controversial field. In the era of big data, it is of great significance to study industrial agglomeration and regional specialization by us...

  • Article
  • Open Access
1 Citation
2,261 Views
16 Pages

A Novel Stream Mining Approach as Stream-Cluster Feature Tree Algorithm: A Case Study in Turkish Job Postings

  • Yunus Doğan,
  • Feriştah Dalkılıç,
  • Alp Kut,
  • Kemal Can Kara and
  • Uygar Takazoğlu

6 August 2022

Large numbers of job postings with complex content can be found on the Internet at present. Therefore, analysis through natural language processing and machine learning techniques plays an important role in the evaluation of job postings. In this stu...

  • Article
  • Open Access
2 Citations
2,284 Views
27 Pages

More detailed and precise mobility patterns are needed for policies to reduce monomodal automotive dependency and promote multimodality in travel behaviors. Yet, empirical evidence from an integrated view of a complete door-to-door trip mode chain wi...

  • Proceeding Paper
  • Open Access
1,719 Views
11 Pages

The color of a cityscape plays a significant role in its atmosphere; however, the traditional city color analysis methods cover a wide range but are not precise enough, requiring field sampling, a lot of manual comparisons, and lacking quantitative a...
