Pattern Recognition and Data Clustering in Information Theory

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Information Theory, Probability and Statistics".

Deadline for manuscript submissions: closed (30 November 2023) | Viewed by 47191

Special Issue Editors

Dr. Francisco J. Gallegos-Funes

Guest Editor

Higher School of Mechanical and Electrical Engineering (ESIME), National Polytechnic Institute of Mexico (Instituto Politécnico Nacional, IPN), Mexico City 07738, CDMX, Mexico
Interests: pattern recognition; artificial intelligence; neural networks; image processing; segmentation

Dr. Alberto J. Rosales Silva

Guest Editor

Higher School of Mechanical and Electrical Engineering (ESIME), National Polytechnic Institute of Mexico (Instituto Politécnico Nacional, IPN), Mexico City 07738, CDMX, Mexico
Interests: image processing; real-time processing; computer vision; deep learning

Special Issue Information

Dear Colleagues,

This Special Issue on Pattern Recognition and Data Clustering in Information Theory focuses on applying specialized algorithms to signals acquired by different sensors in order to solve problems related to the automated recognition of patterns and regularities in data in the fields of engineering and computer science.

In pattern recognition, data analysis is closely tied to predictive modeling, which uses training data to predict the behavior of unseen test data. This task is known as “learning”. One type of learning problem can be solved using clustering.

Clustering is the process of partitioning a set of objects (pattern vectors) into subsets of similar objects called clusters. Common clustering models include connectivity models (hierarchical clustering), centroid models (k-means and fuzzy C-means), distribution models (multivariate normal distributions used by the expectation-maximization algorithm), density models (DBSCAN and OPTICS), subspace models (biclustering), graph-based models (HCS), and neural models (artificial neural networks, self-organizing maps, and principal component analysis). In recent years, considerable effort has been put into improving the performance of existing clustering-based algorithms and developing new methods.
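
As an illustration of the centroid models mentioned above, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. The function names `squared_distance` and `kmeans`, the `max_iters` parameter, and the choice to seed the centroids with the first k points are assumptions made for this example; random or k-means++ seeding is more common in practice.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iters=100):
    """Partition `points` into k clusters; returns (centroids, labels)."""
    # Simplification for this sketch: seed with the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The two well-separated groups end up in different clusters even though both initial centroids come from the first group, because the update step pulls one centroid toward the distant points.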

The goal of the Special Issue is to collect original clustering-based research papers that develop or apply new theory to solve issues, for example, in the fields of artificial vision, signal and image processing, information retrieval, data compression, computer graphics, and machine learning. Topics of interest include, but are not limited to:

Filtering;
Enhancement and restoration;
Segmentation;
Classification and recognition.

Dr. Francisco J. Gallegos-Funes
Dr. Alberto J. Rosales Silva
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

information theory
data analysis
statistics
computing
machine learning and systems theory

Benefits of Publishing in a Special Issue

Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (10 papers)

Research

19 pages, 1924 KB

Open Access Article

Multiview Data Clustering with Similarity Graph Learning Guided Unsupervised Feature Selection

by Ni Li, Manman Peng and Qiang Wu

Entropy 2023, 25(12), 1606; https://doi.org/10.3390/e25121606 - 30 Nov 2023

Viewed by 2160

Abstract

In multiview data clustering, consistent or complementary information in the multiview data can achieve better clustering results. However, the high dimensions, lack of labeling, and redundancy of multiview data certainly affect the clustering effect, posing a challenge to multiview clustering. A clustering algorithm based on multiview feature selection clustering (MFSC), which combines similarity graph learning and unsupervised feature selection, is designed in this study. During the MFSC implementation, local manifold regularization is integrated into similarity graph learning, with the clustering label of similarity graph learning as the standard for unsupervised feature selection. MFSC can retain the characteristics of the clustering label on the premise of maintaining the manifold structure of multiview data. The algorithm is systematically evaluated using benchmark multiview and simulated data. The clustering experiment results prove that the MFSC algorithm is more effective than the traditional algorithm. Full article

(This article belongs to the Special Issue Pattern Recognition and Data Clustering in Information Theory)

17 pages, 8159 KB

Open Access Article

Denoising Vanilla Autoencoder for RGB and GS Images with Gaussian Noise

by Armando Adrián Miranda-González, Alberto Jorge Rosales-Silva, Dante Mújica-Vargas, Ponciano Jorge Escamilla-Ambrosio, Francisco Javier Gallegos-Funes, Jean Marie Vianney-Kinani, Erick Velázquez-Lozada, Luis Manuel Pérez-Hernández and Lucero Verónica Lozano-Vázquez

Entropy 2023, 25(10), 1467; https://doi.org/10.3390/e25101467 - 20 Oct 2023

Cited by 13 | Viewed by 3230

Abstract

Noise suppression algorithms have been used in various tasks such as computer vision, industrial inspection, and video surveillance, among others. Robust image processing systems need to be fed with images that are close to the real scene; however, external factors sometimes alter the data representing the captured image, which results in a loss of information. Procedures are therefore required to recover data that are as close as possible to the real scene. This research project proposes a Denoising Vanilla Autoencoding (DVA) architecture by means of unsupervised neural networks for Gaussian denoising in color and grayscale images. The methodology improves on other state-of-the-art architectures according to objective numerical results. Additionally, a validation set and a high-resolution noisy image set are used, which reveal that our proposal outperforms other types of neural networks responsible for suppressing noise in images. Full article

(This article belongs to the Special Issue Pattern Recognition and Data Clustering in Information Theory)

16 pages, 2442 KB

Open Access Article

Graph Clustering with High-Order Contrastive Learning

by Wang Li, En Zhu, Siwei Wang and Xifeng Guo

Entropy 2023, 25(10), 1432; https://doi.org/10.3390/e25101432 - 10 Oct 2023

Cited by 6 | Viewed by 3942

Abstract

Graph clustering is a fundamental and challenging task in unsupervised learning. It has achieved great progress due to contrastive learning. However, we find that there are two problems that need to be addressed: (1) The augmentations in most graph contrastive clustering methods are manual, which can result in semantic drift. (2) Contrastive learning is usually implemented on the feature level, ignoring the structure level, which can lead to sub-optimal performance. In this work, we propose a method termed Graph Clustering with High-Order Contrastive Learning (GCHCL) to solve these problems. First, we construct two views by Laplacian smoothing raw features with different normalizations and design a structure alignment loss to force these two views to be mapped into the same space. Second, we build a contrastive similarity matrix with two structure-based similarity matrices and force it to align with an identity matrix. In this way, our designed contrastive learning encompasses a larger neighborhood, enabling our model to learn clustering-friendly embeddings without the need for an extra clustering module. In addition, our model can be trained on a large dataset. Extensive experiments on five datasets validate the effectiveness of our model. For example, compared to the second-best baselines on four small and medium datasets, our model achieved an average improvement of 3% in accuracy. For the largest dataset, our model achieved an accuracy score of 81.92%, whereas the compared baselines encountered out-of-memory issues. Full article

(This article belongs to the Special Issue Pattern Recognition and Data Clustering in Information Theory)

16 pages, 36164 KB

Open Access Article

Enhancing Image Quality via Robust Noise Filtering Using Redescending M-Estimators

by Ángel Arturo Rendón-Castro, Dante Mújica-Vargas, Antonio Luna-Álvarez and Jean Marie Vianney Kinani

Entropy 2023, 25(8), 1176; https://doi.org/10.3390/e25081176 - 7 Aug 2023

Cited by 4 | Viewed by 2514

Abstract

In the field of image processing, noise represents an unwanted component that can occur during signal acquisition, transmission, and storage. In this paper, we introduce an efficient method that incorporates redescending M-estimators within the framework of Wiener estimation. The proposed approach effectively suppresses impulsive, additive, and multiplicative noise across varied densities. Our proposed filter operates on both grayscale and color images; it uses local information obtained from the Wiener filter and robust outlier rejection based on Insha and Hampel’s tripartite redescending influence functions. The effectiveness of the proposed method is verified through qualitative and quantitative results, using metrics such as PSNR, MAE, and SSIM. Full article

(This article belongs to the Special Issue Pattern Recognition and Data Clustering in Information Theory)
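
To make the idea of a redescending influence function concrete, here is a minimal sketch of Hampel's three-part function, used as a weight in a simple robust location estimate. It is not the filter proposed in the paper above: the Wiener stage and the Insha function are omitted, and the names `hampel_psi` and `robust_mean`, the thresholds `a`, `b`, `c`, and the MAD-based scale are illustrative assumptions.

```python
import statistics


def hampel_psi(r, a=1.7, b=3.4, c=8.5):
    """Hampel's three-part redescending influence function.

    Linear up to a, constant up to b, decreasing to zero at c, and
    exactly zero beyond c, so gross outliers have no influence.
    The thresholds are illustrative defaults, not values from the paper.
    """
    s = abs(r)
    sign = 1.0 if r >= 0 else -1.0
    if s <= a:
        return r
    if s <= b:
        return a * sign
    if s <= c:
        return a * sign * (c - s) / (c - b)
    return 0.0


def robust_mean(values, iters=20):
    """Iteratively reweighted mean with weights psi(r) / r."""
    mu = statistics.median(values)
    # Median absolute deviation as a scale; fall back to 1.0 if it is zero.
    scale = statistics.median([abs(v - mu) for v in values]) or 1.0
    for _ in range(iters):
        weights = []
        for v in values:
            r = (v - mu) / scale
            weights.append(1.0 if r == 0 else hampel_psi(r) / r)
        total = sum(weights)
        if total == 0:
            break  # every sample was rejected; keep the current estimate
        mu = sum(w * v for w, v in zip(weights, values)) / total
    return mu


print(robust_mean([10, 11, 9, 10, 200]))  # 10.0, the outlier 200 gets weight 0
```

In an image filter, the same weighting would typically be applied to the pixels of a local window, so that impulsive values far from the local estimate contribute nothing to the output.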

32 pages, 9097 KB

Open Access Article

Benign and Malignant Breast Tumor Classification in Ultrasound and Mammography Images via Fusion of Deep Learning and Handcraft Features

by Clara Cruz-Ramos, Oscar García-Avila, Jose-Agustin Almaraz-Damian, Volodymyr Ponomaryov, Rogelio Reyes-Reyes and Sergiy Sadovnychiy

Entropy 2023, 25(7), 991; https://doi.org/10.3390/e25070991 - 28 Jun 2023

Cited by 51 | Viewed by 10141

Abstract

Breast cancer is a disease that affects women in different countries around the world. The real cause of breast cancer is particularly challenging to determine, and early detection of the disease is necessary for reducing the death rate, due to the high risks associated with breast cancer. Treatment in the early period can increase the life expectancy and quality of life for women. CAD (Computer Aided Diagnostic) systems can perform the diagnosis of the benign and malignant lesions of breast cancer using technologies and tools based on image processing, helping specialist doctors to obtain a more precise point of view with fewer processes when making their diagnosis by giving a second opinion. This study presents a novel CAD system for automated breast cancer diagnosis. The proposed method consists of different stages. In the preprocessing stage, an image is segmented, and a mask of a lesion is obtained; during the next stage, the extraction of the deep learning features is performed by a CNN—specifically, DenseNet 201. Additionally, handcrafted features (Histogram of Oriented Gradients (HOG)-based, ULBP-based, perimeter area, area, eccentricity, and circularity) are obtained from an image. The designed hybrid system uses CNN architecture for extracting deep learning features, along with traditional methods which perform several handcraft features, following the medical properties of the disease with the purpose of later fusion via proposed statistical criteria. During the fusion stage, where deep learning and handcrafted features are analyzed, the genetic algorithms as well as mutual information selection algorithm, followed by several classifiers (XGBoost, AdaBoost, Multilayer perceptron (MLP)) based on stochastic measures, are applied to choose the most sensible information group among the features. 
In the experimental validation of two modalities of the CAD design, which performed two types of medical studies—mammography (MG) and ultrasound (US)—the databases mini-DDSM (Digital Database for Screening Mammography) and BUSI (Breast Ultrasound Images Dataset) were used. Novel CAD systems were evaluated and compared with recent state-of-the-art systems, demonstrating better performance in commonly used criteria, obtaining ACC of 97.6%, PRE of 98%, Recall of 98%, F1-Score of 98%, and IBA of 95% for the abovementioned datasets. Full article

(This article belongs to the Special Issue Pattern Recognition and Data Clustering in Information Theory)

18 pages, 3785 KB

Open Access Article

Infrared Image Caption Based on Object-Oriented Attention

by Junfeng Lv, Tian Hui, Yongfeng Zhi and Yuelei Xu

Entropy 2023, 25(5), 826; https://doi.org/10.3390/e25050826 - 22 May 2023

Cited by 5 | Viewed by 2895

Abstract

With the ongoing development of image technology, the deployment of various intelligent applications on embedded devices has attracted increased attention in the industry. One such application is automatic image captioning for infrared images, which involves converting images into text. This practical task is widely used in night security, as well as for understanding night scenes and other scenarios. However, due to the differences in image features and the complexity of semantic information, generating captions for infrared images remains a challenging task. From the perspective of deployment and application, to improve the correlation between descriptions and objects, we introduced YOLOv6 and LSTM as an encoder-decoder structure and proposed an infrared image captioning method based on object-oriented attention. Firstly, to improve the domain adaptability of the detector, we optimized the pseudo-label learning process. Secondly, we proposed the object-oriented attention method to address the alignment problem between complex semantic information and embedded words. This method helps select the most crucial features of the object region and guides the caption model in generating words that are more relevant to the object. Our methods have shown good performance on infrared images and can produce words explicitly associated with the object regions located by the detector. The robustness and effectiveness of the proposed methods were demonstrated through evaluation on various datasets, along with other state-of-the-art methods. Our approach achieved BLEU-4 scores of 31.6 and 41.2 on the KAIST and Infrared City and Town datasets, respectively. Our approach provides a feasible solution for the deployment of embedded devices in industrial applications. Full article

(This article belongs to the Special Issue Pattern Recognition and Data Clustering in Information Theory)

28 pages, 22100 KB

Open Access Article

Adaptive Density Spatial Clustering Method Fusing Chameleon Swarm Algorithm

by Wei Zhou, Limin Wang, Xuming Han, Yizhang Wang, Yufei Zhang and Zhiyao Jia

Entropy 2023, 25(5), 782; https://doi.org/10.3390/e25050782 - 11 May 2023

Cited by 12 | Viewed by 3844

Abstract

The density-based spatial clustering of applications with noise (DBSCAN) algorithm is able to cluster arbitrarily structured datasets. However, the clustering result of this algorithm is exceptionally sensitive to the neighborhood radius (Eps) and noise points, and it is hard to obtain the best result quickly and accurately with it. To solve the above problems, we propose an adaptive DBSCAN method based on the chameleon swarm algorithm (CSA-DBSCAN). First, we take the clustering evaluation index of the DBSCAN algorithm as the objective function and use the chameleon swarm algorithm (CSA) to iteratively optimize the evaluation index value of the DBSCAN algorithm to obtain the best Eps value and clustering result. Then, we introduce the theory of deviation in the data point spatial distance of the nearest neighbor search mechanism to assign the identified noise points, which solves the problem of over-identification of the algorithm noise points. Finally, we construct color image superpixel information to improve the CSA-DBSCAN algorithm’s performance regarding image segmentation. The simulation results of synthetic datasets, real-world datasets, and color images show that the CSA-DBSCAN algorithm can quickly find accurate clustering results and segment color images effectively. The CSA-DBSCAN algorithm has certain clustering effectiveness and practicality. Full article

(This article belongs to the Special Issue Pattern Recognition and Data Clustering in Information Theory)
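
The sensitivity to Eps described in the abstract above can be seen with a minimal sketch of plain DBSCAN in pure Python. This is the standard algorithm only, not CSA-DBSCAN: the swarm-based Eps search and the noise reassignment step are not reproduced, and the function name `dbscan`, the parameter names, and the toy data are assumptions for illustration.

```python
def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point: 0, 1, ... or -1 for noise.

    A point is a core point if at least `min_pts` points (itself included)
    lie within distance `eps` of it.
    """
    def neighbors(i):
        return [
            j for j, q in enumerate(points)
            if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps ** 2
        ]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue  # already assigned; do not expand again
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, so keep growing
    return labels


points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,), (50.0,)]
print(dbscan(points, eps=0.5, min_pts=2))  # [-1, -1, -1, -1, -1, -1, -1]
print(dbscan(points, eps=1.5, min_pts=2))  # [0, 0, 0, 1, 1, 1, -1]
print(dbscan(points, eps=8.5, min_pts=2))  # [0, 0, 0, 0, 0, 0, -1]
```

On the same seven points, a radius that is too small labels everything as noise and one that is too large merges the two groups, which is the behavior that motivates searching for Eps automatically.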

19 pages, 410 KB

Open Access Article

An Ensemble and Multi-View Clustering Method Based on Kolmogorov Complexity

by Juan Zamora and Jérémie Sublime

Entropy 2023, 25(2), 371; https://doi.org/10.3390/e25020371 - 17 Feb 2023

Cited by 5 | Viewed by 3730

Abstract

The ability to build more robust clustering from many clustering models with different solutions is relevant in scenarios with privacy-preserving constraints, where data features have a different nature or where these features are not available in a single computation unit. Additionally, with the booming number of multi-view data, but also of clustering algorithms capable of producing a wide variety of representations for the same objects, merging clustering partitions to achieve a single clustering result has become a complex problem with numerous applications. To tackle this problem, we propose a clustering fusion algorithm that takes existing clustering partitions acquired from multiple vector space models, sources, or views, and merges them into a single partition. Our merging method relies on an information theory model based on Kolmogorov complexity that was originally proposed for unsupervised multi-view learning. Our proposed algorithm features a stable merging process and shows competitive results over several real and artificial datasets in comparison with other state-of-the-art methods that have similar goals. Full article

(This article belongs to the Special Issue Pattern Recognition and Data Clustering in Information Theory)

12 pages, 4712 KB

Open Access Article

Efficient System for Delimitation of Benign and Malignant Breast Masses

by Dante Mújica-Vargas, Manuel Matuz-Cruz, Christian García-Aquino and Celia Ramos-Palencia

Entropy 2022, 24(12), 1775; https://doi.org/10.3390/e24121775 - 5 Dec 2022

Cited by 4 | Viewed by 2479

Abstract

In this study, a high-performing scheme is introduced to delimit benign and malignant masses in breast ultrasound images. The proposal is built upon the Nonlocal Means filter for image quality improvement, an Intuitionistic Fuzzy C-Means local clustering algorithm for superpixel generation with high adherence to the edges, and the DBSCAN algorithm for the global clustering of those superpixels in order to delimit masses’ regions. The empirical study was performed using two datasets, both with benign and malignant breast tumors. The quantitative results with respect to the BUSI dataset were $JSC \geq 0.907$, $DM \geq 0.913$, $HD \geq 7.025$, and $MCR \leq 6.431$ for benign masses and $JSC \geq 0.897$, $DM \geq 0.900$, $HD \geq 8.666$, and $MCR \leq 8.016$ for malignant ones, while the MID dataset resulted in $JSC \geq 0.890$, $DM \geq 0.905$, $HD \geq 8.370$, and $MCR \leq 7.241$ along with $JSC \geq 0.881$, $DM \geq 0.898$, $HD \geq 8.865$, and $MCR \leq 7.808$ for benign and malignant masses, respectively. These numerical results revealed that our proposal outperformed all the evaluated comparative state-of-the-art methods in mass delimitation. This is confirmed by the visual results since the segmented regions had a better edge delimitation. Full article

(This article belongs to the Special Issue Pattern Recognition and Data Clustering in Information Theory)
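
The abstract above does not expand its metric abbreviations. Assuming that JSC and DM denote the Jaccard similarity coefficient and the Dice measure, two overlap scores commonly used for segmentation, the following sketch shows how they could be computed for flattened binary masks. The function names and the toy masks are hypothetical and do not come from the paper.

```python
def jaccard(pred, truth):
    """Jaccard similarity: |A ∩ B| / |A ∪ B| for flattened 0/1 masks."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    return inter / union if union else 1.0  # two empty masks agree fully


def dice(pred, truth):
    """Dice measure: 2 |A ∩ B| / (|A| + |B|) for flattened 0/1 masks."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    return 2 * inter / total if total else 1.0


predicted = [1, 1, 1, 0, 0, 0]  # hypothetical segmentation output
reference = [0, 1, 1, 1, 0, 0]  # hypothetical ground-truth mask
print(jaccard(predicted, reference))  # 0.5
print(dice(predicted, reference))
```

The two scores are related by D = 2J / (1 + J), so they rank segmentations identically; the Dice value is always at least as large as the Jaccard value.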

19 pages, 11569 KB

Open Access Article

Grid-Based Clustering Using Boundary Detection

by Mingjing Du and Fuyu Wu

Entropy 2022, 24(11), 1606; https://doi.org/10.3390/e24111606 - 4 Nov 2022

Cited by 23 | Viewed by 9748

Abstract

Clustering can be divided into five categories: partitioning, hierarchical, model-based, density-based, and grid-based algorithms. Among them, grid-based clustering is highly efficient in handling spatial data. However, the traditional grid-based clustering algorithms still face many problems: (1) Parameter tuning: density thresholds are difficult to adjust; (2) Data challenge: clusters with overlapping regions and varying densities are not well handled. We propose a new grid-based clustering algorithm named GCBD that can solve the above problems. Firstly, the density estimation of nodes is defined using the standard grid structure. Secondly, GCBD uses an iterative boundary detection strategy to distinguish core nodes from boundary nodes. Finally, two clustering strategies are combined to group core nodes and assign boundary nodes. Experiments on 18 datasets demonstrate that the proposed algorithm outperforms 6 grid-based competitors. Full article

(This article belongs to the Special Issue Pattern Recognition and Data Clustering in Information Theory)
