
Information Theory in Emerging Machine Learning Techniques

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Information Theory, Probability and Statistics".

Deadline for manuscript submissions: 15 October 2025 | Viewed by 2461

Special Issue Editor


Dr. Ke Sun
Guest Editor
1. Data61, CSIRO, Eveleigh, NSW 2015, Australia
2. School of Computing, Australian National University, Canberra, ACT 2601, Australia
Interests: machine learning; information geometry; deep learning; model selection; dimensionality reduction; manifold learning

Special Issue Information

Dear Colleagues,

Over the past two decades, deep learning, as part of machine learning, has undergone significant development. Many emerging techniques have achieved state-of-the-art performance across diverse learning tasks and areas of application, such as natural language processing, robotics, multimedia processing, and healthcare. However, many of these new methods rest largely on empirical evidence. While theoretical machine learning and its relationship with information theory are well developed, the theoretical analysis of deep learning has not kept pace with the engineering advances in new learning mechanisms.

Deep learning has substantial aspects that are uncommon in other areas: its distinctive generalization behavior, representation learning and latent features, its interplay with optimization and over-parameterization, the layer-wise structure of its representations, and its stability and robustness. These provide a rich foundation for applying information theory.

Information theory has been fundamental to modern machine learning and can contribute significantly to the development of deep learning theory. This Special Issue aims to (1) provide information-theoretic insights into new deep learning methods and (2) develop new deep learning mechanisms, or adapt existing ones, grounded in information theory. The focus on emerging machine learning techniques signals a particular interest in cutting-edge deep learning methods that have not yet been analyzed theoretically, not even through simplified architectures.

Dr. Ke Sun
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • information theory
  • information divergence
  • Riemannian geometry
  • Fisher information
  • information bottleneck
  • deep autoencoders
  • normalization in deep learning
  • deep neural network optimizers

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (3 papers)


Research

11 pages, 1903 KiB  
Article
Kolmogorov–Arnold and Long Short-Term Memory Convolutional Network Models for Supervised Quality Recognition of Photoplethysmogram Signals
by Aneeqa Mehrab, Michela Lapenna, Ferdinando Zanchetta, Angelica Simonetti, Giovanni Faglioni, Andrea Malagoli and Rita Fioresi
Entropy 2025, 27(4), 326; https://doi.org/10.3390/e27040326 - 21 Mar 2025
Viewed by 271
Abstract
Photoplethysmogram (PPG) signals recover key physiological parameters such as pulse, oximetry, and ECG. In this paper, we first employ a hybrid architecture combining a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) to analyze PPG signals and enable automated quality recognition. We then compare its performance to a simpler CNN architecture enriched with Kolmogorov–Arnold Network (KAN) layers. Our results suggest that KAN layers are effective at reducing the number of parameters while also improving on the performance of CNNs equipped with standard Multi-Layer Perceptron (MLP) layers.
(This article belongs to the Special Issue Information Theory in Emerging Machine Learning Techniques)
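
To make the hybrid design above concrete, here is a minimal PyTorch sketch of a CNN-LSTM quality classifier for raw PPG windows. It is an illustration only, not the authors' architecture: the layer sizes, window length, and two-class setup are assumptions.

import torch
import torch.nn as nn

class CNNLSTMQualityNet(nn.Module):
    """Hybrid CNN-LSTM classifier for PPG signal-quality recognition.
    A sketch, not the paper's exact network: all sizes are illustrative."""
    def __init__(self, n_classes: int = 2):
        super().__init__()
        # 1-D convolutions extract local morphological features from the raw window
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
        )
        # The LSTM models temporal dependencies across the convolutional features
        self.lstm = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):                # x: (batch, 1, n_samples)
        h = self.conv(x)                 # (batch, 32, n_samples // 4)
        h = h.transpose(1, 2)            # (batch, time, features) for the LSTM
        _, (h_n, _) = self.lstm(h)       # last hidden state summarizes the window
        return self.head(h_n[-1])        # class logits

# Example: score a batch of 8 five-second PPG windows sampled at 125 Hz
logits = CNNLSTMQualityNet()(torch.randn(8, 1, 625))

The KAN variant compared in the paper would, in the same spirit, replace the final MLP-style head with spline-based Kolmogorov–Arnold layers.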

27 pages, 7507 KiB  
Article
Time Series Data Generation Method with High Reliability Based on ACGAN
by Fang Liu, Yuxin Li and Yuanfang Zheng
Entropy 2025, 27(2), 111; https://doi.org/10.3390/e27020111 - 23 Jan 2025
Viewed by 683
Abstract
In big data processing, especially in fields such as industrial fault diagnosis, small sample sizes are a frequent problem. Data generation based on Generative Adversarial Networks (GANs) is an effective way to address it. To reduce complexity, most existing data generation methods ignore temporal characteristics, which can leave their feature extraction capability insufficient. At the same time, low category differentiation in the real data produces a high degree of overlap among generated samples, lowering the category differentiation and reliability of the generated data. To address these issues, a time series data generation method with High Reliability based on the ACGAN (HR-ACGAN) is proposed and applied to industrial fault diagnosis. First, a Bi-directional Long Short-Term Memory (Bi-LSTM) network layer is introduced into the discriminator; it fully learns the temporal characteristics of the time series data, avoiding insufficient feature extraction. Second, an improved training objective function is designed for the generator to avoid high overlap among generated samples and enhance their reliability. Finally, two representative datasets from the industrial fault domain were selected for a simulation analysis of the proposed method. The experimental results show that the proposed method generates data with high similarity, and a dataset expanded with the generated data achieves high classification accuracy, effectively mitigating dataset imbalance. The proposed HR-ACGAN method can provide effective technical support for practical applications such as fault diagnosis.
(This article belongs to the Special Issue Information Theory in Emerging Machine Learning Techniques)
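
The key architectural change described above, an auxiliary-classifier GAN discriminator built on a Bi-LSTM, can be sketched in a few lines of PyTorch. This is a hedged illustration of the idea, not the paper's exact network: the hidden size and number of fault classes are assumptions.

import torch
import torch.nn as nn

class BiLSTMACGANDiscriminator(nn.Module):
    """ACGAN-style discriminator with a Bi-LSTM feature extractor (a sketch)."""
    def __init__(self, n_features: int = 1, n_classes: int = 4, hidden: int = 64):
        super().__init__()
        # The bidirectional LSTM reads the series forwards and backwards,
        # capturing temporal characteristics a feed-forward critic would miss
        self.bilstm = nn.LSTM(n_features, hidden, batch_first=True,
                              bidirectional=True)
        self.adv_head = nn.Linear(2 * hidden, 1)          # real/fake score
        self.cls_head = nn.Linear(2 * hidden, n_classes)  # auxiliary class logits

    def forward(self, x):                        # x: (batch, time, n_features)
        _, (h_n, _) = self.bilstm(x)
        h = torch.cat([h_n[0], h_n[1]], dim=1)   # concat forward/backward states
        return self.adv_head(h), self.cls_head(h)

d = BiLSTMACGANDiscriminator()
adv_score, class_logits = d(torch.randn(8, 100, 1))  # 8 sequences, 100 steps

As in any ACGAN, the adversarial score is trained with a real/fake loss while the class logits are trained with a classification loss on real and generated sequences, which is what pushes the generator toward class-distinguishable samples.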

30 pages, 1927 KiB  
Article
Fast Proxy Centers for the Jeffreys Centroid: The Jeffreys–Fisher–Rao Center and the Gauss–Bregman Inductive Center
by Frank Nielsen
Entropy 2024, 26(12), 1008; https://doi.org/10.3390/e26121008 - 22 Nov 2024
Cited by 1 | Viewed by 821
Abstract
The symmetric Kullback–Leibler centroid, also called the Jeffreys centroid, of a set of mutually absolutely continuous probability distributions on a measure space provides a notion of centrality which has proven useful in many tasks, including information retrieval, information fusion, and clustering. However, the Jeffreys centroid is not available in closed form for sets of categorical or multivariate normal distributions, two widely used statistical models, and thus needs to be approximated numerically in practice. In this paper, we first propose the new Jeffreys–Fisher–Rao center, defined as the Fisher–Rao midpoint of the sided Kullback–Leibler centroids, as a plug-in replacement for the Jeffreys centroid. This Jeffreys–Fisher–Rao center admits a generic formula for uni-parameter exponential family distributions and a closed-form formula for categorical and multivariate normal distributions; it matches the Jeffreys centroid exactly for same-mean normal distributions and is observed experimentally to be close to the Jeffreys centroid. Second, we define a new type of inductive center generalizing the principle of the Gauss arithmetic–geometric double-sequence mean to pairs of densities of any given exponential family. This new Gauss–Bregman center is shown experimentally to approximate the Jeffreys centroid very well, and we suggest using it as a replacement for the Jeffreys centroid when the Jeffreys–Fisher–Rao center is not available in closed form. Furthermore, this inductive center always converges and matches the Jeffreys centroid for sets of same-mean normal distributions. We report on our experiments, which first demonstrate how well the closed-form formula of the Jeffreys–Fisher–Rao center for categorical distributions approximates the costly numerical Jeffreys centroid, which relies on the Lambert W function, and second show the fast convergence of the Gauss–Bregman double sequences, which can closely approximate the Jeffreys centroid when truncated to the first few iterations. Finally, we conclude by reinterpreting these fast proxy Jeffreys–Fisher–Rao and Gauss–Bregman centers through the lens of dually flat spaces in information geometry.
(This article belongs to the Special Issue Information Theory in Emerging Machine Learning Techniques)
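
The double-sequence principle behind the Gauss–Bregman center can be illustrated with a short NumPy sketch. This is an illustration of the inductive-mean idea under stated assumptions, not the paper's full construction for exponential families: it iterates the arithmetic midpoint against the quasi-arithmetic midpoint induced by a gradient map grad_F, and with grad_F = log it recovers Gauss's classical arithmetic–geometric mean.

import numpy as np

def inductive_mean(a, b, grad_F, grad_F_inv, tol=1e-12):
    """Gauss-style double-sequence mean (a sketch of the inductive principle).
    Iterates the arithmetic midpoint of (a, b) against the quasi-arithmetic
    midpoint induced by grad_F until the two sequences merge to a common limit."""
    a, b = float(a), float(b)
    while abs(a - b) > tol:
        # simultaneous update of both sequences, as in Gauss's AGM
        a, b = (a + b) / 2.0, grad_F_inv((grad_F(a) + grad_F(b)) / 2.0)
    return a

# With grad_F = log, the dual midpoint is the geometric mean sqrt(ab),
# so this reproduces the classical AGM: agm(1, 2) is approximately 1.456791
print(inductive_mean(1.0, 2.0, np.log, np.exp))

In the paper's setting, analogous midpoints are taken in the natural and dual (expectation) parameter coordinates of the exponential family, and the common limit of the two sequences serves as the fast Gauss–Bregman proxy for the Jeffreys centroid.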
