Applications of Information Theory to Machine Learning

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Information Theory, Probability and Statistics".

Deadline for manuscript submissions: 30 May 2025 | Viewed by 4952

Special Issue Editors

Department of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, China
Interests: information theory; data compression; algebraic coding theory; machine learning; deep learning; distributed storage

Guest Editor
Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
Interests: computer vision; information theory; algebraic coding theory; machine learning; deep learning; internet; distributed storage

Guest Editor
Department of Electrical and Computer Engineering, National University of Singapore, 4 Engineering Drive 3, Singapore 117583, Singapore
Interests: information and coding theory; artificial intelligence and machine learning; biomedical informatics; wireless ad hoc and sensor networks; internet of things

Special Issue Information

Dear Colleagues,

Machine learning applications are prevalent across various domains, representing intricate and sophisticated systems. Examples include pattern recognition, natural language processing, recommendation systems, and image classification, among others. The utilization of information theory to delve into the behavior of such machine learning systems, explaining and predicting their dynamics, has garnered considerable attention from both theoretical and experimental perspectives. Numerous advancements have been made in terms of applying information theory to machine learning, encompassing correlation analyses for spatial and temporal data, as well as the development of construction and clustering techniques for complex networks within this context. The driving forces behind this progress often stem from specific application areas, such as healthcare, finance, and computer vision.

However, the application of information theory to real-world machine learning data is frequently impeded by challenges such as non-stationarity and insufficient statistics. To advance further in this domain, we seek new statistical techniques grounded in information theory, enhancements to existing methodologies, and a deeper understanding of entropy's significance in complex machine learning systems. Contributions addressing any of these issues are highly encouraged.

This Special Issue aims to serve as a platform for the introduction of novel and refined information theory techniques tailored to machine learning applications. Specifically, the analysis and interpretation of compression for generalization, the use of mutual information for feature selection, information bottlenecks for representation learning, entropy-based anomaly detection, quantifying uncertainty with mutual information, information-theoretic regularization for neural networks, etc., are within the scope of this Special Issue. Your contributions to this evolving field are eagerly awaited.

Dr. Bin Chen
Prof. Dr. Shu-Tao Xia
Dr. Mehul Motani
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • adversarial machine learning
  • self-supervised learning
  • sequential decision-making (bandit/reinforcement learning)
  • deep learning theory
  • clustering/community detection
  • security and privacy in machine learning
  • generative models
  • decision theory
  • federated learning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (5 papers)

Research

23 pages, 3403 KiB  
Article
Class-Hidden Client-Side Watermarking in Federated Learning
by Weitong Chen, Chi Zhang, Wei Zhang and Jie Cai
Entropy 2025, 27(2), 134; https://doi.org/10.3390/e27020134 - 27 Jan 2025
Viewed by 334
Abstract
Federated learning consists of a central aggregator and multiple clients, forming a distributed structure that effectively protects data privacy. However, since all participants can access the global model, the risk of model leakage increases, especially when unreliable participants are involved. To safeguard model copyright while enhancing the robustness and secrecy of the watermark, this paper proposes a client-side watermarking scheme. Specifically, the proposed method introduces an additional watermark class, expanding the output layer of the client model into an N+1-class classifier. The client’s local model is then trained using both the watermark dataset and the local dataset. Notably, before uploading to the server, the parameters of the watermark class are removed from the output layer and stored locally. Additionally, the client uploads amplified parameters to address the potential weakening of the watermark during the aggregation. After aggregation, the global model is distributed to the clients for local training. Through multiple rounds of iteration, the saved watermark parameters are continuously updated until the global model converges. On the MNIST, CIFAR-100, and CIFAR-10 datasets, the watermark detection rates on VGG-16 and ResNet-18 reached 100%. Furthermore, extensive experiments demonstrate that this method has minimal impact on model performance and exhibits strong robustness against pruning and fine-tuning attacks.
(This article belongs to the Special Issue Applications of Information Theory to Machine Learning)
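
As a rough illustration of the parameter-handling step described in this abstract, the sketch below separates the watermark-class row from an output layer with N+1 rows before upload and reattaches it once the global rows come back. The function names, the list-of-lists weight layout, and the choice of the last row as the watermark class are assumptions made for illustration. The amplification step and the training loop are omitted, and this is not the authors' implementation.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def split_watermark_row(output_layer: Matrix) -> Tuple[Matrix, List[float]]:
    """Separate the N ordinary class rows from the watermark row.

    Assumption: the watermark class is stored as the last row.
    """
    if len(output_layer) < 2:
        raise ValueError("expected at least one ordinary class plus the watermark class")
    upload_rows = [row[:] for row in output_layer[:-1]]  # sent to the server
    watermark_row = output_layer[-1][:]                   # kept on the client
    return upload_rows, watermark_row


def attach_watermark_row(global_rows: Matrix, watermark_row: List[float]) -> Matrix:
    """Rebuild the (N+1)-class output layer from global rows and the saved row."""
    return [row[:] for row in global_rows] + [watermark_row[:]]


# Toy example: N = 2 ordinary classes, 3 input features, one watermark class.
local_layer = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.9, 0.8, 0.7]]
to_upload, saved = split_watermark_row(local_layer)
restored = attach_watermark_row(to_upload, saved)
```

In this toy run the server would only ever see the two ordinary class rows, while the client keeps the third row locally between rounds.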

26 pages, 724 KiB  
Article
Causal Discovery and Reasoning for Continuous Variables with an Improved Bayesian Network Constructed by Locality Sensitive Hashing and Kernel Density Estimation
by Chenghao Wei, Chen Li, Yingying Liu, Song Chen, Zhiqiang Zuo, Pukai Wang and Zhiwei Ye
Entropy 2025, 27(2), 123; https://doi.org/10.3390/e27020123 - 24 Jan 2025
Viewed by 489
Abstract
The structure learning of a Bayesian network (BN) is a crucial process that aims to unravel the complex dependency relationships among variables using a given dataset. This paper proposes a new BN structure learning method for data with continuous attribute values. As a non-parametric distribution-free method, kernel density estimation (KDE) is applied in the conditional independence (CI) test. The skeleton of the BN is constructed utilizing the test based on mutual information and conditional mutual information, delineating potential relational connections between parents and children without imposing any distributional assumptions. In the searching stage of BN structure learning, the causal relationships between variables are determined by using the conditional entropy scoring function and a hill-climbing strategy. To further enhance the computational efficiency of our method, we incorporate a locality sensitive hashing (LSH) function into the KDE process. The method speeds up the calculations of KDE while maintaining the precision of the estimates, leading to a notable decrease in the time required for computing mutual information, conditional mutual information, and conditional entropy. A BN classifier (BNC) is established by using the computationally efficient BN learning method. Our experiments demonstrated that KDE using LSH is much faster than traditional KDE without losing fitting accuracy. This achievement underscores the effectiveness of our method in balancing speed and accuracy. On the benchmark networks, the structure learning accuracy of the proposed method is superior to that of other traditional structure learning methods. The BNC also demonstrates better accuracy with stronger interpretability compared to conventional classifiers on public datasets.
(This article belongs to the Special Issue Applications of Information Theory to Machine Learning)
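
To make the idea of a KDE-based mutual information estimate concrete, here is a minimal sketch for two continuous variables. It assumes Gaussian product kernels, a normal-reference bandwidth, and a plain resubstitution average of log density ratios; these choices, and all function names, are illustrative assumptions rather than the paper's method. The LSH acceleration and the conditional independence test are not shown.

```python
import math
import random
from typing import List


def gaussian_kernel(u: float) -> float:
    """Standard normal density evaluated at u."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def normal_reference_bandwidth(values: List[float]) -> float:
    """Rule-of-thumb bandwidth 1.06 * std * n^(-1/5) (assumed choice)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return 1.06 * std * n ** (-0.2)


def kde_mutual_information(xs: List[float], ys: List[float]) -> float:
    """Resubstitution estimate of I(X;Y) in nats from paired samples."""
    n = len(xs)
    hx = normal_reference_bandwidth(xs)
    hy = normal_reference_bandwidth(ys)
    total = 0.0
    for xi, yi in zip(xs, ys):
        kx = [gaussian_kernel((xi - xj) / hx) / hx for xj in xs]
        ky = [gaussian_kernel((yi - yj) / hy) / hy for yj in ys]
        p_x = sum(kx) / n                                  # marginal density of X
        p_y = sum(ky) / n                                  # marginal density of Y
        p_xy = sum(a * b for a, b in zip(kx, ky)) / n      # joint density
        total += math.log(p_xy / (p_x * p_y))
    return total / n


# Synthetic data: one strongly dependent pair and one independent pair.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(200)]
y_dependent = [xi + 0.1 * random.gauss(0.0, 1.0) for xi in x]
y_independent = [random.gauss(0.0, 1.0) for _ in range(200)]
mi_dependent = kde_mutual_information(x, y_dependent)
mi_independent = kde_mutual_information(x, y_independent)
```

The estimator costs O(n²) kernel evaluations per call, which is the kind of cost an approximate neighbor search such as LSH is meant to reduce.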

23 pages, 4327 KiB  
Article
An Intelligent Maneuver Decision-Making Approach for Air Combat Based on Deep Reinforcement Learning and Transformer Networks
by Wentao Li, Feng Fang, Dongliang Peng and Shuning Han
Entropy 2024, 26(12), 1036; https://doi.org/10.3390/e26121036 - 29 Nov 2024
Cited by 1 | Viewed by 671
Abstract
The traditional maneuver decision-making approaches are highly dependent on accurate and complete situation information, and their decision-making quality becomes poor when opponent information is occasionally missing in complex electromagnetic environments. In order to solve this problem, an autonomous maneuver decision-making approach is developed based on deep reinforcement learning (DRL) architecture. Meanwhile, a Transformer network is integrated into the actor and critic networks, which can find the potential dependency relationships among the time series trajectory data. By using these relationships, the information loss is partially compensated, which leads to maneuvering decisions being more accurate. The issues of limited experience samples, low sampling efficiency, and poor stability in the agent training state appear when the Transformer network is introduced into DRL. To address these issues, the measures of designing an effective decision-making reward, a prioritized sampling method, and a dynamic learning rate adjustment mechanism are proposed. Numerous simulation results show that the proposed approach outperforms the traditional DRL algorithms, with a higher win rate in the case of opponent information loss.
(This article belongs to the Special Issue Applications of Information Theory to Machine Learning)

26 pages, 21250 KiB  
Article
APCSMA: Adaptive Personalized Client-Selection and Model-Aggregation Algorithm for Federated Learning in Edge Computing Scenarios
by Xueting Ma, Guorui Ma, Yang Liu and Shuhan Qi
Entropy 2024, 26(8), 712; https://doi.org/10.3390/e26080712 - 21 Aug 2024
Viewed by 1503
Abstract
With the rapid advancement of the Internet and big data technologies, traditional centralized machine learning methods are challenged when dealing with large-scale datasets. Federated Learning (FL), as an emerging distributed machine learning paradigm, enables multiple clients to collaboratively train a global model while preserving privacy. Edge computing, also recognized as a critical technology for handling massive datasets, has garnered significant attention. However, the heterogeneity of clients in edge computing environments can severely impact the performance of the resultant models. This study introduces an Adaptive Personalized Client-Selection and Model-Aggregation Algorithm, APCSMA, aimed at optimizing FL performance in edge computing settings. The algorithm evaluates clients’ contributions by calculating the real-time performance of local models and the cosine similarity between local and global models, and it designs a ContriFunc function to quantify each client’s contribution. The server then selects clients and assigns weights during model aggregation based on these contributions. Moreover, the algorithm accommodates personalized needs in local model updates, rather than simply overwriting with the global model. Extensive experiments were conducted on the FashionMNIST and Cifar-10 datasets, simulating three data distributions with parameters dir = 0.1, 0.3, and 0.5. The accuracy improvements achieved were 3.9%, 1.9%, and 1.1% for the FashionMNIST dataset, and 31.9%, 8.4%, and 5.4% for the Cifar-10 dataset, respectively.
(This article belongs to the Special Issue Applications of Information Theory to Machine Learning)
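
The general pattern of contribution-weighted aggregation mentioned in this abstract can be sketched as follows. The abstract does not give the form of ContriFunc, so the `contribution` function below is a hypothetical stand-in that simply mixes local accuracy with cosine similarity through an assumed weight `alpha`. Models are flat lists of floats, and client selection and personalized local updates are left out.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two parameter vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def contribution(accuracy: float, similarity: float, alpha: float = 0.5) -> float:
    """Hypothetical contribution score; NOT the paper's ContriFunc."""
    return alpha * accuracy + (1.0 - alpha) * max(similarity, 0.0)


def aggregate(local_models: List[List[float]], scores: List[float]) -> List[float]:
    """Average local models with weights proportional to their scores."""
    total = sum(scores)
    if total <= 0.0:
        raise ValueError("at least one client needs a positive score")
    weights = [s / total for s in scores]
    dim = len(local_models[0])
    return [sum(w * m[k] for w, m in zip(weights, local_models)) for k in range(dim)]


# Toy round with two clients and a two-parameter model.
global_model = [1.0, 0.0]
local_models = [[1.0, 0.0], [0.0, 1.0]]
accuracies = [0.9, 0.5]
scores = [
    contribution(acc, cosine_similarity(m, global_model))
    for acc, m in zip(accuracies, local_models)
]
new_global = aggregate(local_models, scores)
```

In this toy round the first client is both more accurate and better aligned with the global model, so the aggregate leans toward its parameters.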

29 pages, 2335 KiB  
Article
Robust Support Vector Data Description with Truncated Loss Function for Outliers Depression
by Huakun Chen, Yongxi Lyu, Jingping Shi and Weiguo Zhang
Entropy 2024, 26(8), 628; https://doi.org/10.3390/e26080628 - 25 Jul 2024
Viewed by 1069
Abstract
Support vector data description (SVDD) is widely regarded as an effective technique for addressing anomaly detection problems. However, its performance can significantly deteriorate when the training data are affected by outliers or mislabeled observations. This study introduces a universal truncated loss function framework into the SVDD model to enhance its robustness and employs the fast alternating direction method of multipliers (ADMM) algorithm to solve various truncated loss functions. Moreover, the convergence of the fast ADMM algorithm is analyzed theoretically. Within this framework, we developed the truncated generalized ramp, truncated binary cross entropy, and truncated linear exponential loss functions for SVDD. We conducted extensive experiments on synthetic and real-world datasets to validate the effectiveness of these three SVDD models in handling data with different noise levels, demonstrating their superior robustness and generalization capabilities compared to other SVDD models.
(This article belongs to the Special Issue Applications of Information Theory to Machine Learning)
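
To show why truncating a loss limits the influence of outliers, the sketch below compares an ordinary hinge penalty on the SVDD slack with a capped version of the same penalty. This is a generic truncated hinge chosen for illustration, not one of the three loss functions developed in the paper, and the fixed center, radius, and cap are assumed values. The ADMM solver is not shown.

```python
from typing import List, Optional, Sequence


def squared_distance(x: Sequence[float], center: Sequence[float]) -> float:
    """Squared Euclidean distance between a point and the sphere center."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, center))


def hinge_loss(u: float) -> float:
    """Penalty that grows without bound as a point moves outside the sphere."""
    return max(0.0, u)


def truncated_hinge_loss(u: float, cap: float) -> float:
    """Same penalty, but capped so a single outlier cannot dominate."""
    return min(max(0.0, u), cap)


def total_loss(
    points: List[Sequence[float]],
    center: Sequence[float],
    radius: float,
    cap: Optional[float] = None,
) -> float:
    """Sum of per-point penalties on the slack u = ||x - center||^2 - radius^2."""
    total = 0.0
    for x in points:
        u = squared_distance(x, center) - radius ** 2
        total += hinge_loss(u) if cap is None else truncated_hinge_loss(u, cap)
    return total


# Two inliers and one extreme outlier around a sphere of radius 2.
points = [(0.0, 0.0), (1.0, 0.0), (100.0, 0.0)]
center = (0.0, 0.0)
plain = total_loss(points, center, radius=2.0)
truncated = total_loss(points, center, radius=2.0, cap=5.0)
```

With the cap in place, the outlier contributes at most 5.0 to the objective instead of 9996.0, so it pulls far less on the fitted center and radius.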
