Entropy: The Cornerstone of Machine Learning

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Multidisciplinary Applications".

Deadline for manuscript submissions: closed (15 October 2023) | Viewed by 10709

Special Issue Editors


Prof. Dr. Viacheslav Kovtun
Guest Editor
Department of Computer Control Systems, Vinnytsia National Technical University, Khmelnitske Shose Str., 95, 21000 Vinnytsia, Ukraine
Interests: info-communication technologies; mathematical modeling; machine learning; pattern recognition; signal processing; system analysis; speaker recognition; computational linguistics

Prof. Dr. Krzysztof Grochla
Guest Editor
Institute of Theoretical and Applied Informatics, Polish Academy of Sciences, Bałtycka 5, 44-100 Gliwice, Poland
Interests: performance evaluation of networking protocols; autoconfiguration and optimization of wireless networks; LP WAN; LoRa; radio planning; BLE; indoor positioning

Special Issue Information

Dear Colleagues,

The key feature of science is the description of some quantities in terms of others. To this end, scientists and engineers create mathematical models that describe the relationships in raw input data, together with methods that build on these models and ultimately produce useful output data. To make these models work, they must be trained. In machine learning (ML), a model is a complex dynamic system that consists of many layers, each of which represents a simple mathematical operation. The purpose of training such a system is to assemble a “snowflake” from “chaos” by combining elements from the available nomenclature. Mathematically, learning is embodied in the procedure of minimizing the model’s objective loss function. However, the training of an ML model, as a typical complex dynamic system, obeys the second law of thermodynamics. In ML, learning is the process of finding a “balance point”, i.e., a model configuration with maximum entropy, which corresponds to the most probable value of the loss function (the smaller the value of the loss function, the higher the risk of overfitting the model). In our Special Issue, any scientifically grounded ideas aimed at maximizing the entropy of ML models are welcome: configuring data, structural elements, loss functions and qualitative metrics. We also invite papers showing applications of entropy maximization and ML to evaluating the performance of complex systems, including telecommunication networks. Let each author show their “snowflake” to the scientific community!
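
As a minimal formal sketch of this framing (our own notation, assumed for illustration rather than taken from any particular submission): training seeks the parameters that minimize an objective loss, while the maximum-entropy principle prefers, among all parameter densities consistent with the observed constraints, the least committed one.

    % Loss minimization (training):
    \theta^{*} = \arg\min_{\theta} L(\theta)
    % Maximum-entropy selection: among densities p satisfying E_p[f_k(\theta)] = c_k,
    % choose the one maximizing the Shannon entropy
    H[p] = -\int p(\theta)\,\ln p(\theta)\, d\theta ,
    % whose solution has the exponential-family form
    p^{*}(\theta) \propto \exp\!\Big(-\sum_{k} \lambda_{k} f_{k}(\theta)\Big).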

Prof. Dr. Viacheslav Kovtun
Prof. Dr. Krzysztof Grochla
Prof. Dr. Jerry D. Gibson
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine learning
  • ensemble models
  • kernel methods
  • pattern recognition
  • data pre-processing
  • deep learning
  • fuzzy logic
  • decision trees
  • cross entropy
  • data augmentation
  • genetic algorithm
  • neural networks

Published Papers (7 papers)

Research

29 pages, 1775 KiB  
Article
Variational Inference via Rényi Bound Optimization and Multiple-Source Adaptation
by Dana Zalman (Oshri) and Shai Fine
Entropy 2023, 25(10), 1468; https://doi.org/10.3390/e25101468 - 20 Oct 2023
Viewed by 1017
Abstract
Variational inference provides a way to approximate probability densities through optimization. It does so by optimizing an upper or a lower bound of the likelihood of the observed data (the evidence). The classic variational inference approach suggests maximizing the Evidence Lower Bound (ELBO). Recent studies proposed to optimize the variational Rényi bound (VR) and the χ upper bound. However, these estimates, which are based on the Monte Carlo (MC) approximation, either underestimate the bound or exhibit a high variance. In this work, we introduce a new upper bound, termed the Variational Rényi Log Upper bound (VRLU), which is based on the existing VR bound. In contrast to the existing VR bound, the MC approximation of the VRLU bound maintains the upper bound property. Furthermore, we devise a (sandwiched) upper–lower bound variational inference method, termed the Variational Rényi Sandwich (VRS), to jointly optimize the upper and lower bounds. We present a set of experiments, designed to evaluate the new VRLU bound and to compare the VRS method with the classic Variational Autoencoder (VAE) and the VR methods. Next, we apply the VRS approximation to the Multiple-Source Adaptation problem (MSA). MSA is a real-world scenario where data are collected from multiple sources that differ from one another by their probability distribution over the input space. The main aim is to combine fairly accurate predictive models from these sources and create an accurate model for new, mixed target domains. However, many domain adaptation methods assume prior knowledge of the data distribution in the source domains. In this work, we apply the suggested VRS density estimate to the Multiple-Source Adaptation problem (MSA) and show, both theoretically and empirically, that it provides tighter error bounds and improved performance, compared to leading MSA methods. Full article
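
For orientation, the two bounds referred to in the abstract can be written in their standard forms from the variational inference literature (the variational Rényi bound of Li and Turner); these formulas are background, not text reproduced from the paper:

    % Evidence lower bound (ELBO):
    \mathrm{ELBO}(q) \;=\; \mathbb{E}_{q(z \mid x)}\!\left[ \log \frac{p(x,z)}{q(z \mid x)} \right] \;\le\; \log p(x)
    % Variational Renyi (VR) bound of order \alpha:
    \mathcal{L}_{\alpha}(q;x) \;=\; \frac{1}{1-\alpha} \log \mathbb{E}_{q(z \mid x)}\!\left[ \left( \frac{p(x,z)}{q(z \mid x)} \right)^{\!1-\alpha} \right]
    % \mathcal{L}_{\alpha} recovers the ELBO as \alpha \to 1 and upper-bounds \log p(x) for \alpha < 0;
    % naive Monte Carlo estimates of \mathcal{L}_{\alpha} can violate the bound, which motivates VRLU.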

20 pages, 14749 KiB  
Article
FBANet: Transfer Learning for Depression Recognition Using a Feature-Enhanced Bi-Level Attention Network
by Huayi Wang, Jie Zhang, Yaocheng Huang and Bo Cai
Entropy 2023, 25(9), 1350; https://doi.org/10.3390/e25091350 - 17 Sep 2023
Viewed by 1124
Abstract
The House-Tree-Person (HTP) sketch test is a psychological analysis technique designed to assess the mental health status of test subjects. Nowadays, there are mature methods for the recognition of depression using the HTP sketch test. However, existing works primarily rely on manual analysis of drawing features, which has the drawbacks of strong subjectivity and low automation. Only a small number of works automatically recognize depression using machine learning and deep learning methods, but their complex data preprocessing pipelines and multi-stage computational processes indicate a relatively low level of automation. To overcome the above issues, we present a novel deep learning-based one-stage approach for depression recognition in HTP sketches, which has a simple data preprocessing pipeline and calculation process with a high accuracy rate. In terms of data, we use a hand-drawn HTP sketch dataset, which contains drawings of normal people and patients with depression. In the model aspect, we design a novel network called Feature-Enhanced Bi-Level Attention Network (FBANet), which contains feature enhancement and bi-level attention modules. Due to the limited size of the collected data, transfer learning is employed, where the model is pre-trained on a large-scale sketch dataset and fine-tuned on the HTP sketch dataset. On the HTP sketch dataset, utilizing cross-validation, FBANet achieves a maximum accuracy of 99.07% on the validation dataset, with an average accuracy of 97.71%, outperforming traditional classification models and previous works. In summary, the proposed FBANet, after pre-training, demonstrates superior performance on the HTP sketch dataset and is expected to be a method for the auxiliary diagnosis of depression. Full article
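
The abstract's pre-train-then-fine-tune recipe can be illustrated with a short, hedged PyTorch sketch; the backbone, checkpoint path, and hyperparameters below are stand-ins, and FBANet's feature-enhancement and bi-level attention modules are not reproduced here:

    # Hedged sketch of the pre-train/fine-tune pattern described in the abstract.
    # The backbone and checkpoint path are placeholders, not the paper's FBANet.
    import torch
    import torch.nn as nn
    from torchvision import models

    NUM_CLASSES = 2  # depression vs. normal, per the abstract

    # Stand-in backbone (torchvision >= 0.13 signature); the paper instead
    # pre-trains its own network on a large-scale sketch dataset.
    backbone = models.resnet18(weights=None)
    # state = torch.load("sketch_pretrained.pth")   # hypothetical pre-trained weights
    # backbone.load_state_dict(state)

    # Replace the classification head for the downstream HTP sketch task.
    backbone.fc = nn.Linear(backbone.fc.in_features, NUM_CLASSES)

    optimizer = torch.optim.AdamW(backbone.parameters(), lr=1e-4)
    criterion = nn.CrossEntropyLoss()

    def fine_tune_step(images, labels):
        """One fine-tuning step on HTP images (N, 3, H, W) with integer labels."""
        logits = backbone(images)
        loss = criterion(logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()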

17 pages, 3497 KiB  
Article
A Novel Strategy for Extracting Richer Semantic Information Based on Fault Detection in Power Transmission Lines
by Shuxia Yan, Junhuan Li, Jiachen Wang, Gaohua Liu, Anhai Ai and Rui Liu
Entropy 2023, 25(9), 1333; https://doi.org/10.3390/e25091333 - 14 Sep 2023
Cited by 1 | Viewed by 827
Abstract
With the development of the smart grid, traditional defect detection for transmission lines is gradually shifting toward the combination of robots or drones with deep learning technology, realizing automatic defect detection and avoiding the risks and costs of manual inspection. Lightweight embedded devices such as drones and robots have limited computational resources, whereas deep learning mostly relies on deep neural networks with large computational demands. Deeper networks also produce richer semantic features, which are critical for accurately classifying morphologically similar defects, identifying their differences and classifying transmission line components. Therefore, we propose a method to obtain advanced semantic features even in shallow networks. Combined with transfer learning, we alter the image features (e.g., position and edge connectivity) under self-supervised learning during pre-training. This allows the pre-trained model to learn latent semantic feature representations rather than relying on low-level features. The pre-trained model then guides a shallow network to extract rich semantic features for downstream tasks. In addition, we introduce a category semantic fusion module (CSFM) to enhance feature fusion by utilizing channel attention to capture the global and local information lost during compression and extraction. This module helps to obtain more category-level semantic information. Our experiments on a self-created transmission line defect dataset show the benefit of modifying low-level image information during pre-training when adjusting the number of network layers and embedding the CSFM. The strategy also generalizes to the publicly available PASCAL VOC dataset. Finally, compared with state-of-the-art methods on the synthetic fog insulator dataset (SFID), the strategy achieves comparable performance with much smaller network depths.
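
The CSFM itself is not specified in this listing; as a generic illustration of the channel-attention mechanism it builds on, a minimal squeeze-and-excitation-style block in PyTorch might look as follows (an assumption-laden sketch, not the paper's module):

    # Generic channel-attention block (squeeze-and-excitation style), offered only
    # as an illustration of the mechanism the CSFM uses; not the paper's CSFM.
    import torch
    import torch.nn as nn

    class ChannelAttention(nn.Module):
        def __init__(self, channels: int, reduction: int = 16):
            super().__init__()
            self.pool = nn.AdaptiveAvgPool2d(1)          # global context per channel
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),
                nn.Sigmoid(),                            # per-channel gating weights
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            b, c, _, _ = x.shape
            w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
            return x * w                                  # re-weight feature channels

    # Usage: fused = ChannelAttention(256)(torch.randn(2, 256, 32, 32))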

13 pages, 2642 KiB  
Article
An Enterprise Service Demand Classification Method Based on One-Dimensional Convolutional Neural Network with Cross-Entropy Loss and Enterprise Portrait
by Haixia Zhou and Jindong Chen
Entropy 2023, 25(8), 1211; https://doi.org/10.3390/e25081211 - 14 Aug 2023
Viewed by 920
Abstract
To address the diverse needs of enterprise users and the cold-start issue of recommendation systems, this paper proposes a quality-service demand classification method, 1D-CNN-CrossEntropyLoss, based on the cross-entropy loss and a one-dimensional convolutional neural network (1D-CNN) together with comprehensive enterprise quality portrait labels. The main idea of 1D-CNN-CrossEntropyLoss is to use the cross-entropy loss to optimize the 1D-CNN model and thereby enhance the performance of enterprise quality-service demand classification. The transaction data of the enterprise quality-service platform are selected as the data source. Finally, the performance of 1D-CNN-CrossEntropyLoss is compared with XGBoost, SVM, and logistic regression models. The experimental results show that 1D-CNN-CrossEntropyLoss achieves the best classification results, with an accuracy of 72.44%. In addition, compared to the results without the enterprise-quality portrait, the enterprise-quality portrait improves the accuracy and recall of the 1D-CNN-CrossEntropyLoss model. This verifies that the enterprise-quality portrait can further improve the classification of enterprise quality-service demand, that 1D-CNN-CrossEntropyLoss outperforms the other classification methods, and that the approach can improve the precision of the services offered by the comprehensive quality-service platform for MSMEs.
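
As a sketch of the pattern the abstract names, a minimal 1D-CNN classifier trained with a cross-entropy loss can be written in PyTorch as follows; the feature length, channel widths, and class count are placeholders rather than values from the paper:

    # Minimal 1D-CNN with a cross-entropy loss, sketching the abstract's pattern;
    # all sizes below are illustrative placeholders, not the paper's configuration.
    import torch
    import torch.nn as nn

    NUM_FEATURES = 64   # length of an enterprise-portrait feature vector (assumed)
    NUM_CLASSES = 5     # number of quality-service demand categories (assumed)

    model = nn.Sequential(
        nn.Conv1d(1, 16, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool1d(2),
        nn.Conv1d(16, 32, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool1d(1),
        nn.Flatten(),
        nn.Linear(32, NUM_CLASSES),
    )
    criterion = nn.CrossEntropyLoss()   # the cross-entropy objective being minimized
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    x = torch.randn(8, 1, NUM_FEATURES)        # batch of 8 demand records
    y = torch.randint(0, NUM_CLASSES, (8,))    # their demand-class labels
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()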

13 pages, 1650 KiB  
Article
Homogeneous Adaboost Ensemble Machine Learning Algorithms with Reduced Entropy on Balanced Data
by Mahesh Thyluru Ramakrishna, Vinoth Kumar Venkatesan, Ivan Izonin, Myroslav Havryliuk and Chandrasekhar Rohith Bhat
Entropy 2023, 25(2), 245; https://doi.org/10.3390/e25020245 - 29 Jan 2023
Cited by 34 | Viewed by 2487
Abstract
Today’s world faces a serious public health problem with cancer. One type of cancer that begins in the breast and spreads to other areas of the body is breast cancer (BC). Breast cancer is one of the most prevalent cancers that claim the lives of women. It is also becoming clearer that most cases of breast cancer are already advanced when they are brought to the doctor’s attention by the patient. The patient may have the evident lesion removed, but by then the disease may have reached an advanced stage of development, or the body’s ability to resist it may have weakened considerably, rendering treatment less effective. Although it is still much more common in more developed nations, it is also quickly spreading to less developed countries. The motivation behind this study is to use an ensemble method for the prediction of BC, as an ensemble model aims to automatically manage the strengths and weaknesses of each of its constituent models, resulting in the best overall decision. The main objective of this paper is to predict and classify breast cancer using Adaboost ensemble techniques. The weighted entropy is computed for the target column by weighting each attribute, where the weights represent each class’s likelihood; the information gain increases as the entropy decreases. Both individual classifiers and homogeneous ensemble classifiers, created by combining Adaboost with different single classifiers, have been used in this work. To deal with the class imbalance issue as well as noise, the synthetic minority over-sampling technique (SMOTE) was used as part of the data mining pre-processing. The suggested approach uses a decision tree (DT) and naive Bayes (NB) with Adaboost ensemble techniques. The experimental findings showed 97.95% accuracy for prediction using the Adaboost-random forest classifier.
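
The SMOTE-plus-AdaBoost pipeline the abstract describes can be sketched with scikit-learn and imbalanced-learn; the dataset used below (scikit-learn's built-in Wisconsin breast cancer data) and the hyperparameters are stand-ins, not the paper's setup:

    # Sketch of a SMOTE + AdaBoost pipeline in the spirit of the abstract;
    # dataset and hyperparameters are illustrative only.
    from imblearn.over_sampling import SMOTE
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    X, y = load_breast_cancer(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

    # Balance the training set (SMOTE synthesizes minority-class samples).
    X_bal, y_bal = SMOTE(random_state=42).fit_resample(X_tr, y_tr)

    # Homogeneous AdaBoost ensemble over decision-tree base learners
    # (scikit-learn >= 1.2 uses `estimator`; older versions use `base_estimator`).
    clf = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=1),
                             n_estimators=200, random_state=42)
    clf.fit(X_bal, y_bal)
    print("test accuracy:", accuracy_score(y_te, clf.predict(X_te)))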

18 pages, 2993 KiB  
Article
Parameterization of the Stochastic Model for Evaluating Variable Small Data in the Shannon Entropy Basis
by Oleh Bisikalo, Vyacheslav Kharchenko, Viacheslav Kovtun, Iurii Krak and Sergii Pavlov
Entropy 2023, 25(2), 184; https://doi.org/10.3390/e25020184 - 17 Jan 2023
Cited by 12 | Viewed by 1492
Abstract
The article analytically summarizes the idea of applying Shannon’s principle of entropy maximization to sets that represent the results of observations of the “input” and “output” entities of a stochastic model for evaluating variable small data. To formalize this idea, a sequential transition from the likelihood function to the likelihood functional and then to the Shannon entropy functional is analytically described. Shannon’s entropy characterizes the uncertainty caused not only by the probabilistic nature of the parameters of the stochastic data evaluation model but also by interference that distorts the measurements of the values of these parameters. Accordingly, based on the Shannon entropy, it is possible to determine the best estimates of the values of these parameters for maximally uncertain (per entropy unit) distortions that cause measurement variability. This postulate leads naturally to the statement that the estimates of the probability density of the parameters of the stochastic small-data model obtained by maximizing the Shannon entropy will also take into account the variability of the measurement process. In the article, this principle is developed into an information technology for the parametric and non-parametric evaluation, on the basis of Shannon entropy, of small data measured under the influence of interference. The article analytically formalizes three key elements: (i) instances of the class of parameterized stochastic models for evaluating variable small data; (ii) methods of estimating the probability density function of their parameters, represented by normalized or interval probabilities; and (iii) approaches to generating an ensemble of random vectors of initial parameters.
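
The transition from likelihood to entropy that the abstract refers to has a standard form worth recalling (a textbook relation, not the paper's specific functionals): for i.i.d. observations, the normalized negative log-likelihood converges to a cross-entropy, which splits into an entropy term and a Kullback-Leibler term.

    % Normalized negative log-likelihood for i.i.d. observations x_1, ..., x_n:
    -\frac{1}{n} \ln L(\theta) \;=\; -\frac{1}{n} \sum_{i=1}^{n} \ln p(x_i \mid \theta)
    \;\xrightarrow[n \to \infty]{}\; -\int \hat{p}(x)\, \ln p(x \mid \theta)\, dx \;=\; H(\hat{p}, p_{\theta}),
    % the cross-entropy between the data distribution \hat{p} and the model p_\theta, which decomposes as
    H(\hat{p}, p_{\theta}) \;=\; H(\hat{p}) \;+\; D_{\mathrm{KL}}\!\left( \hat{p} \,\|\, p_{\theta} \right).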

Review

32 pages, 4255 KiB  
Review
Reviewing Evolution of Learning Functions and Semantic Information Measures for Understanding Deep Learning
by Chenguang Lu
Entropy 2023, 25(5), 802; https://doi.org/10.3390/e25050802 - 15 May 2023
Cited by 1 | Viewed by 1508
Abstract
A new trend in deep learning, represented by Mutual Information Neural Estimation (MINE) and Information Noise-Contrastive Estimation (InfoNCE), is emerging. In this trend, similarity functions and Estimated Mutual Information (EMI) are used as learning and objective functions. Coincidentally, EMI is essentially the same as the Semantic Mutual Information (SeMI) proposed by the author 30 years ago. This paper first reviews the evolutionary histories of semantic information measures and learning functions. It then briefly introduces the author’s semantic information G theory with the rate-fidelity function R(G) (G denotes SeMI, and R(G) extends R(D)) and its applications to multi-label learning, maximum Mutual Information (MI) classification, and mixture models. It then discusses how we should understand the relationships between SeMI and Shannon’s MI, two generalized entropies (fuzzy entropy and coverage entropy), autoencoders, Gibbs distributions, and partition functions from the perspective of the R(G) function, or the G theory. An important conclusion is that mixture models and Restricted Boltzmann Machines converge because SeMI is maximized and Shannon’s MI is minimized, making the information efficiency G/R close to 1. A potential opportunity is to simplify deep learning by using Gaussian channel mixture models to pre-train the latent layers of deep neural networks without considering gradients. The paper also discusses how the SeMI measure can be used as the reward function (reflecting purposiveness) for reinforcement learning. The G theory helps interpret deep learning but is far from sufficient; combining semantic information theory and deep learning will accelerate the development of both.
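
The InfoNCE objective mentioned above is commonly implemented as a cross-entropy over a similarity matrix whose diagonal holds the positive pairs; a standard sketch (not code from the review) is given below, with log N minus the loss serving as a lower estimate of mutual information:

    # Common form of the InfoNCE objective (a contrastive lower bound on mutual
    # information); a standard sketch, not code from the review.
    import torch
    import torch.nn.functional as F

    def info_nce_loss(z_x: torch.Tensor, z_y: torch.Tensor, temperature: float = 0.1):
        """z_x, z_y: (N, d) embeddings of N positive pairs; other rows act as negatives."""
        z_x = F.normalize(z_x, dim=1)
        z_y = F.normalize(z_y, dim=1)
        logits = z_x @ z_y.t() / temperature          # (N, N) similarity matrix
        labels = torch.arange(z_x.size(0))            # positives lie on the diagonal
        return F.cross_entropy(logits, labels)        # -E[log softmax of the true pair]

    # A lower estimate of mutual information follows as log(N) - loss.
    loss = info_nce_loss(torch.randn(32, 128), torch.randn(32, 128))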
