Special Issue "Entropy Based Inference and Optimization in Machine Learning"

A special issue of Entropy (ISSN 1099-4300).

Deadline for manuscript submissions: 20 December 2019.

Special Issue Editors

Guest Editor
Prof. Stephen Roberts

Department of Engineering Science & Oxford-Man Institute of Quantitative Finance, University of Oxford, Oxford, England, UK
Interests: Bayesian machine learning; inference and Bayesian optimization; applications in physics and finance
Guest Editor
Dr. Stefan Zohren

Department of Engineering Science & Oxford-Man Institute of Quantitative Finance, University of Oxford, Oxford, England, UK
Interests: statistical physics of inference and optimization; quantum optimization; machine learning applied to finance

Special Issue Information

Dear Colleagues,

Many modern machine learning algorithms are deeply rooted in the principles of statistical and information physics. A prominent example is the method of Maximum Entropy and its relations to Bayesian inference and optimization. Entropy-based methods have found many applications in modern machine learning, ranging from natural language processing to the development of approximate algorithms for large-scale data analysis. This special issue focuses on recent advances in entropy-based methods for inference and optimization problems in machine learning. We welcome submissions making novel contributions to the subject, both foundational and applied.

Prof. Stephen Roberts
Dr. Stefan Zohren
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Method of Maximum Entropy
  • Applications in Machine Learning
  • Statistical Physics of Learning Algorithms
  • Information Physics
  • Bayesian Inference
  • Bayesian Optimization
  • Approximate Algorithms
  • Large Scale Data Analysis

Published Papers (7 papers)


Research

Open Access Article
Maximum Entropy Analysis of Flow Networks: Theoretical Foundation and Applications
Entropy 2019, 21(8), 776; https://doi.org/10.3390/e21080776
Received: 24 June 2019 / Revised: 29 July 2019 / Accepted: 31 July 2019 / Published: 8 August 2019
Abstract
The concept of a “flow network”—a set of nodes and links which carries one or more flows—unites many different disciplines, including pipe flow, fluid flow, electrical, chemical reaction, ecological, epidemiological, neurological, communications, transportation, financial, economic and human social networks. This Feature Paper presents a generalized maximum entropy framework to infer the state of a flow network, including its flow rates and other properties, in probabilistic form. In this method, the network uncertainty is represented by a joint probability function over its unknowns, subject to all that is known. This gives a relative entropy function which is maximized, subject to the constraints, to determine the most probable or most representative state of the network. The constraints can include “observable” constraints on various parameters, “physical” constraints such as conservation laws and frictional properties, and “graphical” constraints arising from uncertainty in the network structure itself. Since the method is probabilistic, it enables the prediction of network properties when there is insufficient information to obtain a deterministic solution. The derived framework can incorporate nonlinear constraints or nonlinear interdependencies between variables, at the cost of requiring numerical solution. The theoretical foundations of the method are first presented, followed by its application to a variety of flow networks.
(This article belongs to the Special Issue Entropy Based Inference and Optimization in Machine Learning)

Open Access Feature Paper Article
A General Framework for Fair Regression
Entropy 2019, 21(8), 741; https://doi.org/10.3390/e21080741
Received: 29 April 2019 / Revised: 17 July 2019 / Accepted: 22 July 2019 / Published: 29 July 2019
Abstract
Fairness, through its many forms and definitions, has become an important issue facing the machine learning community. In this work, we consider how to incorporate group fairness constraints into kernel regression methods, applicable to Gaussian processes, support vector machines, neural network regression and decision tree regression. Further, we focus on examining the effect of incorporating these constraints in decision tree regression, with direct applications to random forests and boosted trees, amongst other widely used inference techniques. We show that the order of complexity of memory and computation is preserved for such models, and we tightly bound the expected perturbations to the model in terms of the number of leaves of the trees. Importantly, the approach works on trained models and hence can be easily applied to models in current use, and group labels are only required on training data.

Open Access Article
Entropic Regularization of Markov Decision Processes
Entropy 2019, 21(7), 674; https://doi.org/10.3390/e21070674
Received: 14 June 2019 / Revised: 6 July 2019 / Accepted: 8 July 2019 / Published: 10 July 2019
Abstract
An optimal feedback controller for a given Markov decision process (MDP) can in principle be synthesized by value or policy iteration. However, if the system dynamics and the reward function are unknown, a learning agent must discover an optimal controller via direct interaction with the environment. Such interactive data gathering commonly leads to divergence towards dangerous or uninformative regions of the state space unless additional regularization measures are taken. Prior works proposed bounding the information loss measured by the Kullback–Leibler (KL) divergence at every policy improvement step to eliminate instability in the learning dynamics. In this paper, we consider a broader family of f-divergences, and more concretely α-divergences, which inherit the beneficial property of providing the policy improvement step in closed form while at the same time yielding a corresponding dual objective for policy evaluation. Such an entropic proximal policy optimization view gives a unified perspective on compatible actor-critic architectures. In particular, common least-squares value function estimation coupled with advantage-weighted maximum likelihood policy improvement is shown to correspond to the Pearson χ²-divergence penalty. Other actor-critic pairs arise for various choices of the penalty-generating function f. On a concrete instantiation of our framework with the α-divergence, we carry out asymptotic analysis of the solutions for different values of α and demonstrate the effects of the divergence function choice on common standard reinforcement learning problems.

Open Access Article
Online Gradient Descent for Kernel-Based Maximum Correntropy Criterion
Entropy 2019, 21(7), 644; https://doi.org/10.3390/e21070644
Received: 15 May 2019 / Revised: 14 June 2019 / Accepted: 24 June 2019 / Published: 29 June 2019
Abstract
In the framework of statistical learning, we study the online gradient descent algorithm generated by the correntropy-induced losses in reproducing kernel Hilbert spaces (RKHS). As a generalized correlation measurement, correntropy has been widely applied in practice, owing to its prominent merits in robustness. Although the online gradient descent method is an efficient way to deal with the maximum correntropy criterion (MCC) in nonparametric estimation, there has been no consistency analysis or rigorous error bounds. We provide a theoretical understanding of the online algorithm for MCC, and show that, with a suitably chosen scaling parameter, its convergence rate can be min–max optimal (up to a logarithmic factor) in the regression analysis. Our results show that the scaling parameter plays an essential role in both robustness and consistency.

Open Access Feature Paper Article
MEMe: An Accurate Maximum Entropy Method for Efficient Approximations in Large-Scale Machine Learning
Entropy 2019, 21(6), 551; https://doi.org/10.3390/e21060551
Received: 30 April 2019 / Revised: 25 May 2019 / Accepted: 29 May 2019 / Published: 31 May 2019
Abstract
Efficient approximation lies at the heart of large-scale machine learning problems. In this paper, we propose a novel, robust maximum entropy algorithm, which is capable of dealing with hundreds of moments and allows for computationally efficient approximations. We showcase the usefulness of the proposed method, its equivalence to constrained Bayesian variational inference and demonstrate its superiority over existing approaches in two applications, namely, fast log determinant estimation and information-theoretic Bayesian optimisation.

Open Access Article
Compact Belief Rule Base Learning for Classification with Evidential Clustering
Entropy 2019, 21(5), 443; https://doi.org/10.3390/e21050443
Received: 27 March 2019 / Revised: 24 April 2019 / Accepted: 28 April 2019 / Published: 28 April 2019
Abstract
The belief rule-based classification system (BRBCS) is a promising technique for addressing different types of uncertainty in complex classification problems, by introducing the belief function theory into the classical fuzzy rule-based classification system. However, in the BRBCS, high numbers of instances and features generally induce a belief rule base (BRB) with large size, which degrades the interpretability of the classification model for big data sets. In this paper, a BRB learning method based on the evidential C-means clustering (ECM) algorithm is proposed to efficiently design a compact belief rule-based classification system (CBRBCS). First, a supervised version of the ECM algorithm is designed by means of weighted product-space clustering to partition the training set with the goals of obtaining both good inter-cluster separability and inner-cluster pureness. Then, a systematic method is developed to construct belief rules based on the obtained credal partitions. Finally, an evidential partition entropy-based optimization procedure is designed to get a compact BRB with a better trade-off between accuracy and interpretability. The key benefit of the proposed CBRBCS is that it can provide a more interpretable classification model on the premise of comparative accuracy. Experiments based on synthetic and real data sets have been conducted to evaluate the classification accuracy and interpretability of the proposal.

Open Access Article
On Using Linear Diophantine Equations for in-Parallel Hiding of Decision Tree Rules
Entropy 2019, 21(1), 66; https://doi.org/10.3390/e21010066
Received: 10 December 2018 / Revised: 1 January 2019 / Accepted: 10 January 2019 / Published: 14 January 2019
Abstract
Data sharing among organizations has become an increasingly common procedure in several areas such as advertising, marketing, electronic commerce, banking, and insurance sectors. However, any organization will most likely try to keep some patterns as hidden as possible once it shares its datasets with others. This paper focuses on preserving the privacy of sensitive patterns when inducing decision trees. We adopt a record augmentation approach to hide critical classification rules in binary datasets. Such a hiding methodology is preferred over other heuristic solutions like output perturbation or cryptographic techniques, which limit the usability of the data, since the raw data itself is readily available for public use. We propose a look ahead technique using linear Diophantine equations to add the appropriate number of instances while maintaining the initial entropy of the nodes. This method can be used to hide one or more decision tree rules optimally.

Planned Papers

The list below represents only planned manuscripts. Some of these manuscripts have not yet been received by the Editorial Office. Papers submitted to MDPI journals are subject to peer review.

Title: Maximum Entropy Analysis of Flow Networks: Theoretical Foundation and Applications
Authors: Robert K. Niven, Markus Abel, Michael Schlegel and Steven H. Waldrip
Entropy, EISSN 1099-4300, published by MDPI AG, Basel, Switzerland