Special Issue "Bayesian Inference and Information Theory"

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Information Theory, Probability and Statistics".

Deadline for manuscript submissions: closed (30 April 2019)

Special Issue Editors

Guest Editor
Prof. Dr. Kevin H. Knuth

Department of Physics, University at Albany, 1400 Washington Avenue, Albany, NY 12222, USA
Phone: +1 518 209 0734
Fax: +1 518 442 5260
Interests: entropy; probability theory; Bayesian; foundational issues; lattice theory; data analysis; MaxEnt; machine learning; robotics; information theory; entropy-based experimental design
Guest Editor
Dr. Brendon J. Brewer

Department of Statistics, The University of Auckland, Private Bag 92019, Auckland 1142, New Zealand
Interests: Bayesian inference; Markov chain Monte Carlo; nested sampling; MaxEnt

Special Issue Information

Dear Colleagues,

In Bayesian inference, probabilities describe plausibility, or the degree to which one statement implies another. In a similar manner, the entropies of information theory describe relevance, or the degree to which resolving one question would resolve another; a schematic statement of this parallel is given after the list below. However, this latter understanding is relatively undeveloped and is not often used in practical Bayesian data analysis. In this Special Issue, we invite contributions in the area of Bayesian inference and information theory. The following suggested subtopics are of particular interest:

- Foundations of Bayesian inference and information theory

- Applications of Bayesian inference involving well-motivated uses of information theoretic concepts

- Bayesian experimental design

- Maximum entropy and choice of prior distributions
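For concreteness, the parallel noted above can be stated schematically in standard notation (a sketch added here for orientation, not part of the original call text): Bayes' theorem quantifies how plausible a hypothesis h becomes given data d, while mutual information quantifies how much resolving one question Y would resolve another question X.

```latex
% Plausibility: the degree to which d implies h
P(h \mid d) \;=\; \frac{P(d \mid h)\,P(h)}{P(d)}

% Relevance: the degree to which resolving Y resolves X
I(X;Y) \;=\; H(X) - H(X \mid Y)
       \;=\; \sum_{x,y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}
```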

We look forward to receiving your contributions.

Prof. Dr. Kevin H. Knuth
Dr. Brendon J. Brewer
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website; once registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Bayesian inference;
  • information theory;
  • Bayesian data analysis;
  • maximum entropy;
  • prior distributions;
  • Kullback–Leibler divergence;
  • relevance

Published Papers (13 papers)


Research


Open Access Article
Discriminative Structure Learning of Bayesian Network Classifiers from Training Dataset and Testing Instance
Entropy 2019, 21(5), 489; https://doi.org/10.3390/e21050489
Received: 12 February 2019 / Revised: 29 April 2019 / Accepted: 6 May 2019 / Published: 13 May 2019
Abstract
Over recent decades, the rapid growth in data has made ever more urgent the quest for highly scalable Bayesian networks with better classification performance and expressivity (that is, the capacity to describe dependence relationships between attributes in different situations). To reduce the search space of possible attribute orders, the k-dependence Bayesian classifier (KDB) simply applies mutual information to sort attributes. This sorting strategy is very efficient, but it neglects the conditional dependencies between attributes and is sub-optimal. In this paper, we propose a novel sorting strategy and extend KDB from a single restricted network to unrestricted ensemble networks, i.e., the unrestricted Bayesian classifier (UKDB), in terms of Markov blanket analysis and target learning. Target learning is a framework that takes each unlabeled testing instance P as a target and builds a specific Bayesian network classifier BNC_P to complement the classifier BNC_T learned from the training data T. UKDB introduces UKDB_P and UKDB_T to flexibly describe, respectively, the change in dependence relationships for different testing instances and the robust dependence relationships implicit in the training data. Both use UKDB as the base classifier, applying the same learning strategy while modeling different parts of the data space; thus, they are complementary in nature. Extensive experimental results on the Wisconsin breast cancer database (as a case study) and 10 other datasets, involving classifiers of different structural complexity such as naive Bayes (0-dependence), tree-augmented naive Bayes (1-dependence), and KDB (arbitrary k-dependence), demonstrate the effectiveness and robustness of the proposed approach.
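As a concrete illustration of the sorting step this paper builds on (a minimal sketch, assuming discrete attributes; variable names are illustrative and this is not the authors' code), standard KDB ranks attributes by their mutual information with the class:

```python
import numpy as np
from collections import Counter

def mutual_information(x, y):
    """Empirical mutual information I(X; C) between one discrete
    attribute x and the class labels y (natural log)."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(nxy / n * np.log((nxy / n) / ((px[a] / n) * (py[c] / n)))
               for (a, c), nxy in pxy.items())

def kdb_attribute_order(X, y):
    """Sort attribute indices by I(X_i; C), descending: the sorting
    strategy the paper calls efficient but sub-optimal, since it
    ignores conditional dependencies between attributes."""
    scores = [mutual_information(X[:, i], y) for i in range(X.shape[1])]
    return sorted(range(X.shape[1]), key=lambda i: -scores[i])

# toy usage: attribute 0 determines the class, attribute 1 is noise
X = np.array([[0, 1], [1, 1], [0, 0], [1, 0]])
y = np.array([0, 1, 0, 1])
print(kdb_attribute_order(X, y))  # [0, 1]
```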

Open Access Article
How the Choice of Distance Measure Influences the Detection of Prior-Data Conflict
Entropy 2019, 21(5), 446; https://doi.org/10.3390/e21050446
Received: 28 March 2019 / Revised: 16 April 2019 / Accepted: 23 April 2019 / Published: 29 April 2019
Abstract
The present paper contrasts two related criteria for the evaluation of prior-data conflict: the Data Agreement Criterion (DAC; Bousquet, 2008) and the criterion of Nott et al. (2016). One aspect that these criteria have in common is that they depend on a distance measure, of which dozens are available but of which, so far, only the Kullback–Leibler divergence has been used. We describe and compare both criteria to determine whether a different choice of distance measure might affect the results. By means of a simulation study, we investigate how the choice of a specific distance measure influences the detection of prior-data conflict. The DAC seems more susceptible to the choice of distance measure, while the criterion of Nott et al. leads to reasonably comparable conclusions of prior-data conflict regardless of the choice. We conclude with some practical suggestions for users of the DAC and the criterion of Nott et al.
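To make the role of the distance measure concrete, here is a minimal sketch (not the paper's simulation design; the closed forms are the standard ones for univariate Gaussians, and the Hellinger distance merely stands in for one of the dozens of alternatives) comparing two measures of discrepancy between a prior and a data-informed benchmark:

```python
import numpy as np

def kl_gauss(m0, s0, m1, s1):
    """KL(N(m0, s0^2) || N(m1, s1^2)): the divergence used so far
    in prior-data conflict criteria."""
    return np.log(s1 / s0) + (s0**2 + (m0 - m1)**2) / (2 * s1**2) - 0.5

def hellinger_gauss(m0, s0, m1, s1):
    """Squared Hellinger distance between two univariate Gaussians;
    one possible alternative distance measure."""
    bc = np.sqrt(2 * s0 * s1 / (s0**2 + s1**2)) * \
         np.exp(-((m0 - m1)**2) / (4 * (s0**2 + s1**2)))
    return 1.0 - bc

# prior N(0, 1) vs. a data-driven benchmark N(2, 0.5)
print(kl_gauss(0, 1, 2, 0.5))        # unbounded, sensitive to tail mismatch
print(hellinger_gauss(0, 1, 2, 0.5))  # bounded in [0, 1]
```

The two measures scale very differently for the same pair of distributions, which is exactly why a conflict criterion calibrated on one measure need not transfer to another.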

Open Access Article
Bayesian Network Modelling of ATC Complexity Metrics for Future SESAR Demand and Capacity Balance Solutions
Entropy 2019, 21(4), 379; https://doi.org/10.3390/e21040379
Received: 16 February 2019 / Revised: 29 March 2019 / Accepted: 5 April 2019 / Published: 8 April 2019
Abstract
Demand & Capacity Management solutions are key SESAR (Single European Sky ATM Research) research projects intended to adapt future airspace to the expected strong growth in air traffic in a Trajectory Based Operations (TBO) environment. These solutions rely on processes, methods, and metrics for assessing the complexity of traffic flows. However, current complexity methodologies and metrics do not properly account for the impact of trajectory uncertainty on the quality of complexity predictions of air traffic demand. This paper proposes several Bayesian network (BN) models to identify the impact of TBO uncertainties on the quality of complexity predictions of air traffic demand for two particular Demand Capacity Balance (DCB) solutions developed by SESAR 2020: Dynamic Airspace Configuration (DAC) and Flight Centric Air Traffic Control (FCA). In total, seven BN models are elicited, covering each concept at different time horizons. The models allow the influence of the “complexity generators” on the “complexity metrics” to be evaluated. Moreover, when the required level of uncertainty for the complexity predictions is set, the networks make it possible to identify by how much the uncertainty of the input variables should improve.
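A minimal sketch of the kind of query such BN models support (pure NumPy with hypothetical binary variables: a "complexity generator" G and a "complexity metric" M; these are not the elicited SESAR models):

```python
import numpy as np

# P(G): prior over a complexity generator (e.g., trajectory uncertainty)
p_g = np.array([0.7, 0.3])            # [low, high]

# P(M | G): conditional table for a complexity metric given the generator
p_m_given_g = np.array([[0.9, 0.1],   # G = low  -> metric mostly low
                        [0.4, 0.6]])  # G = high -> metric mostly high

# forward inference: marginal P(M)
p_m = p_g @ p_m_given_g
print(p_m)  # [0.75, 0.25]

# diagnostic inference via Bayes' rule: P(G | M = high)
p_g_given_m_high = p_g * p_m_given_g[:, 1] / p_m[1]
print(p_g_given_m_high)  # [0.28, 0.72]: a high metric implicates the generator
```

The elicited models perform the same two directions of inference, only with many more variables and time horizons.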

Open Access Article
A Mesoscopic Traffic Data Assimilation Framework for Vehicle Density Estimation on Urban Traffic Networks Based on Particle Filters
Entropy 2019, 21(4), 358; https://doi.org/10.3390/e21040358
Received: 11 January 2019 / Revised: 17 March 2019 / Accepted: 1 April 2019 / Published: 3 April 2019
Abstract
Traffic conditions can be estimated more accurately using data assimilation techniques, since these methods combine an imperfect traffic simulation model with (partial) noisy measurement data. In this paper, we propose a data assimilation framework for vehicle density estimation on urban traffic networks. To compromise between computational efficiency and estimation accuracy, a mesoscopic traffic simulation model (we choose the platoon-based model) is employed in this framework. Vehicle passages from loop detectors are used as the measurement data, which contain errors such as missed and false detections. Due to the nonlinear and non-Gaussian nature of the problem, particle filters are adopted to carry out the state estimation, since this method places no restrictions on the model dynamics or error assumptions. Simulation experiments are carried out to test the proposed data assimilation framework, and the results show that it can provide good vehicle density estimates on relatively large urban traffic networks under moderate sensor quality. A sensitivity analysis shows that the proposed framework is robust to errors both in the model and in the measurements.
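For readers unfamiliar with the estimator, a generic bootstrap particle filter has the shape below (a self-contained toy with a scalar random-walk state; the paper's framework applies the same predict-weight-resample recursion with a mesoscopic traffic model as the transition and loop-detector passages as measurements):

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter(observations, n_particles=1000,
                    proc_std=1.0, obs_std=2.0):
    """Bootstrap particle filter for a toy random-walk state observed
    in Gaussian noise; returns the posterior mean at each step."""
    particles = rng.normal(0.0, 5.0, n_particles)  # initial ensemble
    estimates = []
    for z in observations:
        # 1. propagate each particle through the (imperfect) model
        particles = particles + rng.normal(0.0, proc_std, n_particles)
        # 2. weight particles by the measurement likelihood
        weights = np.exp(-0.5 * ((z - particles) / obs_std) ** 2)
        weights /= weights.sum()
        # 3. resample to avoid weight degeneracy
        idx = rng.choice(n_particles, n_particles, p=weights)
        particles = particles[idx]
        estimates.append(particles.mean())
    return np.array(estimates)

true_state = np.cumsum(rng.normal(0, 1, 50))
obs = true_state + rng.normal(0, 2, 50)
print(particle_filter(obs)[-5:])  # tracks the true state
```

Because the weighting step only requires evaluating a likelihood, the recursion accepts arbitrary (nonlinear, non-Gaussian) transition and measurement models, which is the property the paper exploits.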

Open Access Article
PID Control as a Process of Active Inference with Linear Generative Models
Entropy 2019, 21(3), 257; https://doi.org/10.3390/e21030257
Received: 18 January 2019 / Revised: 20 February 2019 / Accepted: 3 March 2019 / Published: 7 March 2019
Abstract
In the past few decades, probabilistic interpretations of brain function have become widespread in cognitive science and neuroscience. In particular, the free energy principle and active inference are increasingly popular theories of cognitive function that claim to offer a unified understanding of life and cognition within a general mathematical framework derived from information and control theory and statistical mechanics. However, we argue that if the active inference proposal is to be taken as a general process theory for biological systems, it is necessary to understand how it relates to existing control-theoretical approaches routinely used to study and explain biological systems. For example, PID (Proportional-Integral-Derivative) control has recently been shown to be implemented in simple molecular systems and is becoming a popular mechanistic explanation of behaviours such as chemotaxis in bacteria and amoebae and robust adaptation in biochemical networks. In this work, we show how PID controllers fit a more general theory of life and cognition under the principle of (variational) free energy minimisation when using approximate linear generative models of the world. This more general interpretation also provides a new perspective on traditional problems of PID controllers, such as parameter tuning and the need to balance the performance and robustness of a controller. Specifically, we show how these problems can be understood in terms of optimising the precisions (inverse variances) that modulate different prediction errors in the free energy functional.
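For reference, the discrete-time PID law the paper reinterprets has the standard textbook form (a generic sketch with illustrative gains; the active inference reading of the gains as precisions is the paper's contribution and is only noted in comments here):

```python
class PID:
    """Textbook discrete-time PID controller.

    In the active inference reading, kp, ki, and kd play the role of
    precisions (inverse variances) weighting proportional, integral,
    and derivative prediction errors in a linear generative model."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# regulate a leaky first-order plant dx/dt = -x + u toward setpoint 1.0
pid, x = PID(kp=2.0, ki=0.5, kd=0.1, dt=0.01), 0.0
for _ in range(2000):
    x += (-x + pid.step(1.0, x)) * 0.01
print(round(x, 3))  # settles near the setpoint 1.0
```

Tuning kp, ki, and kd by hand is the classical problem; the paper recasts it as optimising the corresponding precisions in the free energy functional.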

Open Access Article
Hidden Node Detection between Observable Nodes Based on Bayesian Clustering
Entropy 2019, 21(1), 32; https://doi.org/10.3390/e21010032
Received: 15 November 2018 / Revised: 14 December 2018 / Accepted: 3 January 2019 / Published: 7 January 2019
Abstract
Structure learning is one of the main concerns in studies of Bayesian networks. In the present paper, we consider networks consisting of both observable and hidden nodes, and propose a method to investigate the existence of a hidden node between observable nodes, where all nodes are discrete. This corresponds to the model selection problem between the networks with and without the middle hidden node. When the network includes a hidden node, it is known that there are singularities in the parameter space and that the Fisher information matrix is not positive definite, so many conventional criteria for structure learning based on the Laplace approximation do not work. The proposed method is based on Bayesian clustering, and its asymptotic properties justify the result: the redundant labels are eliminated and the simplest structure is detected even in the presence of singularities.
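For context, the Laplace-approximation criteria referred to here approximate the log marginal likelihood in the BIC style (a standard fact stated for orientation, not the paper's derivation):

```latex
\log p(D) \;\approx\; \log p(D \mid \hat{\theta}) \;-\; \frac{d}{2}\,\log n
```

This approximation presumes a positive-definite Fisher information matrix at the optimum; at the singularities introduced by a hidden node, the effective penalty coefficient is smaller than d/2, so BIC-type scores misestimate the marginal likelihood, which is why the proposed method works through Bayesian clustering instead.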

Open Access Article
Application of Bayesian Networks and Information Theory to Estimate the Occurrence of Mid-Air Collisions Based on Accident Precursors
Entropy 2018, 20(12), 969; https://doi.org/10.3390/e20120969
Received: 12 November 2018 / Revised: 8 December 2018 / Accepted: 11 December 2018 / Published: 14 December 2018
Cited by 3
Abstract
This paper combines Bayesian networks (BN) and information theory to model the likelihood of severe loss of separation (LOS) near accidents, which are considered mid-air collision (MAC) precursors. The BN is used to analyze LOS contributing factors and the multi-dependent relationships among causal factors, while information theory is used to identify the LOS precursors that provide the most information. The combination of the two techniques allows us to use data on LOS causes and precursors to define warning scenarios that could forecast a major LOS with severity A or a near accident, and consequently the likelihood of a MAC. The methodology is illustrated with a case study covering the LOS events that took place within Spanish airspace over a period of four years.
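A sketch of the information-theoretic half of the approach (hypothetical precursor names and counts, not the Spanish airspace data): candidate precursors can be ranked by the mutual information they share with LOS severity.

```python
import numpy as np

def mutual_info_from_counts(counts):
    """Mutual information (in bits) from a 2-D contingency table of
    joint counts of a precursor variable and the LOS severity."""
    p = counts / counts.sum()
    px = p.sum(axis=1, keepdims=True)
    py = p.sum(axis=0, keepdims=True)
    mask = p > 0
    return float((p[mask] * np.log2(p[mask] / (px @ py)[mask])).sum())

# hypothetical joint counts: rows = precursor absent/present,
# columns = LOS severity (minor, major)
precursors = {
    "late ATC clearance": np.array([[80, 5], [10, 25]]),
    "radio congestion":   np.array([[60, 20], [30, 10]]),
}
ranked = sorted(precursors, key=lambda k: -mutual_info_from_counts(precursors[k]))
print(ranked)  # most informative precursor first
```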

Open Access Article
Bayesian Inference in Auditing with Partial Prior Information Using Maximum Entropy Priors
Entropy 2018, 20(12), 919; https://doi.org/10.3390/e20120919
Received: 15 October 2018 / Revised: 15 November 2018 / Accepted: 28 November 2018 / Published: 1 December 2018
Abstract
Problems in statistical auditing are usually one-sided. In fact, the main interest for auditors is to determine the quantiles of the total amount of error and to compare these quantiles with a given materiality fixed by the auditor, so that the accounting statement can be accepted or rejected. Dollar unit sampling (DUS) is a useful procedure for collecting sample information, whereby items are chosen with probability proportional to their book amounts and the relevant error-amount distribution is the distribution of the taints weighted by the book value. The likelihood induced by DUS involves a 201-variate parameter p, but the prior information bears on a subparameter θ, a linear function of p representing the total amount of error. This means that partial prior information must be processed. In this paper, two main proposals are made: (1) to modify the likelihood to make it compatible with the prior information, thus obtaining a Bayesian analysis for the hypotheses to be tested; and (2) to use a maximum entropy prior to incorporate limited auditor information. To achieve these goals, we obtain a modified likelihood function inspired by the induced likelihood described by Zehna (1966) and then adapt Bayes' theorem to this likelihood in order to derive a posterior distribution for θ. This approach shows that the DUS methodology can be justified as a natural method of processing partial prior information in auditing, and that a Bayesian analysis can be performed even when prior information is available only for a subparameter of the model. Finally, some numerical examples are presented.
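The maximum entropy construction invoked here is the standard one (stated generically as a sketch; the paper develops it for the auditing subparameter θ): among all priors satisfying the auditor's partial information, expressed as expectation constraints, the entropy maximizer is a member of the exponential family.

```latex
\max_{p}\; -\!\int p(\theta)\,\log p(\theta)\,d\theta
\quad \text{s.t.} \quad
\int p(\theta)\,d\theta = 1, \quad
\int f_k(\theta)\,p(\theta)\,d\theta = c_k
\;\;\Longrightarrow\;\;
p^{*}(\theta) \;\propto\; \exp\!\Big(\sum_{k}\lambda_k f_k(\theta)\Big)
```

The multipliers λ_k are fixed by the constraints; for instance, a known mean for θ on a bounded interval yields a truncated exponential prior.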

Open Access Article
Efficient Heuristics for Structure Learning of k-Dependence Bayesian Classifier
Entropy 2018, 20(12), 897; https://doi.org/10.3390/e20120897
Received: 18 October 2018 / Revised: 13 November 2018 / Accepted: 20 November 2018 / Published: 22 November 2018
Abstract
The rapid growth in data makes the quest for highly scalable learners a popular one. To achieve a trade-off between structure complexity and classification accuracy, the k-dependence Bayesian classifier (KDB) allows a different number of interdependencies to be represented for different data sizes. In this paper, we propose two methods to improve the classification performance of KDB. First, we use minimal-redundancy-maximal-relevance (mRMR) analysis, which sorts the predictive features so as to identify redundant ones. Second, we propose an improved discriminative model selection that selects an optimal sub-model by removing redundant features and arcs from the Bayesian network. Experimental results on 40 UCI datasets demonstrate that the two techniques are complementary and that the proposed algorithm achieves competitive classification performance and lower classification time than state-of-the-art Bayesian network classifiers such as tree-augmented naive Bayes and averaged one-dependence estimators.
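A minimal sketch of the mRMR scoring used by the first method (greedy form, assuming discrete features; illustrative, not the authors' implementation, with the empirical mutual information taken from scikit-learn's mutual_info_score):

```python
import numpy as np
from sklearn.metrics import mutual_info_score  # empirical MI for discrete data

def mrmr_order(X, y):
    """Greedy minimal-redundancy-maximal-relevance ordering: at each
    step pick the feature with the best relevance-minus-redundancy
    trade-off relative to the already selected set."""
    n_features = X.shape[1]
    selected, remaining = [], list(range(n_features))
    while remaining:
        def score(i):
            relevance = mutual_info_score(X[:, i], y)
            redundancy = (np.mean([mutual_info_score(X[:, i], X[:, j])
                                   for j in selected]) if selected else 0.0)
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# feature 1 is an exact copy of feature 0; feature 2 is weaker but fresh
X = np.array([[0, 0, 0], [0, 0, 1], [1, 1, 0], [1, 1, 1],
              [0, 0, 0], [0, 0, 1], [1, 1, 1], [0, 0, 1]])
y = np.array([0, 0, 1, 1, 0, 0, 1, 1])
print(mrmr_order(X, y))  # [0, 2, 1]: the redundant copy is deferred
```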

Open Access Article
Ranking the Impact of Different Tests on a Hypothesis in a Bayesian Network
Entropy 2018, 20(11), 856; https://doi.org/10.3390/e20110856
Received: 31 August 2018 / Revised: 21 October 2018 / Accepted: 31 October 2018 / Published: 7 November 2018
Abstract
Testing of evidence in criminal cases can be limited by temporal or financial constraints, or by the fact that certain tests may be mutually exclusive, so choosing the tests that will have maximal impact on the final result is essential. In this paper, we assume that a main hypothesis, the evidence for it, and the possible tests for the existence of this evidence are represented in the form of a Bayesian network, and we use three different methods to measure the impact of a test on the main hypothesis. We illustrate the methods by applying them to an actual digital crime case provided by the Hong Kong police. We conclude that the Kullback–Leibler divergence is the optimal method for selecting the tests with the highest impact.
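A sketch of the KL-based selection the paper recommends (toy numbers and hypothetical test names, not the Hong Kong case): score each candidate test by the expected KL divergence between the updated and current beliefs about the hypothesis, then rank.

```python
import numpy as np

def expected_kl_impact(p_h, p_t_given_h):
    """Expected KL divergence between the posterior P(H | T) and the
    current belief P(H), averaged over test outcomes; this equals the
    mutual information I(H; T) and serves as the test's score."""
    p_joint = p_h[:, None] * p_t_given_h          # P(H, T)
    p_t = p_joint.sum(axis=0)                     # P(T)
    impact = 0.0
    for t in range(p_t_given_h.shape[1]):
        post = p_joint[:, t] / p_t[t]             # P(H | T = t)
        impact += p_t[t] * np.sum(post * np.log(post / p_h))
    return impact

p_h = np.array([0.5, 0.5])  # current belief in the main hypothesis
tests = {
    # P(outcome | H); rows: H false/true, columns: outcome -/+
    "file hash match":  np.array([[0.95, 0.05], [0.10, 0.90]]),
    "login-time check": np.array([[0.60, 0.40], [0.40, 0.60]]),
}
for name, lik in tests.items():
    print(name, round(expected_kl_impact(p_h, lik), 3))
# the near-decisive hash test carries far more expected information
```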

Open Access Article
Using the Data Agreement Criterion to Rank Experts’ Beliefs
Entropy 2018, 20(8), 592; https://doi.org/10.3390/e20080592
Received: 30 May 2018 / Revised: 7 August 2018 / Accepted: 7 August 2018 / Published: 9 August 2018
Cited by 1
Abstract
Experts’ beliefs embody a present state of knowledge, and it is desirable to take this knowledge into account when making decisions. However, ranking experts based on the merit of their beliefs is a difficult task. In this paper, we show how experts can be ranked based on their knowledge and their level of (un)certainty. By letting experts specify their knowledge in the form of a probability distribution, we can assess how accurately they can predict new data and how appropriate their level of (un)certainty is. The expert’s specified probability distribution can be seen as a prior in a Bayesian statistical setting. We evaluate these priors by extending an existing prior-data (dis)agreement measure, the Data Agreement Criterion, and we compare this approach to using Bayes factors to assess prior specification. We compare the experts with each other and with the data to evaluate their appropriateness. Using this method, new research questions can be asked and answered, for instance: Which expert predicts the new data best? Is there agreement between my experts and the data? Which expert’s representation is more valid or useful? Can we reach convergence between expert judgement and data? We provide an empirical example ranking (regional) directors of a large financial institution based on their predictions of turnover.
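One simple way to see the ingredients (a toy sketch, not the extended DAC the paper develops; names and numbers are invented): treat each expert's specified distribution as a prior density and score it by the log density it assigns to new data, which rewards both accuracy and well-calibrated (un)certainty.

```python
import numpy as np
from scipy.stats import norm

# each expert encodes beliefs about turnover as a Normal density
experts = {
    "director A": norm(loc=10.0, scale=1.0),   # confident and close
    "director B": norm(loc=10.0, scale=5.0),   # right location, vague
    "director C": norm(loc=16.0, scale=1.0),   # confident but off
}
new_data = np.array([9.5, 10.2, 10.8, 9.9])

# mean log predictive density of the new data under each expert's belief
scores = {name: d.logpdf(new_data).mean() for name, d in experts.items()}
for name in sorted(scores, key=scores.get, reverse=True):
    print(name, round(scores[name], 2))
# director A ranks first; C's misplaced confidence is punished hardest
```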

Open Access Article
A Definition of Conditional Probability with Non-Stochastic Information
Entropy 2018, 20(8), 572; https://doi.org/10.3390/e20080572
Received: 21 June 2018 / Revised: 27 July 2018 / Accepted: 31 July 2018 / Published: 3 August 2018
Abstract
The current definition of conditional probability enables one to update probabilities only on the basis of stochastic information. This paper provides a definition of conditional probability with non-stochastic information. The definition is derived from a set of axioms in which the information is connected to the outcome of interest via a loss function. An illustration is presented.

Other


Open Access Correction
Correction: Veen, D.; Stoel, D.; Schalken, N.; Mulder, K.; Van de Schoot, R. Using the Data Agreement Criterion to Rank Experts’ Beliefs. Entropy 2018, 20, 592
Entropy 2019, 21(3), 307; https://doi.org/10.3390/e21030307
Received: 13 March 2019 / Revised: 14 March 2019 / Accepted: 15 March 2019 / Published: 21 March 2019
Abstract
Due to a coding error, the marginal likelihoods were not correctly calculated for the empirical example, and thus the Bayes factors following from these marginal likelihoods are incorrect. The required corrections occur in Section 3.2 and in two paragraphs of the discussion in which the results are referred to. The corrections have limited consequences for the paper, and the main conclusions hold. Additionally, typos in the equations and an error in the numbering of the equations are remedied.