Information Theoretic Measures and Their Applications
The Contributions
- Attention to the Variation of Probabilistic Events: Information Processing with Message Importance Measure. By She, R.; Liu, S.; Fan, P. [1] Events with different probabilities attract different degrees of attention in many scenarios, such as anomaly detection and security systems. To characterize the importance of events from a probabilistic perspective, the message importance measure (MIM) was proposed as a semantic analysis tool. Like Shannon entropy, the MIM plays a special role in information representation, and its parameter is vital: it governs the properties of the MIM and defines three working regions in which the measure can be used flexibly for different goals. When the parameter is positive but not too large, the MIM not only provides a new viewpoint for information processing but also behaves similarly to Shannon entropy in information compression and transmission. In this regard, the present work first constructs a system model based on the MIM and proposes the message importance loss to enrich information processing strategies. Moreover, the message importance loss capacity is introduced to measure the harvest of important information in a transmission. Furthermore, the message importance distortion function is discussed to give an upper bound on information compression based on the MIM. Additionally, the bit-rate transmission constrained by the message importance loss is investigated to broaden the scope of Shannon information theory. (A toy MIM computation appears after this list.)
- Melodies as Maximally Disordered Systems under Macroscopic Constraints with Musical Meaning. By Useche, J.; Hurtado, R. [2] One of the most relevant features of musical pieces is the selection and utilization of musical elements by composers. To connect the musical properties of a melodic line as a whole with those of its constituent elements, the authors propose a representation of musical intervals based on physical quantities and a statistical model based on the minimization of relative entropy. The representation contains information about the size, location in the register, and level of tonal consonance of musical intervals. The statistical model involves expected values of relevant physical quantities that can be adopted as macroscopic constraints with musical meaning. The authors studied the occurrences of musical intervals in 20 melodic lines from seven masterpieces of Western tonal music. They found that all melodic lines are strictly ordered in terms of the physical quantities of the representation and that the formalism is suitable for approximately reproducing the final selection of musical intervals made by the composers, as well as for describing musical features such as the asymmetry in the use of ascending and descending intervals, transposition processes, and the mean dissonance of a melodic line. (A minimum-relative-entropy sketch appears after this list.)
- Discriminatory Target Learning: Mining Significant Dependence Relationships from Labeled and Unlabeled Data. By Duan, Z.; Wang, L.; Mammadov, M.; Lou, H.; Sun, M. [3] Machine learning techniques have shown superior predictive power, among which Bayesian network classifiers (BNCs) have remained of great interest due to their capacity to represent complex dependence relationships. Most traditional BNCs build only one model to fit the training instances, analyzing independence between attributes using conditional mutual information. However, for different class labels, the conditional dependence relationships may vary rather than stay invariant as attributes take different values, which may result in classification bias. To address this issue, the authors propose a novel framework, called discriminatory target learning, which can be regarded as a trade-off between probabilistic models learned from unlabeled instances at the uncertain end and those learned from labeled training data at the certain end. The final model can discriminately represent the dependence relationships hidden in unlabeled instances with respect to different possible class labels. Taking the k-dependence Bayesian classifier as an example, experimental comparison on 42 publicly available datasets indicated that the final model achieves competitive classification performance compared to state-of-the-art learners such as random forests and averaged one-dependence estimators.
- Structure Extension of Tree-Augmented Naive Bayes. By Long, Y.; Wang, L.; Sun, M. [4] Due to the simplicity and competitive classification performance of naive Bayes (NB), researchers have proposed many approaches that improve NB by weakening its attribute independence assumption. A theoretical analysis based on the Kullback–Leibler divergence shows that the difference between NB and its variations lies in the different orders of conditional mutual information represented by the augmenting edges of the tree-shaped network structure. In the present work, the authors propose to relax the independence assumption by further generalizing tree-augmented naive Bayes (TAN) from 1-dependence Bayesian network classifiers (BNCs) to arbitrary k-dependence. Sub-models of TAN, each built to represent a specific conditional dependence relationship, may "best match" the conditional probability distribution over the training data. Extensive experimental results reveal that the proposed algorithm achieves a bias-variance trade-off and substantially better generalization performance than state-of-the-art classifiers such as logistic regression. (A conditional mutual information sketch appears after this list.)
- Permutation Entropy and Statistical Complexity Analysis of Brazilian Agricultural Commodities. By de Araujo, F.; Bejan, L.; Rosso, O. A.; Stosic, T. [5] Agricultural commodities are considered perhaps the most important commodities, as any abrupt increase in food prices has serious consequences on food security and welfare, especially in developing countries. In this work, the authors analyze the predictability of Brazilian agricultural commodity prices during the period after the 2007/2008 food crisis. They use the complexity–entropy causality plane (CECP), an information-theory-based method that has been shown to be successful in the analysis of market efficiency and predictability. By estimating the information quantifiers permutation entropy and statistical complexity, they associate a position in the CECP with each commodity and compare their efficiency (lack of predictability) using the deviation from a random process. The coffee market shows the highest efficiency (lowest predictability), while the pork market shows the lowest efficiency (highest predictability). By analyzing the temporal evolution of commodities in the complexity–entropy causality plane, the authors observe that during the analyzed period the efficiency of the cotton, rice, and cattle markets increases; the soybean market shows a decrease in efficiency until 2012, followed by lower predictability and an increase in efficiency; and most commodities (8 out of 12) exhibit relatively stable efficiency, indicating increased market integration in the post-crisis period. (A CECP sketch appears after this list.)
- Information Theory for Non-Stationary Processes with Stationary Increments. By Granero-Belinchón, C.; Roux, S.; Garnier, N. [6] In the present contribution, the authors describe how to analyze the wide class of non-stationary processes with stationary centered increments using Shannon information theory. To do so, they take a practical viewpoint and define ersatz quantities from time-averaged probability distributions. These ersatz versions of entropy, mutual information, and entropy rate can be estimated when only a single realization of the process is available. They illustrate their approach extensively by analyzing Gaussian and non-Gaussian self-similar signals, as well as multifractal signals. Using Gaussian signals allows them to check that the approach is robust, in the sense that all quantities behave as expected from analytical derivations. Using the stationarity (independence of the integration time) of the ersatz entropy rate, they show that this quantity is not only able to finely probe the self-similarity of the process but also offers a new way to quantify its multifractality. (A toy ersatz-entropy computation appears after this list.)
- Higher-Order Cumulants Drive Neuronal Activity Patterns, Inducing UP-DOWN States in Neural Populations. By Baravalle, R.; Montani, F. [7] A major challenge in neuroscience is to understand the role of the higher-order correlation structure of neuronal populations. The dichotomized Gaussian (DG) model generates spike trains by thresholding a multivariate Gaussian random variable. The DG inputs are Gaussian distributed and thus have no interactions beyond second order; however, they can induce higher-order correlations in the outputs. The authors propose a combination of analytical and numerical techniques to estimate cumulants of the firing probability distributions above second order. Their findings show that a large amount of pairwise interaction in the inputs can drive the system into two possible regimes, one with low activity ("DOWN state") and another with high activity ("UP state"), and that the appearance of these states is due to a combination of the third- and fourth-order cumulants. This could be part of a mechanism that helps the neural code update specific information about the stimuli, motivating the authors to examine the behavior of the critical fluctuations through the Binder cumulant close to the critical point. Using the Binder cumulant, they show that higher-order correlations in the outputs generate a critical neural system that exhibits a second-order phase transition. (A minimal DG simulation appears after this list.)
- Direct and Indirect Effects—An Information Theoretic Perspective. By Schamberg, G.; Chapman, W.; Xie, S.; Coleman, T. [8] Information-theoretic (IT) approaches to quantifying causal influences have gained popularity in the literature, in both theoretical and applied (e.g., neuroscience and climate science) domains. While these causal measures are desirable in that they are model-agnostic and can capture nonlinear interactions, they differ fundamentally from common statistical notions of causal influence in that they (1) compare distributions over the effect rather than values of the effect and (2) are defined with respect to random variables representing a cause rather than specific values of a cause. The authors present IT measures of direct, indirect, and total causal effects. The proposed measures are unlike existing IT techniques in that they enable measuring causal effects defined with respect to specific values of a cause, while still offering the flexibility and general applicability of IT techniques. They provide an identifiability result and demonstrate the proposed measures by estimating the causal effect of the El Niño–Southern Oscillation on temperature anomalies in the North American Pacific Northwest. (A toy specific-value comparison appears after this list.)
- New Fast ApEn and SampEn Entropy Algorithms Implementation and Their Application to Supercomputer Power Consumption. By Tomčala, J. [9] Approximate entropy and, especially, sample entropy are nowadays frequently used algorithms for quantifying the complexity of a time series. A lesser-known fact is that accelerated modifications of these two algorithms also exist, namely fast approximate entropy and fast sample entropy. All of these algorithms are efficiently implemented in the R package TSEntropies. The paper contains not only an explanation of the algorithms but also the principle of their acceleration. Furthermore, it describes the functions of the package and their parameters, and gives simple examples of using the package to calculate these complexity measures for an artificial time series and for the time series of a complex real-world system, represented by the course of supercomputer infrastructure power consumption. These time series were also used to test the speed of the package and to compare it with the R package pracma. The results show that TSEntropies is up to 100 times faster than pracma, and that the computational times of the new fast approximate entropy and fast sample entropy algorithms are up to 500 times lower than those of their original versions. The paper closes by proposing possible uses of TSEntropies. (A plain SampEn reference implementation appears after this list.)
- Composite Multiscale Partial Cross-Sample Entropy Analysis for Quantifying Intrinsic Similarity of Two Time Series Affected by Common External Factors. By Li, B.; Han, G.; Jiang, S.; Yu, Z. [10] In this paper, the authors propose a new cross-sample entropy, the composite multiscale partial cross-sample entropy (CMPCSE), for quantifying the intrinsic similarity of two time series affected by common external factors. First, to test the validity of CMPCSE, they apply it to three sets of artificial data. Experimental results show that CMPCSE can accurately measure the intrinsic cross-sample entropy of two simultaneously recorded time series by removing the effects of a third time series. CMPCSE is then employed to investigate the partial cross-sample entropy of the Shanghai Securities Composite Index (SSEC) and the Shenzhen Stock Exchange Component Index (SZSE) after eliminating the effect of the Hang Seng Index (HSI). Compared with the composite multiscale cross-sample entropy, the results obtained by CMPCSE show that SSEC and SZSE have stronger similarity. The authors believe that CMPCSE is an effective tool for studying the intrinsic similarity of two time series. (A generic coarse-graining and partialling-out sketch appears after this list.)
- Effects of Tau and Sampling Frequency on the Regularity Analysis of ECG and EEG Signals Using ApEn and SampEn Entropy Estimators. By Espinosa, R.; Talero, J.; Weinstein, A. [11] Electrocardiography (ECG) and electroencephalography (EEG) signals provide clinical information relevant to determining a patient's health status. The nonlinear analysis of ECG and EEG signals allows for discovering characteristics that could not be found with traditional methods based on amplitude and frequency. Approximate entropy (ApEn) and sample entropy (SampEn) are nonlinear data analysis algorithms that measure the data's regularity, and these are used to classify electrophysiological signals as normal or pathological. Entropy calculation requires setting the parameters r (tolerance threshold), m (embedding dimension), and τ (time delay), with the last one determining how the time series is downsampled. In this study, the authors showed the dependence of ApEn and SampEn on different values of τ for ECG and EEG signals with different sampling frequencies (fs), extracted from a digital repository. They considered four values of fs (128, 256, 384, and 512 Hz for the ECG signals, and 160, 320, 480, and 640 Hz for the EEG signals) and five values of τ (from 1 to 5). They performed parametric and nonparametric statistical tests to confirm that the groups of normal and pathological ECG and EEG signals were significantly different (p < 0.05) for each fs and τ value. The separation between the entropy values of regular and irregular signals varied, demonstrating the dependence of ApEn and SampEn on fs and τ. For ECG signals, the separation between the conditions was more robust when using SampEn, the lowest value of fs, and τ larger than 1. For EEG signals, the separation was more robust when using SampEn with large values of fs and τ larger than 1. Therefore, adjusting τ may be convenient for signals acquired with different fs to ensure a reliable clinical classification; setting τ to values larger than 1 is also useful for reducing the computational cost. (A sketch of τ-delayed SampEn appears after this list.)
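The contribution in [1] builds on the message importance measure. As a rough illustration, the minimal sketch below assumes the definition used in the earlier MIM literature, L(p; ϖ) = log Σ_i p_i exp(ϖ(1 − p_i)); the importance coefficient values and toy distributions are illustrative choices, not taken from the paper.

```python
import numpy as np

def mim(p, w=1.0):
    """Message importance measure L(p; w) = log(sum_i p_i * exp(w * (1 - p_i))).

    Assumed form from the MIM literature; w (the importance coefficient)
    controls how strongly low-probability events are emphasized.
    """
    p = np.asarray(p, dtype=float)
    assert np.isclose(p.sum(), 1.0), "p must be a probability distribution"
    return float(np.log(np.sum(p * np.exp(w * (1.0 - p)))))

# Rare events dominate the measure as w grows: compare a uniform source
# with a skewed one containing low-probability ("important") events.
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.90, 0.07, 0.02, 0.01]
for w in (0.5, 1.0, 4.0):
    print(f"w={w}: uniform={mim(uniform, w):.4f}, skewed={mim(skewed, w):.4f}")
```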
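For [2], the core formal device is minimization of relative entropy subject to expected-value ("macroscopic") constraints, whose solution takes the exponential-family form p_i ∝ q_i exp(−λ f_i). The sketch below solves a one-constraint toy version; the interval feature, reference distribution, and target mean are hypothetical stand-ins for the paper's musically meaningful constraints.

```python
import numpy as np
from scipy.optimize import brentq

# Toy setting (hypothetical, not the paper's data): intervals of 0..12
# semitones, one macroscopic constraint -- the expected interval size.
sizes = np.arange(13, dtype=float)    # feature f(i): interval size in semitones
q = np.ones_like(sizes) / sizes.size  # uniform reference distribution
target_mean = 3.5                     # desired expected interval size

def model(lam):
    """Minimum-relative-entropy solution p_i proportional to q_i * exp(-lam * f_i)."""
    w = q * np.exp(-lam * sizes)
    return w / w.sum()

# Solve for the Lagrange multiplier that enforces the constraint.
lam = brentq(lambda l: model(l) @ sizes - target_mean, -5.0, 5.0)
p = model(lam)
print("lambda =", round(lam, 4), "| mean interval size =", round(p @ sizes, 4))
```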
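Contributions [3] and [4] both score dependence between attributes via conditional mutual information given the class, the edge weight used in TAN-style structure learning. Below is a minimal plug-in estimator from discrete samples, a sketch rather than the authors' implementation.

```python
import numpy as np
from collections import Counter

def conditional_mutual_information(x, y, c):
    """Plug-in estimate of I(X; Y | C) in bits from discrete samples.

    I(X;Y|C) = sum_{x,y,c} p(x,y,c) * log2( p(x,y,c) p(c) / (p(x,c) p(y,c)) ).
    """
    n = len(x)
    pxyc = Counter(zip(x, y, c))
    pxc = Counter(zip(x, c))
    pyc = Counter(zip(y, c))
    pc = Counter(c)
    cmi = 0.0
    for (xi, yi, ci), k in pxyc.items():
        pj = k / n
        cmi += pj * np.log2(pj * (pc[ci] / n) / ((pxc[(xi, ci)] / n) * (pyc[(yi, ci)] / n)))
    return cmi

rng = np.random.default_rng(0)
c = rng.integers(0, 2, 1000)
y = rng.integers(0, 2, 1000)
print(conditional_mutual_information(y.copy(), y, c))                  # ~1 bit (X = Y)
print(conditional_mutual_information(rng.integers(0, 2, 1000), y, c))  # ~0 (independent)
```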
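For [5], the coordinates of the complexity–entropy causality plane are the normalized Bandt–Pompe permutation entropy H and the Jensen–Shannon statistical complexity C = Q_J · H. The following sketch implements the standard definitions; the embedding order d and delay are arbitrary choices here, not those of the paper.

```python
import numpy as np
from itertools import permutations
from math import factorial, log

def ordinal_distribution(x, d=4, tau=1):
    """Bandt-Pompe distribution of ordinal patterns (order d, delay tau)."""
    counts = {p: 0 for p in permutations(range(d))}
    for i in range(len(x) - (d - 1) * tau):
        counts[tuple(np.argsort(x[i : i + d * tau : tau]))] += 1
    p = np.array(list(counts.values()), dtype=float)
    return p / p.sum()

def shannon(p):
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def cecp_point(x, d=4, tau=1):
    """(H, C): normalized permutation entropy and Jensen-Shannon statistical
    complexity, the coordinates of the complexity-entropy causality plane."""
    p = ordinal_distribution(x, d, tau)
    N = factorial(d)
    h = shannon(p) / log(N)          # normalized entropy H in [0, 1]
    u = np.full(N, 1.0 / N)          # uniform reference distribution
    js = shannon((p + u) / 2) - shannon(p) / 2 - shannon(u) / 2
    # Normalization so that the JS term lies in [0, 1].
    q0 = -2.0 / (((N + 1) / N) * log(N + 1) - 2 * log(2 * N) + log(N))
    return h, q0 * js * h            # C = Q_J * H with Q_J = q0 * js

rng = np.random.default_rng(1)
print("white noise:", cecp_point(rng.normal(size=20000)))        # H near 1, C near 0
print("sine wave :", cecp_point(np.sin(0.1 * np.arange(20000)))) # low H
```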
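The "ersatz" quantities of [6] are built from time-averaged probability distributions, which remain well defined for non-stationary processes with stationary increments. The sketch below illustrates only the simplest such object, a histogram plug-in entropy of pooled increments; the paper's actual estimators are more refined, so treat this as a conceptual toy.

```python
import numpy as np

def ersatz_increment_entropy(x, tau=1, bins=64):
    """Shannon entropy (nats) of the time-averaged distribution of increments
    x(t + tau) - x(t), via a histogram plug-in estimator. For a process with
    stationary increments this distribution is well defined even when x
    itself is non-stationary."""
    incr = x[tau:] - x[:-tau]
    counts, edges = np.histogram(incr, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]
    # Differential entropy estimate: discrete entropy + log(bin width).
    return float(-(p * np.log(p)).sum() + np.log(edges[1] - edges[0]))

# Toy check: a random walk is non-stationary, but its increments are
# stationary N(0, 1), whose differential entropy is 0.5 * log(2*pi*e).
rng = np.random.default_rng(2)
walk = np.cumsum(rng.normal(0.0, 1.0, 100000))
print("estimate :", ersatz_increment_entropy(walk, tau=1))
print("theory   :", 0.5 * np.log(2 * np.pi * np.e))  # ~1.419
```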
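The dichotomized Gaussian model of [7] generates spikes by thresholding a multivariate Gaussian, so purely pairwise input correlations can produce the bimodal UP/DOWN population statistics discussed in the paper. A minimal simulation follows; the population size, correlation level, and zero threshold are arbitrary illustrative choices.

```python
import numpy as np

def dichotomized_gaussian(mean, cov, n_samples, rng):
    """Binary spike patterns by thresholding a multivariate Gaussian at zero:
    unit i fires when z_i > 0. The latent inputs have no interactions beyond
    second order, yet the binary outputs can carry higher-order correlations."""
    z = rng.multivariate_normal(mean, cov, size=n_samples)
    return (z > 0).astype(int)

rng = np.random.default_rng(3)
n = 10                                  # toy population of 10 neurons
rho = 0.4                               # pairwise input correlation
cov = np.full((n, n), rho) + (1 - rho) * np.eye(n)
spikes = dichotomized_gaussian(np.zeros(n), cov, 50000, rng)

# Strong pairwise input correlations push the population count toward a
# bimodal distribution: mostly silent ("DOWN") and mostly active ("UP").
counts = spikes.sum(axis=1)
print(np.round(np.bincount(counts, minlength=n + 1) / counts.size, 3))
```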
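The measures in [8] compare distributions over the effect for specific values of a cause. The toy below conveys only that flavor: it contrasts P(Y | X = x) with the marginal P(Y) via a KL divergence in a made-up structural model where X is exogenous (so conditioning coincides with intervening). It is not the paper's direct/indirect decomposition.

```python
import numpy as np

def kl_bits(p, q):
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

# Hypothetical structural model (not the paper's): binary X -> M -> Y plus a
# direct edge X -> Y. X is exogenous, so P(Y | X = x) equals P(Y | do(X = x)).
rng = np.random.default_rng(4)
n = 200_000
x = rng.integers(0, 2, n)
m = x ^ (rng.random(n) < 0.2).astype(int)        # mediator: noisy copy of x
y = (m | x) ^ (rng.random(n) < 0.1).astype(int)  # direct + indirect paths

p_y = np.bincount(y, minlength=2) / n
for xv in (0, 1):
    sel = x == xv
    p_y_x = np.bincount(y[sel], minlength=2) / sel.sum()
    print(f"KL( P(Y|X={xv}) || P(Y) ) = {kl_bits(p_y_x, p_y):.4f} bits")
```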
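The package described in [9] is written in R and includes accelerated variants whose internals are not reproduced here; the sketch below is instead a plain reference implementation of standard sample entropy (Richman–Moorman), the quantity those fast algorithms speed up.

```python
import numpy as np

def sample_entropy(x, m=2, r=None):
    """Standard SampEn = -ln(A / B), where B counts pairs of m-length templates
    within Chebyshev distance r and A does the same for length m + 1;
    self-matches are excluded. r defaults to 0.2 * std(x)."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * x.std()
    def count_pairs(mm):
        # Use the same number of templates (len(x) - m) for both lengths.
        t = np.lib.stride_tricks.sliding_window_view(x, mm)[: len(x) - m]
        c = 0
        for i in range(len(t) - 1):
            c += int(np.sum(np.max(np.abs(t[i + 1 :] - t[i]), axis=1) <= r))
        return c
    return -np.log(count_pairs(m + 1) / count_pairs(m))

rng = np.random.default_rng(5)
print("white noise:", sample_entropy(rng.normal(size=2000)))         # high (irregular)
print("sine wave :", sample_entropy(np.sin(0.05 * np.arange(2000)))) # low (regular)
```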
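For [10], two ingredients of CMPCSE can be sketched generically: composite multiscale coarse-graining (averaging a statistic over all coarse-graining offsets) and "partialling out" a third series. In the sketch below, the partial step uses simple least-squares residuals and plain correlation stands in for cross-sample entropy to keep things short; both are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def coarse_grain(x, scale, offset=0):
    """Non-overlapping mean coarse-graining starting at `offset`; the composite
    multiscale scheme averages a statistic over offsets 0..scale-1 instead of
    using only offset 0."""
    x = x[offset:]
    n = len(x) // scale
    return x[: n * scale].reshape(n, scale).mean(axis=1)

def partial_out(x, z):
    """Remove the linear effect of a third series z from x by least-squares
    residuals (an assumed, simplified stand-in for the 'partial' step)."""
    A = np.column_stack([np.ones_like(z), z])
    beta, *_ = np.linalg.lstsq(A, x, rcond=None)
    return x - A @ beta

def composite_multiscale(stat, x, y, scale):
    """Average a bivariate statistic over all coarse-graining offsets."""
    return float(np.mean([stat(coarse_grain(x, scale, k), coarse_grain(y, scale, k))
                          for k in range(scale)]))

# Toy use: two series driven by a common external factor z plus intrinsic noise.
rng = np.random.default_rng(6)
z = rng.normal(size=6000)
x = 0.8 * z + rng.normal(scale=0.5, size=6000)
y = 0.8 * z + rng.normal(scale=0.5, size=6000)
corr = lambda a, b: np.corrcoef(a, b)[0, 1]   # placeholder for cross-sample entropy
print("raw similarity at scale 3 :", composite_multiscale(corr, x, y, 3))
print("after partialling out z   :",
      composite_multiscale(corr, partial_out(x, z), partial_out(y, z), 3))
```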
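Finally, for [11], the delay τ enters sample entropy through the templates (x_i, x_{i+τ}, …, x_{i+(m−1)τ}), which is essentially equivalent to analysing a downsampled signal; this is the mechanism behind the suggestion to adjust τ across sampling frequencies. A sketch, with the signal and parameters invented for illustration:

```python
import numpy as np

def sampen_tau(x, m=2, r=None, tau=1):
    """SampEn with delay tau: templates are (x_i, x_{i+tau}, ..., x_{i+(m-1)tau})."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * x.std()
    def count_pairs(mm):
        n = len(x) - (m + 1) * tau   # same template count for both lengths
        t = np.stack([x[k * tau : k * tau + n] for k in range(mm)], axis=1)
        c = 0
        for i in range(n - 1):
            c += int(np.sum(np.max(np.abs(t[i + 1 :] - t[i]), axis=1) <= r))
        return c
    return -np.log(count_pairs(m + 1) / count_pairs(m))

# A signal analysed at "512 Hz" with tau = 2 should give a SampEn close to the
# same signal downsampled to "256 Hz" and analysed with tau = 1 -- the
# compensation across sampling frequencies suggested by the study.
rng = np.random.default_rng(7)
t = np.arange(0, 20, 1 / 512)
sig = np.sin(2 * np.pi * 3 * t) + 0.1 * rng.normal(size=t.size)
print("fs=512, tau=2:", sampen_tau(sig, tau=2))
print("fs=256, tau=1:", sampen_tau(sig[::2], tau=1))
```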
Acknowledgments
Conflicts of Interest
References
1. She, R.; Liu, S.; Fan, P. Attention to the Variation of Probabilistic Events: Information Processing with Message Importance Measure. Entropy 2019, 21, 429.
2. Useche, J.; Hurtado, R. Melodies as Maximally Disordered Systems under Macroscopic Constraints with Musical Meaning. Entropy 2019, 21, 532.
3. Duan, Z.; Wang, L.; Mammadov, M.; Lou, H.; Sun, M. Discriminatory Target Learning: Mining Significant Dependence Relationships from Labeled and Unlabeled Data. Entropy 2019, 21, 537.
4. Long, Y.; Wang, L.; Sun, M. Structure Extension of Tree-Augmented Naive Bayes. Entropy 2019, 21, 721.
5. de Araujo, F.; Bejan, L.; Rosso, O.A.; Stosic, T. Permutation Entropy and Statistical Complexity Analysis of Brazilian Agricultural Commodities. Entropy 2019, 21, 1220.
6. Granero-Belinchón, C.; Roux, S.; Garnier, N. Information Theory for Non-Stationary Processes with Stationary Increments. Entropy 2019, 21, 1223.
7. Baravalle, R.; Montani, F. Higher-Order Cumulants Drive Neuronal Activity Patterns, Inducing UP-DOWN States in Neural Populations. Entropy 2020, 22, 477.
8. Schamberg, G.; Chapman, W.; Xie, S.; Coleman, T. Direct and Indirect Effects—An Information Theoretic Perspective. Entropy 2020, 22, 854.
9. Tomčala, J. New Fast ApEn and SampEn Entropy Algorithms Implementation and Their Application to Supercomputer Power Consumption. Entropy 2020, 22, 863.
10. Li, B.; Han, G.; Jiang, S.; Yu, Z. Composite Multiscale Partial Cross-Sample Entropy Analysis for Quantifying Intrinsic Similarity of Two Time Series Affected by Common External Factors. Entropy 2020, 22, 1003.
11. Espinosa, R.; Talero, J.; Weinstein, A. Effects of Tau and Sampling Frequency on the Regularity Analysis of ECG and EEG Signals Using ApEn and SampEn Entropy Estimators. Entropy 2020, 22, 1298.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).