Special Issue "Entropy-based Data Mining"

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Information Theory".

Deadline for manuscript submissions: 31 January 2018

Dr. Massimiliano Zanin

Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Spain
Interests: complex systems; complex networks; network science; data mining

Dr. Ernestina Menasalvas

Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Spain
Interests: Big Data; Predictive Analytics; Data Mining; Data Stream Mining

Dear Colleagues,

Entropy and data mining are not as distant, as concepts, as they may initially appear. Both share a common idea: the information contained in data presents regularities, or structures, which we ought to identify in order to better understand the system under study. While entropy aims at assessing the presence of these structures, data mining goes one step further by extracting them and making them explicit for further use; it is clear, however, that the former is a first and necessary step for the latter.

Not surprisingly, entropy and data mining have had an intermingled history. Specifically, entropy has been used extensively to define and support data mining algorithms. Examples include the use of entropy metrics as splitting and pruning criteria in decision trees; as a means to weight distances in high-dimensional k-means clustering algorithms; to select feature subsets in classification ensembles; and as a criterion to combine multiple classifiers. Entropy has also buttressed the creation of data mining models, as in maximum entropy classifiers, implementations of the multinomial logistic regression concept, and in outlier detection. On the other hand, entropy has also been used as a way to create new features from data, in order to feed standard data mining algorithms. For instance, different types of entropies have been used to describe time series, e.g., to distinguish between normal and ictal brain dynamics, or to assess heart rate complexity; to describe symbolic sequences and then compare sets of them, as in DNA analysis and the identification of protein coding and non-coding sequences; or to assess the complexity of graphs and networks, in order to distinguish and classify them.
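As a purely illustrative aside on the first of these uses, the sketch below shows how Shannon entropy can act as a splitting criterion for a decision tree node, by scoring a candidate split through its information gain. The function names (`shannon_entropy`, `information_gain`) and the toy labels are assumptions made for this example only; they are not taken from any particular library or from the works this issue may cover.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy, in bits, of a sequence of discrete class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(labels, groups):
    """Entropy reduction obtained by partitioning `labels` into `groups`.

    `groups` is a list of label lists, one per branch of the candidate split.
    """
    n = len(labels)
    weighted = sum(len(g) / n * shannon_entropy(g) for g in groups)
    return shannon_entropy(labels) - weighted


# Hypothetical toy node with two balanced classes.
labels = ["yes", "yes", "no", "no"]

# A split that separates the classes perfectly removes all uncertainty.
perfect_split = [["yes", "yes"], ["no", "no"]]

# A split that leaves each branch as mixed as the parent removes none.
useless_split = [["yes", "no"], ["yes", "no"]]

gain_perfect = information_gain(labels, perfect_split)
gain_useless = information_gain(labels, useless_split)
print(gain_perfect, gain_useless)
```

Under these assumptions, a tree-growing procedure would prefer the candidate split with the larger gain; pruning criteria and other impurity measures follow the same pattern with a different scoring function.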

This Special Issue seeks contributions clarifying and strengthening the relationship between these two research fields, with a special focus on, but not limited to, the improvement of data mining algorithms through the entropy concept, and the application of entropy in real-world data mining tasks. We welcome theoretical as well as experimental works, including both original research and review papers.

Dr. Massimiliano Zanin
Dr. Ernestina Menasalvas

  • Data mining algorithms
  • Classification
  • Clustering
  • Feature selection
  • Time series analysis
  • Network entropy

