
Information-Theoretic Data Mining

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Multidisciplinary Applications".

Deadline for manuscript submissions: closed (31 March 2022) | Viewed by 34372

Special Issue Editor


Prof. Boštjan Brumen
Guest Editor
Faculty of Electrical Engineering and Computer Science, University of Maribor, Koroska 46, 2000 Maribor, Slovenia
Interests: data mining; machine learning; decision trees; data engineering; big data; security; privacy

Special Issue Information

Dear Colleagues,

Predictions are difficult, especially those about the future. Data mining is the process of converting data to knowledge using methods at the crossroads of machine learning, artificial intelligence, statistics, and database systems. Data science and data engineering help create knowledge in various forms of models, which in turn are used to find anomalies, patterns, and correlations within large amounts of data, to help predict the future or explain the present.

Today, data mining is deployed at large scale to help solve business and societal problems. Information-theoretic data mining plays a special role owing to its solid foundations in information theory. The field is slowly maturing, with contributions from many areas of information and computer science and from related disciplines. This interdisciplinary character leads to different viewpoints, different implementations, and different approaches.

Therefore, contributions are solicited for this Special Issue on the many faces of information theory in data mining, presenting both theoretical and practical aspects of developing and using data mining approaches.

Prof. Boštjan Brumen
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • data mining
  • data engineering
  • big data
  • data science
  • machine learning
  • artificial intelligence
  • knowledge extraction
  • information theory

Published Papers (10 papers)


Research

15 pages, 1008 KiB  
Article
A Two-Parameter Fractional Tsallis Decision Tree
by Jazmín S. De la Cruz-García, Juan Bory-Reyes and Aldo Ramirez-Arellano
Entropy 2022, 24(5), 572; https://doi.org/10.3390/e24050572 - 19 Apr 2022
Cited by 6 | Viewed by 1877
Abstract
Decision trees are decision support data mining tools that create, as the name suggests, a tree-like model. The classical C4.5 decision tree, based on the Shannon entropy, is a simple algorithm that calculates the gain ratio and then splits the attributes based on this entropy measure. Tsallis and Rényi entropies (instead of Shannon) can be employed to generate a decision tree with better results. In practice, the entropic index parameter of these entropies is tuned to outperform the classical decision trees. However, this process is carried out by testing a range of values for a given database, which is time-consuming and unfeasible for massive data. This paper introduces a decision tree based on a two-parameter fractional Tsallis entropy. We propose a constructionist approach that represents databases as complex networks, which enables efficient computation of the parameters of this entropy using the box-covering algorithm and renormalization of the complex network. The experimental results support the conclusion that the two-parameter fractional Tsallis entropy is a more sensitive measure than its parametric Rényi, Tsallis, and Gini-index precedents for a decision tree classifier. Full article
(This article belongs to the Special Issue Information-Theoretic Data Mining)
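
The core idea above is to keep the C4.5-style split criterion but swap the entropy measure. A minimal sketch of that substitution is given below, using the standard one-parameter Tsallis entropy alongside Shannon entropy; the paper's two-parameter fractional Tsallis entropy and its box-covering/renormalization parameter estimation are not reproduced, and the function names and toy split are illustrative.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy (bits) of a probability vector."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def tsallis_entropy(p, q=1.5):
    """One-parameter Tsallis entropy S_q = (1 - sum p_i^q) / (q - 1).
    As q -> 1 it recovers the Shannon entropy (in nats)."""
    if np.isclose(q, 1.0):
        p = p[p > 0]
        return -np.sum(p * np.log(p))
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def split_gain(labels, mask, entropy=shannon_entropy):
    """Entropy gain of splitting `labels` with the boolean `mask`,
    i.e. the quantity a C4.5-style tree maximises over candidate splits."""
    def H(y):
        _, counts = np.unique(y, return_counts=True)
        return entropy(counts / counts.sum())
    n = len(labels)
    left, right = labels[mask], labels[~mask]
    return H(labels) - len(left) / n * H(left) - len(right) / n * H(right)

# toy usage: compare Shannon- and Tsallis-based gains for the same split
y = np.array([0, 0, 0, 1, 1, 1, 1, 1])
split = np.array([True, True, True, True, False, False, False, False])
print(split_gain(y, split),
      split_gain(y, split, entropy=lambda p: tsallis_entropy(p, q=1.5)))
```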

15 pages, 851 KiB  
Article
Language Representation Models: An Overview
by Thorben Schomacker and Marina Tropmann-Frick
Entropy 2021, 23(11), 1422; https://doi.org/10.3390/e23111422 - 28 Oct 2021
Cited by 10 | Viewed by 5457
Abstract
In the last few decades, text mining has been used to extract knowledge from free texts. Applying neural networks and deep learning to natural language processing (NLP) tasks has led to many accomplishments for real-world language problems over the years. The developments of the last five years have resulted in techniques that have allowed for the practical application of transfer learning in NLP. The advances in the field have been substantial, and the milestone of outperforming human baseline performance based on the general language understanding evaluation has been achieved. This paper implements a targeted literature review to outline, describe, explain, and put into context the crucial techniques that helped achieve this milestone. The research presented here is a targeted review of neural language models that present vital steps towards a general language representation model. Full article
(This article belongs to the Special Issue Information-Theoretic Data Mining)
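
As a rough companion to the overview above, the sketch below pulls contextual representations from a pretrained transformer with the Hugging Face transformers library and mean-pools them into sentence vectors; the checkpoint name and the pooling choice are familiar defaults chosen purely for illustration, not methods singled out by the paper.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Pretrained language representation model used as a frozen sentence encoder.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

sentences = ["Information theory meets data mining.",
             "Transfer learning reshaped natural language processing."]
batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    hidden = model(**batch).last_hidden_state        # (batch, tokens, hidden_size)

# mean-pool over real (non-padding) tokens to get one fixed-size vector per sentence
mask = batch["attention_mask"].unsqueeze(-1)
sentence_vectors = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(sentence_vectors.shape)                        # torch.Size([2, 768])
```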

16 pages, 2611 KiB  
Article
Information Bottleneck Theory Based Exploration of Cascade Learning
by Xin Du, Katayoun Farrahi and Mahesan Niranjan
Entropy 2021, 23(10), 1360; https://doi.org/10.3390/e23101360 - 18 Oct 2021
Cited by 1 | Viewed by 2596
Abstract
In solving challenging pattern recognition problems, deep neural networks have shown excellent performance by forming powerful mappings between inputs and targets, learning representations (features) and making subsequent predictions. A recent tool to help understand how representations are formed is based on observing the dynamics of learning on an information plane using mutual information, linking the input to the representation (I(X;T)) and the representation to the target (I(T;Y)). In this paper, we use an information theoretical approach to understand how Cascade Learning (CL), a method to train deep neural networks layer-by-layer, learns representations, as CL has shown comparable results while saving computation and memory costs. We observe that performance is not linked to information–compression, which differs from observation on End-to-End (E2E) learning. Additionally, CL can inherit information about targets, and gradually specialise extracted features layer-by-layer. We evaluate this effect by proposing an information transition ratio, I(T;Y)/I(X;T), and show that it can serve as a useful heuristic in setting the depth of a neural network that achieves satisfactory accuracy of classification. Full article
(This article belongs to the Special Issue Information-Theoretic Data Mining)
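
The proposed heuristic, the information transition ratio I(T;Y)/I(X;T), can be estimated with the usual binning approach to information-plane quantities. A minimal one-dimensional sketch is given below, assuming scalar summaries of the input and the representation and discrete class labels; the binning scheme and variable names are illustrative.

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def binned_mi(a, b, bins=30):
    """Plug-in mutual information (nats) between two scalar variables after equal-width binning."""
    a_d = np.digitize(a, np.histogram_bin_edges(a, bins=bins))
    b_d = np.digitize(b, np.histogram_bin_edges(b, bins=bins))
    return mutual_info_score(a_d, b_d)

def information_transition_ratio(x, t, y, bins=30):
    """I(T;Y) / I(X;T): how much of the representation's information is about the target.
    `y` is assumed to already be a vector of discrete class labels."""
    t_binned = np.digitize(t, np.histogram_bin_edges(t, bins=bins))
    i_xt = binned_mi(x, t, bins)
    i_ty = mutual_info_score(t_binned, y)
    return i_ty / i_xt if i_xt > 0 else float("inf")

# toy usage: a representation that keeps part of the input and predicts y imperfectly
rng = np.random.default_rng(0)
x = rng.normal(size=2000)
t = np.tanh(x) + 0.1 * rng.normal(size=2000)   # "layer activation" as a noisy function of x
y = (x + 0.5 * rng.normal(size=2000) > 0).astype(int)
print(information_transition_ratio(x, t, y))
```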

16 pages, 1577 KiB  
Article
Discriminable Multi-Label Attribute Selection for Pre-Course Student Performance Prediction
by Jie Yang, Shimin Hu, Qichao Wang and Simon Fong
Entropy 2021, 23(10), 1252; https://doi.org/10.3390/e23101252 - 26 Sep 2021
Cited by 3 | Viewed by 1842
Abstract
The university curriculum is a systematic and organic study complex with some immediate associated steps; the initial learning of each semester’s course is crucial, and significantly impacts the learning process of subsequent courses and further studies. However, the low teacher–student ratio makes it difficult for teachers to consistently follow up on the detail-oriented learning situation of individual students. The extant learning early warning system is committed to automatically detecting whether students have potential difficulties—or even the risk of failing, or non-pass reports—before starting the course. Previous related research has the following three problems: first of all, it mainly focused on e-learning platforms and relied on online activity data, which was not suitable for traditional teaching scenarios; secondly, most current methods can only proffer predictions when the course is in progress, or even approaching the end; thirdly, few studies have focused on the feature redundancy in these learning data. Aiming at the traditional classroom teaching scenario, this paper transforms the pre-class student performance prediction problem into a multi-label learning model, and uses the attribute reduction method to scientifically streamline the characteristic information of the courses taken and explore the important relationship between the characteristics of the previously learned courses and the attributes of the courses to be taken, in order to detect high-risk students in each course before the course begins. Extensive experiments were conducted on 10 real-world datasets, and the results proved that the proposed approach achieves better performance than most other advanced methods in multi-label classification evaluation metrics. Full article
(This article belongs to the Special Issue Information-Theoretic Data Mining)
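
As a loose illustration of the multi-label formulation described above (not the paper's attribute-reduction method), prior-course attributes can be ranked by their average mutual information with the per-course at-risk labels and a multi-label classifier trained on the retained attributes; the data and parameters below are synthetic placeholders.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.multioutput import MultiOutputClassifier
from sklearn.ensemble import RandomForestClassifier

def select_attributes(X, Y, k=10):
    """Keep the k attributes with the highest mutual information,
    averaged over all label columns (one column per upcoming course)."""
    scores = np.mean([mutual_info_classif(X, Y[:, j], random_state=0)
                      for j in range(Y.shape[1])], axis=0)
    return np.argsort(scores)[::-1][:k]

# synthetic stand-in: X = prior-course results, Y = binary at-risk labels for 4 upcoming courses
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
Y = (rng.random((200, 4)) < 0.3).astype(int)

keep = select_attributes(X, Y, k=10)
clf = MultiOutputClassifier(RandomForestClassifier(n_estimators=100, random_state=0))
clf.fit(X[:, keep], Y)
print(clf.predict(X[:5, keep]))
```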

17 pages, 4292 KiB  
Article
DFTSA-Net: Deep Feature Transfer-Based Stacked Autoencoder Network for DME Diagnosis
by Ghada Atteia, Nagwan Abdel Samee and Hassan Zohair Hassan
Entropy 2021, 23(10), 1251; https://doi.org/10.3390/e23101251 - 26 Sep 2021
Cited by 18 | Viewed by 2600
Abstract
Diabetic macular edema (DME) is the most common cause of irreversible vision loss in diabetes patients. Early diagnosis of DME is necessary for effective treatment of the disease. Visual detection of DME in retinal screening images by ophthalmologists is a time-consuming process. Recently, many computer-aided diagnosis systems have been developed to assist doctors by detecting DME automatically. In this paper, a new deep feature transfer-based stacked autoencoder neural network system is proposed for the automatic diagnosis of DME in fundus images. The proposed system integrates the power of pretrained convolutional neural networks as automatic feature extractors with the power of stacked autoencoders in feature selection and classification. Moreover, the system enables extracting a large set of features from a small input dataset using four standard pretrained deep networks: ResNet-50, SqueezeNet, Inception-v3, and GoogLeNet. The most informative features are then selected by a stacked autoencoder neural network. The stacked network is trained in a semi-supervised manner and is used for the classification of DME. It is found that the introduced system achieves a maximum classification accuracy of 96.8%, sensitivity of 97.5%, and specificity of 95.5%. The proposed system shows a superior performance over the original pretrained network classifiers and state-of-the-art findings. Full article
(This article belongs to the Special Issue Information-Theoretic Data Mining)
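
The pipeline described above combines a pretrained convolutional backbone as a feature extractor with a stacked autoencoder for selection and classification. The sketch below shows that wiring in PyTorch with ResNet-50 (one of the four backbones named in the abstract); the autoencoder sizes, the dummy inputs, and the training details are placeholders rather than the paper's configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

# Pretrained backbone used purely as a fixed feature extractor (weights download on first use).
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = nn.Identity()          # drop the ImageNet head -> 2048-d feature vectors
backbone.eval()

class StackedAutoencoderHead(nn.Module):
    """Toy stand-in for the stacked autoencoder: encode the deep features,
    reconstruct them, and classify from the bottleneck code."""
    def __init__(self, in_dim=2048, code_dim=64, n_classes=2):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU(),
                                     nn.Linear(512, code_dim), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(code_dim, 512), nn.ReLU(),
                                     nn.Linear(512, in_dim))
        self.classifier = nn.Linear(code_dim, n_classes)

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code), self.classifier(code)

# dummy batch standing in for preprocessed fundus images
with torch.no_grad():
    features = backbone(torch.randn(4, 3, 224, 224))

model = StackedAutoencoderHead()
reconstruction, logits = model(features)
print(features.shape, reconstruction.shape, logits.shape)
```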

21 pages, 349 KiB  
Article
Overview of Machine Learning Process Modelling
by Boštjan Brumen, Aleš Černezel and Leon Bošnjak
Entropy 2021, 23(9), 1123; https://doi.org/10.3390/e23091123 - 28 Aug 2021
Cited by 6 | Viewed by 2315
Abstract
Much research has been conducted in the area of machine learning algorithms; however, the question of a general description of an artificial learner’s (empirical) performance has mainly remained unanswered. A general, restrictions-free theory on its performance has not been developed yet. In this study, we investigate which function most appropriately describes learning curves produced by several machine learning algorithms, and how well these curves can predict the future performance of an algorithm. Decision trees, neural networks, Naïve Bayes, and Support Vector Machines were applied to 130 datasets from publicly available repositories. Three different functions (power, logarithmic, and exponential) were fit to the measured outputs. Using rigorous statistical methods and two measures for the goodness-of-fit, the power law model proved to be the most appropriate model for describing the learning curve produced by the algorithms in terms of goodness-of-fit and prediction capabilities. The presented study, first of its kind in scale and rigour, provides results (and methods) that can be used to assess the performance of novel or existing artificial learners and forecast their ‘capacity to learn’ based on the amount of available or desired data. Full article
(This article belongs to the Special Issue Information-Theoretic Data Mining)
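
The comparison described above amounts to fitting candidate functional forms to measured (training-set size, error) pairs and ranking them by goodness-of-fit. A minimal sketch with SciPy is shown below; the parameterisations and the synthetic curve are illustrative, not the exact forms or data used in the study.

```python
import numpy as np
from scipy.optimize import curve_fit

# candidate learning-curve shapes: error rate as a function of training-set size n
def power_law(n, a, b, c):   return a * np.power(n, -b) + c
def logarithmic(n, a, b):    return a - b * np.log(n)
def exponential(n, a, b, c): return a * np.exp(-b * n) + c

def fit_and_score(f, sizes, errors, p0):
    """Least-squares fit; returns (residual sum of squares, fitted parameters)."""
    try:
        params, _ = curve_fit(f, sizes, errors, p0=p0, maxfev=20000)
    except RuntimeError:             # fit failed to converge
        return np.inf, None
    residuals = errors - f(sizes, *params)
    return float(np.sum(residuals ** 2)), params

# toy learning curve: error decays roughly as a power law, with noise
rng = np.random.default_rng(0)
sizes = np.array([50, 100, 200, 400, 800, 1600, 3200], dtype=float)
errors = 0.9 * sizes ** -0.35 + 0.05 + 0.005 * rng.normal(size=sizes.size)

candidates = {"power": (power_law, (1.0, 0.3, 0.05)),
              "log": (logarithmic, (0.5, 0.05)),
              "exp": (exponential, (0.5, 0.01, 0.05))}
results = {name: fit_and_score(f, sizes, errors, p0) for name, (f, p0) in candidates.items()}
best = min(results, key=lambda name: results[name][0])
print(best, results[best][0])
```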

28 pages, 6834 KiB  
Article
Status Set Sequential Pattern Mining Considering Time Windows and Periodic Analysis of Patterns
by Shenghan Zhou, Houxiang Liu, Bang Chen, Wenkui Hou, Xinpeng Ji, Yue Zhang, Wenbing Chang and Yiyong Xiao
Entropy 2021, 23(6), 738; https://doi.org/10.3390/e23060738 - 11 Jun 2021
Cited by 2 | Viewed by 2172
Abstract
Traditional sequential pattern mining considers the whole time period and often ignores sequential patterns that only occur in local time windows, as well as their possible periodicity. Therefore, in order to overcome the limitations of traditional methods, this paper proposes status set sequential pattern mining with time windows (SSPMTW). In contrast to traditional methods, the item status is considered, and time windows, minimum confidence, minimum coverage, minimum factor set ratios and other constraints are added to mine more valuable rules in local time windows. The periodicity of these rules is also analyzed. Based on the proposed method, this paper improves the Apriori algorithm, proposes the TW-Apriori algorithm, and explains its basic idea. Then, the feasibility, validity and efficiency of the proposed method and algorithm are verified on small-scale and large-scale examples. In the large-scale numerical example, the influence of the various constraints on the mining results is analyzed. Finally, the results of SSPM and SSPMTW are compared, suggesting that SSPMTW can mine the rules that exist in local time windows and analyze their periodicity; this addresses SSPM's neglect of such rules and overcomes the limitations of traditional sequential pattern mining algorithms. In addition, the rules mined by SSPMTW reduce the entropy of the system. Full article
(This article belongs to the Special Issue Information-Theoretic Data Mining)
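
For context on the algorithmic family the abstract builds on, the sketch below implements plain Apriori frequent-itemset mining (grow candidates level by level, prune by minimum support). The paper's TW-Apriori additionally handles item status, time windows, and confidence/coverage constraints, none of which are reproduced here.

```python
def apriori(transactions, min_support):
    """Plain Apriori: grow candidate itemsets one level at a time and keep only
    those whose support (fraction of transactions containing them) reaches min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    frequent = {}
    level = {frozenset([item]) for t in transactions for item in t}
    level = {s for s in level if support(s) >= min_support}
    k = 1
    while level:
        frequent.update({s: support(s) for s in level})
        # join step: build size-(k+1) candidates from pairs of frequent size-k sets
        candidates = {a | b for a in level for b in level if len(a | b) == k + 1}
        level = {c for c in candidates if support(c) >= min_support}
        k += 1
    return frequent

# toy usage on a handful of transactions
data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
for itemset, sup in sorted(apriori(data, min_support=0.6).items(), key=lambda kv: -kv[1]):
    print(set(itemset), round(sup, 2))
```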

20 pages, 1732 KiB  
Article
Energy Based Logic Mining Analysis with Hopfield Neural Network for Recruitment Evaluation
by Siti Zulaikha Mohd Jamaludin, Mohd Shareduwan Mohd Kasihmuddin, Ahmad Izani Md Ismail, Mohd. Asyraf Mansor and Md Faisal Md Basir
Entropy 2021, 23(1), 40; https://doi.org/10.3390/e23010040 - 30 Dec 2020
Cited by 20 | Viewed by 2416
Abstract
An effective recruitment evaluation plays an important role in the success of companies, industries and institutions. In order to obtain insight into the relationship between factors contributing to systematic recruitment, an artificial neural network and logic mining approach can be adopted as a data extraction model. In this work, an energy-based k-satisfiability reverse analysis incorporating a Hopfield neural network is proposed to extract the relationship between the factors in an electronic (E) recruitment data set. The attributes of the E-recruitment data set are represented in the form of a k-satisfiability logical representation. We propose 2-satisfiability and 3-satisfiability logical representations, which are regarded as systematic logical representations. The E-recruitment data set is obtained from an insurance agency in Malaysia, with the aim of extracting the relationship of dominant attributes that contribute to positive recruitment among the potential candidates. Our approach is evaluated according to the correctness, robustness and accuracy of the induced logic obtained from the E-recruitment data. Experimental simulations with different numbers of neurons indicate the effectiveness and robustness of the energy-based k-satisfiability reverse analysis with a Hopfield neural network in extracting the dominant attributes toward positive recruitment at the insurance agency in Malaysia. Full article
(This article belongs to the Special Issue Information-Theoretic Data Mining)
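
The method above rests on standard discrete Hopfield dynamics: states relax toward minima of the network energy, and the induced logic is read off those minima. The generic energy function and asynchronous update are sketched below with a toy Hebbian example; the mapping of 2-satisfiability/3-satisfiability clauses onto the weights (the energy-based reverse analysis itself) is specific to the paper and not reproduced.

```python
import numpy as np

def hopfield_energy(w, theta, s):
    """Discrete Hopfield energy E = -1/2 s^T W s - theta^T s for bipolar states s in {-1, +1}."""
    return -0.5 * s @ w @ s - theta @ s

def hopfield_relax(w, theta, s, max_sweeps=100, seed=0):
    """Asynchronous updates s_i <- sign(W_i . s + theta_i) until the state stops changing."""
    rng = np.random.default_rng(seed)
    s = s.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in rng.permutation(len(s)):
            h = w[i] @ s + theta[i]
            new = 1 if h >= 0 else -1
            if new != s[i]:
                s[i], changed = new, True
        if not changed:
            break
    return s

# toy usage: Hebbian weights storing one bipolar pattern, then relaxing a noisy copy back to it
pattern = np.array([1, -1, 1, 1, -1, -1, 1, -1])
w = np.outer(pattern, pattern).astype(float)
np.fill_diagonal(w, 0.0)
theta = np.zeros(len(pattern))
noisy = pattern.copy(); noisy[:2] *= -1
recovered = hopfield_relax(w, theta, noisy)
print(np.array_equal(recovered, pattern), hopfield_energy(w, theta, recovered))
```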

25 pages, 6151 KiB  
Article
Amazon Employees Resources Access Data Extraction via Clonal Selection Algorithm and Logic Mining Approach
by Nur Ezlin Zamri, Mohd. Asyraf Mansor, Mohd Shareduwan Mohd Kasihmuddin, Alyaa Alway, Siti Zulaikha Mohd Jamaludin and Shehab Abdulhabib Alzaeemi
Entropy 2020, 22(6), 596; https://doi.org/10.3390/e22060596 - 27 May 2020
Cited by 40 | Viewed by 6284
Abstract
Amazon.com Inc. seeks alternative ways to improve its manual system for granting employees access to resources using data science. This work constructs a modified Artificial Neural Network (ANN) by incorporating a Discrete Hopfield Neural Network (DHNN) and a Clonal Selection Algorithm (CSA) with 3-Satisfiability (3-SAT) logic to initiate an Artificial Intelligence (AI) model that executes optimization tasks for industrial data. The selection of 3-SAT logic is vital in data mining to represent entries of the Amazon Employees Resources Access (AERA) data set via information theory. The proposed model employs CSA to improve the learning phase of the DHNN by capitalizing on CSA features such as hypermutation and the cloning process. The result is an alternative machine learning model that identifies factors that should be prioritized in the approval of employee resource applications. Subsequently, a reverse analysis method (SATRA) is integrated into the proposed model to extract the relationships among AERA entries based on the logical representation. The study is presented by implementing simulated, benchmark and AERA data sets with multiple performance evaluation metrics. Based on the findings, the proposed model outperformed the other existing methods in AERA data extraction. Full article
(This article belongs to the Special Issue Information-Theoretic Data Mining)
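
Clonal selection is the ingredient this work adds on top of the Hopfield/3-SAT machinery, so a generic clonal selection loop is sketched below (clone the fittest antibodies, hypermutate clones at a rank-dependent rate, reselect), applied to a toy clause-satisfaction fitness; the encoding, rates, and the tiny formula are illustrative rather than the paper's settings.

```python
import numpy as np

def clonal_selection(fitness, n_bits, pop_size=20, n_clones=5, generations=50, seed=0):
    """Minimal clonal selection: keep the better half of the population, clone each
    antibody, hypermutate clones (worse rank -> higher mutation rate), reselect."""
    rng = np.random.default_rng(seed)
    population = rng.integers(0, 2, size=(pop_size, n_bits))
    for _ in range(generations):
        scores = np.array([fitness(a) for a in population])
        elite = population[np.argsort(scores)[::-1][: pop_size // 2]]
        clones = []
        for rank, antibody in enumerate(elite):
            rate = 0.05 + 0.3 * (rank + 1) / len(elite)   # lower-ranked parents mutate more
            for _ in range(n_clones):
                mutant = antibody.copy()
                mutant[rng.random(n_bits) < rate] ^= 1
                clones.append(mutant)
        combined = np.vstack([elite, np.array(clones)])
        combined_scores = np.array([fitness(a) for a in combined])
        population = combined[np.argsort(combined_scores)[::-1][:pop_size]]
    scores = np.array([fitness(a) for a in population])
    return population[np.argmax(scores)]

# toy fitness: satisfied clauses of a small 3-SAT formula (literal k means variable |k|-1, sign = polarity)
clauses = [(1, -2, 3), (-1, 4, 6), (2, -3, -5), (-4, 5, -6)]
def n_satisfied(assign):
    value = lambda lit: assign[abs(lit) - 1] == (1 if lit > 0 else 0)
    return sum(any(value(lit) for lit in clause) for clause in clauses)

best = clonal_selection(n_satisfied, n_bits=6)
print(best, n_satisfied(best), "of", len(clauses), "clauses satisfied")
```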

26 pages, 3769 KiB  
Article
Machine Learning Based Automated Segmentation and Hybrid Feature Analysis for Diabetic Retinopathy Classification Using Fundus Image
by Aqib Ali, Salman Qadri, Wali Khan Mashwani, Wiyada Kumam, Poom Kumam, Samreen Naeem, Atila Goktas, Farrukh Jamal, Christophe Chesneau, Sania Anam and Muhammad Sulaiman
Entropy 2020, 22(5), 567; https://doi.org/10.3390/e22050567 - 19 May 2020
Cited by 44 | Viewed by 5431
Abstract
The object of this study was to demonstrate the ability of machine learning (ML) methods for the segmentation and classification of diabetic retinopathy (DR). Two-dimensional (2D) retinal fundus (RF) images were used. The datasets of DR—that is, the mild, moderate, non-proliferative, proliferative, and normal human eye ones—were acquired from 500 patients at Bahawal Victoria Hospital (BVH), Bahawalpur, Pakistan. Five hundred RF datasets (sized 256 × 256) for each DR stage and a total of 2500 (500 × 5) datasets of the five DR stages were acquired. This research introduces the novel clustering-based automated region growing framework. For texture analysis, four types of features—histogram (H), wavelet (W), co-occurrence matrix (COM) and run-length matrix (RLM)—were extracted, and various ML classifiers were employed, achieving 77.67%, 80%, 89.87%, and 96.33% classification accuracies, respectively. To improve classification accuracy, a fused hybrid-feature dataset was generated by applying the data fusion approach. From each image, 245 pieces of hybrid feature data (H, W, COM, and RLM) were observed, while 13 optimized features were selected after applying four different feature selection techniques, namely Fisher, correlation-based feature selection, mutual information, and probability of error plus average correlation. Five ML classifiers named sequential minimal optimization (SMO), logistic (Lg), multi-layer perceptron (MLP), logistic model tree (LMT), and simple logistic (SLg) were deployed on selected optimized features (using 10-fold cross-validation), and they showed considerably high classification accuracies of 98.53%, 99%, 99.66%, 99.73%, and 99.73%, respectively. Full article
(This article belongs to the Special Issue Information-Theoretic Data Mining)
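
To make the texture-feature side of the pipeline above concrete, the sketch below extracts a few co-occurrence-matrix (GLCM) descriptors with scikit-image and scores a classifier with 10-fold cross-validation; the images are random placeholders and the feature set is far smaller than the 245-feature hybrid set the authors describe.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def glcm_features(img):
    """A few co-occurrence-matrix texture descriptors for one 8-bit grayscale ROI."""
    glcm = graycomatrix(img, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    props = ("contrast", "homogeneity", "energy", "correlation")
    return np.concatenate([graycoprops(glcm, p).ravel() for p in props])

# random stand-ins for fundus ROIs (downsized to keep the toy fast) and two of the five DR stages
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(80, 64, 64), dtype=np.uint8)
labels = rng.integers(0, 2, size=80)

X = np.array([glcm_features(im) for im in images])
scores = cross_val_score(LogisticRegression(max_iter=2000), X, labels, cv=10)
print(scores.mean())
```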
