Representation Learning: Theory, Applications and Ethical Issues

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Information Theory, Probability and Statistics".

Deadline for manuscript submissions: closed (30 June 2021) | Viewed by 27114

Special Issue Editors

Guest Editor
Department of Mathematics, University of Padova, via Trieste 63, 35121 Padova, Italy
Interests: kernel methods; preference learning; recommender systems; multiple kernel learning; interpretable machine learning; deep neural networks

Guest Editor
Department of Mathematics, University of Padova, via Trieste 63, 35121 Padova, Italy
Interests: recommender systems; kernel methods; interpretable machine learning; security/privacy in machine learning; deep neural networks

Special Issue Information

Dear Colleagues,

The representation problem has always been at the core of machine learning. Finding a good data representation is the common denominator of many machine learning subtopics, such as feature selection, kernel learning, and deep learning. The recent rise of deep learning technologies has opened up new and fascinating possibilities for researchers in many fields. However, deep networks often fall short when it comes to being interpreted or explained. Hence, in addition to the effectiveness of a representation, many related problems need to be addressed, for example, interpretability, robustness, and fairness.

The purpose of this Special Issue is to highlight the state of the art in representation learning from both a theoretical and a practical perspective. Possible topics include but are not limited to the following:

  • Deep and shallow representation learning;
  • Generative and adversarial representation learning;
  • Robust representations for security;
  • Representation learning for fair and ethical learning;
  • Representation learning for interpretable machine learning;
  • Representation learning in other domains, e.g., recommender systems, natural language processing, cybersecurity, process mining.

Prof. Fabio Aiolli
Dr. Mirko Polato
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • representation learning
  • deep learning
  • kernel learning
  • interpretability
  • explainability
  • fairness

Published Papers (8 papers)

Research

14 pages, 1467 KiB  
Article
A Manifold Learning Perspective on Representation Learning: Learning Decoder and Representations without an Encoder
by Viktoria Schuster and Anders Krogh
Entropy 2021, 23(11), 1403; https://doi.org/10.3390/e23111403 - 25 Oct 2021
Cited by 2 | Viewed by 2430
Abstract
Autoencoders are commonly used in representation learning. They consist of an encoder and a decoder, which provide a straightforward method to map n-dimensional data in input space to a lower m-dimensional representation space and back. The decoder itself defines an m-dimensional manifold in input space. Inspired by manifold learning, we showed that the decoder can be trained on its own by learning the representations of the training samples along with the decoder weights using gradient descent. A sum-of-squares loss then corresponds to optimizing the manifold to have the smallest Euclidean distance to the training samples, and similarly for other loss functions. We derived expressions for the number of samples needed to specify the encoder and decoder and showed that the decoder generally requires much fewer training samples to be well-specified compared to the encoder. We discuss the training of autoencoders in this perspective and relate it to previous work in the field that uses noisy training examples and other types of regularization. On the natural image data sets MNIST and CIFAR10, we demonstrated that the decoder is much better suited to learn a low-dimensional representation, especially when trained on small data sets. Using simulated gene regulatory data, we further showed that the decoder alone leads to better generalization and meaningful representations. Our approach of training the decoder alone facilitates representation learning even on small data sets and can lead to improved training of autoencoders. We hope that the simple analyses presented will also contribute to an improved conceptual understanding of representation learning.
(This article belongs to the Special Issue Representation Learning: Theory, Applications and Ethical Issues)

21 pages, 1400 KiB  
Article
The Problem of Fairness in Synthetic Healthcare Data
by Karan Bhanot, Miao Qi, John S. Erickson, Isabelle Guyon and Kristin P. Bennett
Entropy 2021, 23(9), 1165; https://doi.org/10.3390/e23091165 - 4 Sep 2021
Cited by 32 | Viewed by 5378
Abstract
Access to healthcare data such as electronic health records (EHR) is often restricted by laws established to protect patient privacy. These restrictions hinder the reproducibility of existing results based on private healthcare data and also limit new research. Synthetically-generated healthcare data solve this problem by preserving privacy and enabling researchers and policymakers to drive decisions and methods based on realistic data. Healthcare data can include information about multiple in- and out- patient visits of patients, making it a time-series dataset which is often influenced by protected attributes like age, gender, race etc. The COVID-19 pandemic has exacerbated health inequities, with certain subgroups experiencing poorer outcomes and less access to healthcare. To combat these inequities, synthetic data must “fairly” represent diverse minority subgroups such that the conclusions drawn on synthetic data are correct and the results can be generalized to real data. In this article, we develop two fairness metrics for synthetic data, and analyze all subgroups defined by protected attributes to analyze the bias in three published synthetic research datasets. These covariate-level disparity metrics revealed that synthetic data may not be representative at the univariate and multivariate subgroup-levels and thus, fairness should be addressed when developing data generation methods. We discuss the need for measuring fairness in synthetic healthcare data to enable the development of robust machine learning models to create more equitable synthetic healthcare datasets.
(This article belongs to the Special Issue Representation Learning: Theory, Applications and Ethical Issues)

21 pages, 1809 KiB  
Article
Occlusion-Based Explanations in Deep Recurrent Models for Biomedical Signals
by Michele Resta, Anna Monreale and Davide Bacciu
Entropy 2021, 23(8), 1064; https://doi.org/10.3390/e23081064 - 17 Aug 2021
Cited by 10 | Viewed by 3533
Abstract
The biomedical field is characterized by an ever-increasing production of sequential data, which often come in the form of biosignals capturing the time-evolution of physiological processes, such as blood pressure and brain activity. This has motivated a large body of research dealing with the development of machine learning techniques for the predictive analysis of such biosignals. Unfortunately, in high-stakes decision making, such as clinical diagnosis, the opacity of machine learning models becomes a crucial aspect to be addressed in order to increase the trust and adoption of AI technology. In this paper, we propose a model agnostic explanation method, based on occlusion, that enables the learning of the input’s influence on the model predictions. We specifically target problems involving the predictive analysis of time-series data and the models that are typically used to deal with data of such nature, i.e., recurrent neural networks. Our approach is able to provide two different kinds of explanations: one suitable for technical experts, who need to verify the quality and correctness of machine learning models, and one suited to physicians, who need to understand the rationale underlying the prediction to make aware decisions. A wide experimentation on different physiological data demonstrates the effectiveness of our approach both in classification and regression tasks.
(This article belongs to the Special Issue Representation Learning: Theory, Applications and Ethical Issues)

26 pages, 5039 KiB  
Article
Toward Learning Trustworthily from Data Combining Privacy, Fairness, and Explainability: An Application to Face Recognition
by Danilo Franco, Luca Oneto, Nicolò Navarin and Davide Anguita
Entropy 2021, 23(8), 1047; https://doi.org/10.3390/e23081047 - 14 Aug 2021
Cited by 10 | Viewed by 3371
Abstract
In many decision-making scenarios, ranging from recreational activities to healthcare and policing, the use of artificial intelligence coupled with the ability to learn from historical data is becoming ubiquitous. This widespread adoption of automated systems is accompanied by the increasing concerns regarding their ethical implications. Fundamental rights, such as the ones that require the preservation of privacy, do not discriminate based on sensible attributes (e.g., gender, ethnicity, political/sexual orientation), or require one to provide an explanation for a decision, are daily undermined by the use of increasingly complex and less understandable yet more accurate learning algorithms. For this purpose, in this work, we work toward the development of systems able to ensure trustworthiness by delivering privacy, fairness, and explainability by design. In particular, we show that it is possible to simultaneously learn from data while preserving the privacy of the individuals thanks to the use of Homomorphic Encryption, ensuring fairness by learning a fair representation from the data, and ensuring explainable decisions with local and global explanations without compromising the accuracy of the final models. We test our approach on a widespread but still controversial application, namely face recognition, using the recent FairFace dataset to prove the validity of our approach.
(This article belongs to the Special Issue Representation Learning: Theory, Applications and Ethical Issues)

18 pages, 492 KiB  
Article
Propositional Kernels
by Mirko Polato and Fabio Aiolli
Entropy 2021, 23(8), 1020; https://doi.org/10.3390/e23081020 - 7 Aug 2021
Viewed by 1539
Abstract
The pervasive presence of artificial intelligence (AI) in our everyday life has nourished the pursuit of explainable AI. Since the dawn of AI, logic has been widely used to express, in a human-friendly fashion, the internal process that led an (intelligent) system to deliver a specific output. In this paper, we take a step forward in this direction by introducing a novel family of kernels, called Propositional kernels, that construct feature spaces that are easy to interpret. Specifically, Propositional Kernel functions compute the similarity between two binary vectors in a feature space composed of logical propositions of a fixed form. The Propositional kernel framework improves upon the recent Boolean kernel framework by providing more expressive kernels. In addition to the theoretical definitions, we also provide an algorithm (and the source code) to efficiently construct any propositional kernel. An extensive empirical evaluation shows the effectiveness of Propositional kernels on several artificial and benchmark categorical data sets.
(This article belongs to the Special Issue Representation Learning: Theory, Applications and Ethical Issues)

22 pages, 1936 KiB  
Article
Feature Selection for Recommender Systems with Quantum Computing
by Riccardo Nembrini, Maurizio Ferrari Dacrema and Paolo Cremonesi
Entropy 2021, 23(8), 970; https://doi.org/10.3390/e23080970 - 28 Jul 2021
Cited by 19 | Viewed by 4937
Abstract
The promise of quantum computing to open new unexplored possibilities in several scientific fields has been long discussed, but until recently the lack of a functional quantum computer has confined this discussion mostly to theoretical algorithmic papers. It was only in the last few years that small but functional quantum computers have become available to the broader research community. One paradigm in particular, quantum annealing, can be used to sample optimal solutions for a number of NP-hard optimization problems represented with classical operations research tools, providing an easy access to the potential of this emerging technology. One of the tasks that most naturally fits in this mathematical formulation is feature selection. In this paper, we investigate how to design a hybrid feature selection algorithm for recommender systems that leverages the domain knowledge and behavior hidden in the user interactions data. We represent the feature selection as an optimization problem and solve it on a real quantum computer, provided by D-Wave. The results indicate that the proposed approach is effective in selecting a limited set of important features and that quantum computers are becoming powerful enough to enter the wider realm of applied science.
(This article belongs to the Special Issue Representation Learning: Theory, Applications and Ethical Issues)

18 pages, 5333 KiB  
Article
Learning Ordinal Embedding from Sets
by Aïssatou Diallo and Johannes Fürnkranz
Entropy 2021, 23(8), 964; https://doi.org/10.3390/e23080964 - 27 Jul 2021
Viewed by 1705
Abstract
Ordinal embedding is the task of computing a meaningful multidimensional representation of objects, for which only qualitative constraints on their distance functions are known. In particular, we consider comparisons of the form “Which object from the pair (j,k) is more similar to object i?”. In this paper, we generalize this framework to the case where the ordinal constraints are not given at the level of individual points, but at the level of sets, and propose a distributional triplet embedding approach in a scalable learning framework. We show that the query complexity of our approach is on par with the single-item approach. Without having access to features of the items to be embedded, we show the applicability of our model on toy datasets for the task of reconstruction and demonstrate the validity of the obtained embeddings in experiments on synthetic and real-world datasets.
(This article belongs to the Special Issue Representation Learning: Theory, Applications and Ethical Issues)

14 pages, 1627 KiB  
Article
Learning Numerosity Representations with Transformers: Number Generation Tasks and Out-of-Distribution Generalization
by Tommaso Boccato, Alberto Testolin and Marco Zorzi
Entropy 2021, 23(7), 857; https://doi.org/10.3390/e23070857 - 3 Jul 2021
Cited by 2 | Viewed by 2276
Abstract
One of the most rapidly advancing areas of deep learning research aims at creating models that learn to disentangle the latent factors of variation from a data distribution. However, modeling joint probability mass functions is usually prohibitive, which motivates the use of conditional models assuming that some information is given as input. In the domain of numerical cognition, deep learning architectures have successfully demonstrated that approximate numerosity representations can emerge in multi-layer networks that build latent representations of a set of images with a varying number of items. However, existing models have focused on tasks requiring to conditionally estimate numerosity information from a given image. Here, we focus on a set of much more challenging tasks, which require to conditionally generate synthetic images containing a given number of items. We show that attention-based architectures operating at the pixel level can learn to produce well-formed images approximately containing a specific number of items, even when the target numerosity was not present in the training distribution.
(This article belongs to the Special Issue Representation Learning: Theory, Applications and Ethical Issues)