Special Issue "Information-Theoretic Methods for Deep Learning Based Data Acquisition, Analysis and Security"

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Information Theory, Probability and Statistics".

Deadline for manuscript submissions: closed (15 October 2020) | Viewed by 7972

Special Issue Editors

Prof. Dr. Slava Voloshynovskiy
Guest Editor
Stochastic Information Processing Group, Department of Computer Science, Faculty of Science, University of Geneva, Geneva, Switzerland
Interests: information-theoretic machine learning; image processing; security and privacy
Dr. Benedetta Tondi
Guest Editor
Visual Information Processing and Protection Group, Department of Information Engineering and Mathematics, University of Siena, Siena, Italy
Interests: adversarial machine learning; information theory; adversarial signal processing; multimedia forensics

Special Issue Information

Dear Colleagues,

The recent advancement of machine learning techniques in general, and deep learning (DL) in particular, calls for a careful rethinking of many traditional approaches to data acquisition, analysis, processing, and security. At the same time, deep learning, as a glorified signal-processing tool, lacks a solid information-theoretic basis and strong connections with the fundamental information-theoretic results in channel and source coding, hypothesis testing, estimation, and security. The goal of this Special Issue is to link deep learning techniques with information theory and thus create a basis for theoretically explainable machine learning and interpretable deep learning solutions.

We would like to welcome original works addressing the following:

  • New methods for data acquisition, including sampling, in which the acquisition/imaging operator is optimized on the statistics of training data together with deep reconstruction algorithms (e.g., DL for learning optimized generative methods, or DL-based adaptation of sensor planning for high-resolution image acquisition and reconstruction in sensor networks). Of special interest are new approaches establishing fundamental links between information-theoretic limits and the characteristics of imaging systems coupled with reconstruction algorithms, considered as optimized encoder–decoder pairs. Contributions are welcome across a broad spectrum of problems, such as restoration from blurred images and videos, reconstruction from sparsely sampled data in transform domains, demosaicing, super-resolution, and inpainting;
  • New methods of data analysis bridging information-theoretic methods with deep learning systems and promoting explainable machine learning, especially in unsupervised settings; this concerns the information-theoretic analysis of learned features and filters. One of the main goals is to better characterize the impact of the training data and clearly understand which portion of training data contributes to the success of the derived solutions. Possible topics include the exploration of the link between DL and variational inference, the information bottleneck paradigm, and anomaly detection;
  • New approaches to secure machine learning that reduce vulnerability to adversarial attacks, thus bridging information-theoretic and cryptographic principles with machine learning; in this way, improved security can be characterized in terms of an ‘information’ advantage of the defender over the attacker. Particular interest is also devoted to the development of information-theoretic methods for measuring the vulnerability of systems to attacks;
  • New techniques for privacy-preserving learning that extend the limits of current centralized architectures (based on a single classifier having access to the training data from all classes) to distributed and federated architectures with partial and limited access to training data.

This Special Issue should serve as a platform for multi-disciplinary researchers interested in sharing their results with other communities using similar techniques. All submitted manuscripts will be subject to peer review, and accepted papers will be available via open access. We welcome the submission of extended conference papers with a clear justification of all extensions with respect to previously published works.

Prof. Dr. Slava Voloshynovskiy
Dr. Benedetta Tondi
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • information theory
  • deep learning
  • data acquisition and reconstruction
  • data analysis
  • data estimation and inference
  • classification
  • security and privacy

Published Papers (6 papers)


Research

Article
Adam and the Ants: On the Influence of the Optimization Algorithm on the Detectability of DNN Watermarks
Entropy 2020, 22(12), 1379; https://doi.org/10.3390/e22121379 - 06 Dec 2020
Cited by 7 | Viewed by 1125
Abstract
As training Deep Neural Networks (DNNs) becomes more expensive, interest in protecting the ownership of models with watermarking techniques increases. Uchida et al. proposed a digital watermarking algorithm that embeds a secret message into the model coefficients. However, despite its appeal, in this paper we show that its efficacy can be compromised by the optimization algorithm being used. In particular, we found through a theoretical analysis that, as opposed to Stochastic Gradient Descent (SGD), the update direction given by Adam optimization strongly depends on the sign of a combination of columns of the projection matrix used for watermarking. Consequently, as observed in the empirical results, this makes the coefficients move in unison, giving rise to heavily spiked weight distributions that can be easily detected by adversaries. To solve this problem, we propose a new method, called Block-Orthonormal Projections (BOP), that allows watermarking to be combined with Adam optimization with only a minor impact on the detectability of the watermark and with increased robustness. Full article
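The Uchida-style white-box scheme discussed in this abstract can be illustrated with a minimal sketch: a secret random projection matrix maps the flattened layer weights to bit estimates, and a binary cross-entropy regularizer pushes those estimates towards the message. This is a hedged toy reconstruction (Gaussian projection matrix, regularizer optimized in isolation with plain gradient descent), not the paper's exact setup.

```python
# Toy sketch of an Uchida-style weight watermark: embed message bits b into
# weights w so that sigmoid(X @ w) ~ b, where X is a secret projection matrix.
import numpy as np

rng = np.random.default_rng(0)
n_bits, n_weights = 32, 256
X = rng.standard_normal((n_bits, n_weights))   # secret projection matrix
b = rng.integers(0, 2, size=n_bits)            # watermark message bits
w = rng.standard_normal(n_weights) * 0.01      # flattened layer weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def embedding_loss(w):
    """Binary cross-entropy pushing sigmoid(X @ w) towards the message bits."""
    p = sigmoid(X @ w)
    return -np.mean(b * np.log(p + 1e-12) + (1 - b) * np.log(1 - p + 1e-12))

def grad(w):
    """Gradient of the BCE regularizer with respect to the weights."""
    p = sigmoid(X @ w)
    return X.T @ (p - b) / n_bits

# In real training this regularizer is added to the task loss; here we
# optimize it alone with plain (SGD-like) gradient steps.
loss_before = embedding_loss(w)
for _ in range(500):
    w -= 0.1 * grad(w)
loss_after = embedding_loss(w)

# Detection: threshold the projections and compare with the message.
decoded = (sigmoid(X @ w) > 0.5).astype(int)
ber = np.mean(decoded != b)   # bit error rate of the extracted watermark
```

The paper's observation is about what happens when the update above is replaced by Adam: the sign-normalized steps couple the coefficients through the columns of X and distort the weight distribution.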

Article
Bottleneck Problems: An Information and Estimation-Theoretic View
Entropy 2020, 22(11), 1325; https://doi.org/10.3390/e22111325 - 20 Nov 2020
Cited by 4 | Viewed by 1089
Abstract
Information bottleneck (IB) and privacy funnel (PF) are two closely related optimization problems which have found applications in machine learning, design of privacy algorithms, capacity problems (e.g., Mrs. Gerber’s Lemma), and strong data processing inequalities, among others. In this work, we first investigate the functional properties of IB and PF through a unified theoretical framework. We then connect them to three information-theoretic coding problems, namely hypothesis testing against independence, noisy source coding, and dependence dilution. Leveraging these connections, we prove a new cardinality bound on the auxiliary variable in IB, making its computation more tractable for discrete random variables. In the second part, we introduce a general family of optimization problems, termed “bottleneck problems”, by replacing mutual information in IB and PF with other notions of mutual information, namely f-information and Arimoto’s mutual information. We then argue that, unlike IB and PF, these problems lead to easily interpretable guarantees in a variety of inference tasks with statistical constraints on accuracy and privacy. While the underlying optimization problems are non-convex, we develop a technique to evaluate bottleneck problems in closed form by equivalently expressing them in terms of the lower convex or upper concave envelopes of certain functions. By applying this technique to the binary case, we derive closed-form expressions for several bottleneck problems. Full article
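For reference, the standard textbook formulations of the two problems mentioned in this abstract, with the Markov chain $Y - X - T$ and the representation $T$ produced by a channel $P_{T\mid X}$, can be written as follows (these are the usual definitions from the literature, not the paper's exact notation):

```latex
% Information bottleneck: maximize relevance under a complexity budget R
\mathsf{IB}(R) \;=\; \max_{P_{T\mid X}\,:\; I(X;T)\,\le\, R} \; I(Y;T)

% Privacy funnel: minimize leakage about Y under a utility constraint r
\mathsf{PF}(r) \;=\; \min_{P_{T\mid X}\,:\; I(X;T)\,\ge\, r} \; I(Y;T)
```

The two problems share the same feasible set structure but optimize $I(Y;T)$ in opposite directions, which is why the paper can treat them in a unified framework.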

Article
Information Bottleneck Classification in Extremely Distributed Systems
Entropy 2020, 22(11), 1237; https://doi.org/10.3390/e22111237 - 30 Oct 2020
Cited by 1 | Viewed by 941
Abstract
We present a new decentralized classification system based on a distributed architecture. This system consists of distributed nodes, each possessing its own dataset and computing module, along with a centralized server, which provides probes for classification and aggregates the responses of the nodes for a final decision. Each node, with access to its own training dataset of a given class, is trained as an auto-encoder system consisting of a fixed data-independent encoder, a pre-trained quantizer, and a class-dependent decoder. Hence, these auto-encoders are highly dependent on the class probability distribution for which the reconstruction distortion is minimized. Conversely, when an encoding–quantizing–decoding node observes data from a different distribution, unseen at training, there is a mismatch, and the decoding is not optimal, leading to a significant increase in the reconstruction distortion. The final classification is performed at the centralized classifier, which votes for the class with the minimum reconstruction distortion. In addition to the system's applicability to applications facing big-data communication problems and/or requiring private classification, the above distributed scheme creates a theoretical bridge to the information bottleneck principle. The proposed system demonstrates very promising performance on basic datasets such as MNIST and FashionMNIST. Full article
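The minimum-reconstruction-distortion vote described in this abstract can be sketched with a toy linear stand-in: a fixed random encoder shared by all nodes and an affine least-squares decoder per class (the quantizer is omitted, and all names and dimensions here are illustrative assumptions, not the paper's architecture).

```python
# Toy sketch: classification by minimum reconstruction distortion.
# Each "node" fits a class-dependent decoder on its own class only; the
# "server" assigns a probe to the class whose decoder reconstructs it best.
import numpy as np

rng = np.random.default_rng(1)
d, k = 20, 5                      # data and latent-code dimensions
E = rng.standard_normal((k, d))   # fixed, data-independent encoder

# Two synthetic classes with well-separated means.
X0 = rng.standard_normal((200, d)) + 3.0
X1 = rng.standard_normal((200, d)) - 3.0

def fit_decoder(X):
    """Node-side training: affine least-squares decoder for one class."""
    Z = X @ E.T
    Za = np.hstack([Z, np.ones((len(Z), 1))])     # affine (bias) term
    D, *_ = np.linalg.lstsq(Za, X, rcond=None)
    return D

decoders = [fit_decoder(X0), fit_decoder(X1)]

def classify(x):
    """Server-side vote: class with the minimum reconstruction distortion."""
    za = np.append(E @ x, 1.0)
    errs = [np.linalg.norm(x - za @ D) for D in decoders]
    return int(np.argmin(errs))
```

A probe drawn near the class-0 mean is reconstructed far better by the class-0 decoder, because the class-1 decoder's affine fit extrapolates poorly outside its own distribution; this mismatch is exactly the signal the vote exploits.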

Article
Data-Dependent Conditional Priors for Unsupervised Learning of Multimodal Data
Entropy 2020, 22(8), 888; https://doi.org/10.3390/e22080888 - 13 Aug 2020
Cited by 1 | Viewed by 1449
Abstract
One of the major shortcomings of variational autoencoders is the inability to produce generations from the individual modalities of data originating from mixture distributions. This is primarily due to the use of a simple isotropic Gaussian as the prior for the latent code in the ancestral sampling procedure for data generation. In this paper, we propose a novel formulation of variational autoencoders, the conditional prior VAE (CP-VAE), with a two-level generative process for the observed data in which a continuous variable z and a discrete variable c are introduced in addition to the observed variables x. By learning data-dependent conditional priors, the new variational objective naturally encourages a better match between the posterior and prior conditionals, and the learning of latent categories encoding the major source of variation of the original data in an unsupervised manner. By sampling the continuous latent code from the data-dependent conditional priors, we are able to generate new samples from the individual mixture components corresponding to the multimodal structure of the original data. Moreover, we unify and analyse our objective under different independence assumptions for the joint distribution of the continuous and discrete latent variables. We provide an empirical evaluation on one synthetic dataset and three image datasets, FashionMNIST, MNIST, and Omniglot, illustrating the generative performance of our new model compared to multiple baselines. Full article
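The two-level ancestral sampling process this abstract describes can be summarized by a standard mixture-style factorization (our reading of the abstract, hedged rather than the paper's exact notation):

```latex
% sample a discrete category c, then a continuous code z, then the data x
c \sim p(c), \qquad
z \sim p_\theta(z \mid c), \qquad
x \sim p_\theta(x \mid z),
\quad\text{so that}\quad
p_\theta(x) \;=\; \sum_{c} p(c) \int p_\theta(z \mid c)\, p_\theta(x \mid z)\, \mathrm{d}z .
```

Replacing a single isotropic Gaussian prior over z with the learned conditionals $p_\theta(z \mid c)$ is what allows sampling from an individual mixture component by first fixing c.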

Article
A New Information-Theoretic Method for Advertisement Conversion Rate Prediction for Large-Scale Sparse Data Based on Deep Learning
Entropy 2020, 22(6), 643; https://doi.org/10.3390/e22060643 - 10 Jun 2020
Cited by 3 | Viewed by 1259
Abstract
With the development of online advertising technology, accurately targeted advertising based on user preferences is clearly more suitable both for the market and for users. The number of conversions can be increased by predicting the user's purchasing intention based on the advertising Conversion Rate (CVR). Given the high-dimensional and sparse characteristics of historical behavior sequences, this paper proposes an LSLM_LSTM model for advertising CVR prediction based on large-scale sparse data. This model aims at minimizing the loss, utilizing the Adaptive Moment Estimation (Adam) optimization algorithm to automatically mine the nonlinear patterns hidden in the data. Through experimental comparison with a variety of typical CVR prediction models, it is found that the proposed LSLM_LSTM model utilizes the time-series characteristics of user behavior sequences more effectively and mines the potential relationships hidden in the features, bringing higher accuracy and faster training compared to models that consider only low- or high-order features. Full article

Article
Are Classification Deep Neural Networks Good for Blind Image Watermarking?
Entropy 2020, 22(2), 198; https://doi.org/10.3390/e22020198 - 08 Feb 2020
Cited by 9 | Viewed by 1628
Abstract
Image watermarking is usually decomposed into three steps: (i) a feature vector is extracted from an image; (ii) it is modified to embed the watermark; and (iii) it is projected back into the image space while avoiding the creation of visual artefacts. The feature extraction is usually based on a classical image representation given by, for instance, the Discrete Wavelet Transform or the Discrete Cosine Transform. These transformations require very accurate synchronisation between embedding and detection, and usually rely on various registration mechanisms for that purpose. This paper investigates a new family of transformations based on Deep Neural Networks trained with supervision for a classification task. The motivation comes from the Computer Vision literature, which has demonstrated the robustness of these features against light geometric distortions; the adversarial-examples literature, in turn, provides means to implement the inverse transform needed in the third step mentioned above. As far as zero-bit watermarking is concerned, this paper shows that this approach is feasible, as it yields good quality of the watermarked images and an intrinsic robustness. We also test more advanced tools from Computer Vision, such as aggregation schemes with weak geometry and retraining with a dataset augmented with classical image-processing attacks. Full article
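A common zero-bit detector of the kind this abstract alludes to is a normalized-correlation (hypercone) test in the feature space. The sketch below is a hedged toy: the DNN feature extractor is replaced by plain vectors, and the carrier `s`, threshold `tau`, and embedding strength are illustrative assumptions.

```python
# Toy sketch of zero-bit watermark detection in a feature space: declare
# "watermarked" when the feature lies inside the hypercone of axis s.
import numpy as np

rng = np.random.default_rng(2)
dim = 512
s = rng.standard_normal(dim)
s /= np.linalg.norm(s)            # secret unit carrier direction

def detect(f, tau=0.2):
    """Normalized correlation |cos(angle(f, s))| thresholded at tau."""
    return abs(f @ s) / np.linalg.norm(f) > tau

f_plain = rng.standard_normal(dim)    # feature of an unmarked image
f_marked = f_plain + 8.0 * s          # embedding pushes the feature towards s
```

For random high-dimensional features, the cosine with a fixed direction concentrates near zero, so an unmarked feature falls outside the cone with overwhelming probability, while the embedded one falls inside; step (iii) of the pipeline then maps the modified feature back to a visually clean image.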
