Information Theory-Based Deep Learning Tools for Computer Vision

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Information Theory, Probability and Statistics".

Deadline for manuscript submissions: closed (16 September 2022) | Viewed by 21453

Special Issue Editors


Guest Editor

Guest Editor
Department of Mathematics, University of Jaén, 23071 Jaén, Spain
Interests: mathematics and computational modeling; integrated information theory; mathematical models

Special Issue Information

Dear Colleagues,

Artificial intelligence (AI) is a cross-disciplinary field of research that is generally concerned with developing and investigating systems that operate or act intelligently. In 1948, Claude Shannon, a mathematician and pioneer of AI, laid the foundations of information theory (IT), and researchers in both IT and AI have benefited from it ever since.

Deep learning (DL) is a subset of AI concerned with algorithms inspired by the structure and function of the brain. DL is enabling many new applications across broad areas of science, particularly in computer vision (CV), and the number of such applications has grown in recent years. In conventional applications of DL, a chosen algorithm learns from the data and identifies hidden patterns during training. The extracted information is then used for many purposes, e.g., classification.

Therefore, the goal of this Special Issue is to bring together the IT, DL, and CV communities, providing a forum for researchers and practitioners in this rapidly developing field to share novel and original research on the topics addressed by this Special Issue. Survey papers on relevant topics are also welcome.

Topics of interest include, but are not limited to, the following:

  • Deep adversarial learning for CV
  • Medical imaging
  • IT principles in DL applied to CV, especially deep neural networks (DNN)
  • Image registration
  • Image segmentation
  • Nature-inspired algorithms for DL and CV
  • Theoretical analysis of DL models for DL and CV
  • Applications of DNN to CV based on IT principles

Prof. Dr. Jose Santamaria Lopez
Prof. Dr. Francisco Roca
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (5 papers)

Research

14 pages, 3148 KiB  
Article
Point Cloud Geometry Compression Based on Multi-Layer Residual Structure
by Jiawen Yu, Jin Wang, Longhua Sun, Mu-En Wu and Qing Zhu
Entropy 2022, 24(11), 1677; https://doi.org/10.3390/e24111677 - 17 Nov 2022
Cited by 2 | Viewed by 1835
Abstract
Point cloud data are extensively used in various applications, such as autonomous driving and augmented reality, since they can provide both detailed and realistic depictions of 3D scenes or objects. Meanwhile, 3D point clouds generally occupy a large amount of storage space, which is a big burden for efficient communication. However, it is difficult to efficiently compress such sparse, disordered, non-uniform and high-dimensional data. Therefore, this work proposes a novel deep-learning framework for point cloud geometric compression based on an autoencoder architecture. Specifically, a multi-layer residual module is designed on a sparse convolution-based autoencoder that progressively down-samples the input point clouds and reconstructs the point clouds in a hierarchical way. It effectively constrains the accuracy of the sampling process at the encoder side, which significantly preserves the feature information with a decrease in the data volume. Compared with the state-of-the-art geometry-based point cloud compression (G-PCC) schemes, our approach obtains more than 70–90% BD-Rate gain on an object point cloud dataset and achieves a better point cloud reconstruction quality. Additionally, compared to the state-of-the-art PCGCv2, we achieve an average gain of about 10% in BD-Rate.
(This article belongs to the Special Issue Information Theory-Based Deep Learning Tools for Computer Vision)

16 pages, 1136 KiB  
Article
Image Clustering Algorithm Based on Predefined Evenly-Distributed Class Centroids and Composite Cosine Distance
by Qiuyu Zhu, Liheng Hu and Rui Wang
Entropy 2022, 24(11), 1533; https://doi.org/10.3390/e24111533 - 26 Oct 2022
Viewed by 1265
Abstract
Clustering algorithms based on deep neural networks perform clustering by obtaining the optimal feature representation. However, in the face of complex natural images, the clustering accuracy of existing algorithms is still relatively low. This paper presents an image clustering algorithm based on predefined evenly-distributed class centroids (PEDCC) and composite cosine distance. In contrast to the currently popular auto-encoder structure, we design an encoder-only network structure with normalized latent features, and two effective loss functions in latent feature space by replacing the Euclidean distance with a composite cosine distance. We find that (1) contrastive learning plays a key role in the clustering algorithm and greatly improves the quality of the learned latent features; (2) compared with the Euclidean distance, the composite cosine distance can be more suitable for the normalized latent features and the PEDCC-based Maximum Mean Discrepancy (MMD) loss function; and (3) for complex natural images, a self-supervised pretrained model can be used to effectively improve clustering performance. Several experiments have been carried out on six common data sets: MNIST, Fashion-MNIST, COIL20, CIFAR-10, STL-10 and ImageNet-10. Experimental results show that our method achieves the best clustering performance compared with other recent clustering algorithms.
(This article belongs to the Special Issue Information Theory-Based Deep Learning Tools for Computer Vision)

16 pages, 2118 KiB  
Article
Dietary Nutritional Information Autonomous Perception Method Based on Machine Vision in Smart Homes
by Hongyang Li and Guanci Yang
Entropy 2022, 24(7), 868; https://doi.org/10.3390/e24070868 - 24 Jun 2022
Cited by 9 | Viewed by 1426
Abstract
In order to automatically perceive the user’s dietary nutritional information in the smart home environment, this paper proposes a dietary nutritional information autonomous perception method based on machine vision in smart homes. First, we propose a food-recognition algorithm based on YOLOv5 to monitor the user’s dietary intake using a social robot. Second, in order to obtain the nutritional composition of the user’s dietary intake, we calibrate the weight of food ingredients and design a method for calculating food nutritional composition; we then propose a dietary nutritional information autonomous perception method based on machine vision (DNPM) that supports the quantitative analysis of nutritional composition. Finally, the proposed algorithm is tested on the self-expanded dataset CFNet-34, which is based on the Chinese food dataset ChineseFoodNet. The test results show that the average recognition accuracy of the food-recognition algorithm based on YOLOv5 is 89.7%, showing good accuracy and robustness. According to the performance test results of the dietary nutritional information autonomous perception system in smart homes, the average nutritional composition perception accuracy of the system was 90.1%, the response time was less than 6 ms, and the speed was higher than 18 fps, showing excellent robustness and nutritional composition perception performance.
(This article belongs to the Special Issue Information Theory-Based Deep Learning Tools for Computer Vision)

16 pages, 939 KiB  
Article
Single-Shot 3D Multi-Person Shape Reconstruction from a Single RGB Image
by Seong Hyun Kim and Ju Yong Chang
Entropy 2020, 22(8), 806; https://doi.org/10.3390/e22080806 - 23 Jul 2020
Cited by 1 | Viewed by 2740
Abstract
Although the performance of 3D human shape reconstruction methods has improved considerably in recent years, most methods focus on a single person, reconstruct a root-relative 3D shape, and rely on ground-truth information about the absolute depth to convert the reconstruction result to the camera coordinate system. In this paper, we propose an end-to-end learning-based model for single-shot, 3D, multi-person shape reconstruction in the camera coordinate system from a single RGB image. Our network produces output tensors divided into grid cells to reconstruct the 3D shapes of multiple persons in a single-shot manner, where each grid cell contains information about the subject. Moreover, our network predicts the absolute position of the root joint while reconstructing the root-relative 3D shape, which enables reconstructing the 3D shapes of multiple persons in the camera coordinate system. The proposed network can be trained in an end-to-end manner and processes images at about 37 fps to perform the 3D multi-person shape reconstruction task in real time.
(This article belongs to the Special Issue Information Theory-Based Deep Learning Tools for Computer Vision)

Review

49 pages, 11201 KiB  
Review
Salient Object Detection Techniques in Computer Vision—A Survey
by Ashish Kumar Gupta, Ayan Seal, Mukesh Prasad and Pritee Khanna
Entropy 2020, 22(10), 1174; https://doi.org/10.3390/e22101174 - 19 Oct 2020
Cited by 56 | Viewed by 12796
Abstract
Detection and localization of regions of images that attract immediate human visual attention is currently an intensive area of research in computer vision. The capability of automatic identification and segmentation of such salient image regions has immediate consequences for applications in the fields of computer vision, computer graphics, and multimedia. A large number of salient object detection (SOD) methods have been devised to effectively mimic the capability of the human visual system to detect the salient regions in images. These methods can be broadly divided into two categories based on their feature engineering mechanism: conventional or deep learning-based. In this survey, most of the influential advances in image-based SOD from both the conventional and the deep learning-based categories are reviewed in detail. Relevant saliency modeling trends with key issues, core techniques, and the scope for future research work are discussed in the context of difficulties often faced in salient object detection. Results are presented for various challenging cases for some large-scale public datasets. Different metrics considered for assessment of the performance of state-of-the-art salient object detection models are also covered. Some future directions for SOD are presented towards the end.
(This article belongs to the Special Issue Information Theory-Based Deep Learning Tools for Computer Vision)