On the Use of Deep Learning for Image/Video Coding and Visual Quality Assessment

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: closed (30 April 2023) | Viewed by 7827

Special Issue Editors


Dr. Kamel Belloulata
Guest Editor
Department of Electronics, Faculty of Engineering Sciences, Université Djilali Liabès de Sidi Bel Abbès, 22000 Sidi Bel Abbès, Algeria
Interests: signal processing; image processing; video processing and coding; pattern recognition

Dr. Shiping Zhu
Guest Editor
Department of Measurement Control and Information Technology, School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing 100191, China
Interests: HD video compression codecs; visual salience and visual perception; stereo matching of binocular images; image segmentation; neural networks and deep learning; image processing and computer vision

Dr. Wassim Hamidouche
Guest Editor
Department of Electronics and Computer Engineering, Institut National des Sciences Appliquées de Rennes, 35700 Rennes, France
Interests: video coding, compression, and transmission; image, video, and multimedia signal processing; data compression

Dr. Sid Ahmed Fezza
Guest Editor
Department of Multimedia Communications, National Institute of Telecommunications and ICT, 417773 Oran, Algeria
Interests: data compression; image and video coding; rate–distortion theory; machine learning and deep learning; convolutional neural networks; image classification, representation, and reconstruction; image sequences and natural scenes; object detection and tracking; pattern classification; computer vision; natural language processing; optimization and approximation theory; statistical analysis; data security, cryptography, and computer crime

Special Issue Information

Dear Colleagues,

With the development of imaging and display technologies, ultra-high-definition, high-dynamic-range, high-frame-rate, and immersive 360-degree content has emerged in our lives. However, the increase in resolution and dimensionality involves a large amount of data, making storage and transmission over existing bandwidth-limited infrastructure impractical. To address these challenges, efficient image/video compression algorithms must be designed. Moreover, since quality is lost during the image/video compression and transmission processes, effective tools are needed to reliably assess, control, and ensure high visual quality.

Today, artificial intelligence (AI) is widely used in academia and industry. Deep learning, and especially convolutional neural networks (CNNs), is regarded as one of the most important AI technologies and has been successfully applied in areas such as image processing, computer vision, and pattern recognition. Traditional video compression and visual quality assessment methods currently face many challenges, including high computational complexity, limited coding efficiency, and low prediction accuracy. Deep learning provides a new way to solve these problems.

This Special Issue is intended for researchers and practitioners from academia as well as industry who are interested in issues that arise from using deep learning for video data compression and visual quality assessment.

The topics of interest include, but are not limited to:

  • Deep learning for image/video compression;
  • Deep learning for rate control and bit allocation optimizations;
  • Deep learning for filtering algorithms;
  • Deep learning for low-complexity video coding algorithms;
  • Deep learning for coding efficiency optimization;
  • Deep learning for Versatile Video Coding (VVC) optimization;
  • Deep learning for 3D/HDR/360-degree video coding;
  • Deep learning for frame interpolation;
  • Deep learning for image/video quality assessment.

Dr. Kamel Belloulata
Dr. Shiping Zhu
Dr. Wassim Hamidouche
Dr. Sid Ahmed Fezza
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • end-to-end optimized image compression
  • convolutional neural network compression
  • learned image and video compression
  • learned transforms
  • nonlinear transform coding (NTC)

Published Papers (4 papers)


Research

18 pages, 4512 KiB  
Article
A Visual Enhancement Network with Feature Fusion for Image Aesthetic Assessment
by Xin Zhang, Xinyu Jiang, Qing Song and Pengzhou Zhang
Electronics 2023, 12(11), 2526; https://doi.org/10.3390/electronics12112526 - 3 Jun 2023
Cited by 1 | Viewed by 1058
Abstract
Image aesthetic assessment (IAA) with neural attention has made significant progress due to its effectiveness in object recognition. Current studies have shown that the features learned by convolutional neural networks (CNNs) at different learning stages carry meaningful information. Shallow features contain the low-level information of images, while deep features perceive image semantics and themes. Inspired by this, we propose a visual enhancement network with feature fusion (FF-VEN). It consists of two sub-modules: the visual enhancement module (VE module) and the shallow and deep feature fusion module (SDFF module). The former uses an adaptive filter in the spatial domain to simulate human eyes according to the region of interest (ROI) extracted by neural feedback. The latter not only extracts the shallow and deep features via transverse connections but also uses a feature fusion unit (FFU) to fuse the pooled features together with the aim of maximizing information contribution. Experiments on the standard AVA and Photo.net datasets show the effectiveness of FF-VEN.
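To make the shallow/deep fusion idea concrete, here is a minimal PyTorch sketch of pooling features tapped from a shallow and a deep CNN stage and fusing them into a single aesthetic score. All module and layer names here are hypothetical; the paper's VE module and FFU internals are not reproduced, and this only illustrates the transverse-connection fusion pattern the abstract describes.

```python
# Minimal sketch of shallow/deep feature fusion for aesthetic scoring.
# Names (ShallowDeepFusion, head) are ours, not the paper's.
import torch
import torch.nn as nn

class ShallowDeepFusion(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        # Shallow stage: captures low-level information (edges, texture).
        self.shallow = nn.Sequential(
            nn.Conv2d(3, channels, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Deep stage: captures semantics and themes.
        self.deep = nn.Sequential(
            nn.Conv2d(channels, channels * 2, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(channels * 2, channels * 4, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)  # pool each branch to a vector
        # Fusion: concatenate pooled shallow and deep features, then score.
        self.head = nn.Sequential(
            nn.Linear(channels + channels * 4, 128), nn.ReLU(),
            nn.Linear(128, 1),  # scalar aesthetic score
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = self.shallow(x)               # transverse connection taps this stage
        d = self.deep(s)
        s_vec = self.pool(s).flatten(1)
        d_vec = self.pool(d).flatten(1)
        return self.head(torch.cat([s_vec, d_vec], dim=1))

score = ShallowDeepFusion()(torch.randn(1, 3, 224, 224))
print(score.shape)  # torch.Size([1, 1])
```

Plain concatenation is the simplest possible fusion unit; FF-VEN's FFU is designed to maximize information contribution, which this sketch does not attempt.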

15 pages, 2069 KiB  
Article
ARET-IQA: An Aspect-Ratio-Embedded Transformer for Image Quality Assessment
by Hancheng Zhu, Yong Zhou, Zhiwen Shao, Wen-Liang Du, Jiaqi Zhao and Rui Yao
Electronics 2022, 11(14), 2132; https://doi.org/10.3390/electronics11142132 - 7 Jul 2022
Cited by 2 | Viewed by 1834
Abstract
Image quality assessment (IQA) aims to automatically evaluate image perceptual quality by simulating the human visual system, and it is an important research topic in the field of image processing and computer vision. Although existing deep-learning-based IQA models have achieved significant success, they usually require input images of a fixed size, and resizing alters the perceptual quality of the images. To this end, this paper proposes an aspect-ratio-embedded Transformer-based image quality assessment method, which implants the adaptive aspect ratios of input images into the multihead self-attention module of the Swin Transformer. In this way, the proposed IQA model can not only mitigate the perceptual-quality variation caused by size changes in input images but also leverage more global content correlations to infer image perceptual quality. Furthermore, to comprehensively capture the impact of low-level and high-level features on image quality, the proposed IQA model combines the output features of multistage Transformer blocks for jointly inferring image quality. Experimental results on multiple IQA databases show that the proposed IQA method is superior to state-of-the-art methods for assessing both technical and aesthetic image quality.
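As a rough illustration of conditioning a Transformer IQA head on aspect ratio, the sketch below injects the ratio as an additive embedding on the patch tokens. This is an assumption for illustration only: ARET-IQA embeds the ratio inside the Swin Transformer's multihead self-attention, which is not reproduced here, and all names are hypothetical.

```python
# Minimal sketch: make a Transformer quality head aware of the input
# image's original aspect ratio via an additive token embedding.
import torch
import torch.nn as nn

class AspectRatioIQA(nn.Module):
    def __init__(self, dim: int = 128):
        super().__init__()
        # Map the scalar aspect ratio to a token-sized embedding.
        self.ratio_embed = nn.Sequential(
            nn.Linear(1, dim), nn.GELU(), nn.Linear(dim, dim))
        self.encoder = nn.TransformerEncoderLayer(
            d_model=dim, nhead=4, batch_first=True)
        self.head = nn.Linear(dim, 1)  # scalar quality score

    def forward(self, tokens: torch.Tensor, hw: torch.Tensor) -> torch.Tensor:
        # tokens: (B, N, dim) patch features; hw: (B, 2) original height/width.
        ratio = (hw[:, 1:2] / hw[:, 0:1]).float()    # width / height
        bias = self.ratio_embed(ratio).unsqueeze(1)  # (B, 1, dim)
        x = self.encoder(tokens + bias)              # ratio-aware attention
        return self.head(x.mean(dim=1))              # pooled quality score

model = AspectRatioIQA()
q = model(torch.randn(2, 49, 128),
          torch.tensor([[384., 512.], [512., 384.]]))
print(q.shape)  # torch.Size([2, 1])
```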

15 pages, 1783 KiB  
Article
Learning-Based Text Image Quality Assessment with Texture Feature and Embedding Robustness
by Zhiwei Jia, Shugong Xu, Shiyi Mu and Yue Tao
Electronics 2022, 11(10), 1611; https://doi.org/10.3390/electronics11101611 - 18 May 2022
Cited by 2 | Viewed by 2020
Abstract
The quality of the input text image has a clear impact on the output of a scene text recognition (STR) system; however, because the main content of a text image is a sequence of characters carrying semantic information, how to effectively assess text image quality remains a research challenge. Text image quality assessment (TIQA) can help in picking hard samples, leading to more robust STR systems and recognition-oriented text image restoration. In this paper, arguing that text image quality derives from character-level texture features and embedding robustness, we propose a learning-based fine-grained, sharp, and recognizable text image quality assessment method (FSR-TIQA), which is, to our knowledge, the first TIQA scheme. To overcome the difficulty of obtaining character positions in a text image, an attention-based recognizer is used to generate the character embedding and character image. We use the similarity distribution distance to evaluate character embedding robustness between the intra-class and inter-class similarity distributions, and the Haralick feature to reflect the clarity of the character region's texture. A quality score network is then designed under a label-free training scheme to normalize the texture feature and output the quality score. Extensive experiments indicate that FSR-TIQA discriminates well between text images of different quality on benchmark datasets and the TextZoom dataset. Our method shows good potential for analyzing dataset distributions and guiding dataset collection.
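The texture side of this idea can be approximated with gray-level co-occurrence matrix (GLCM) statistics, from which Haralick features are computed. The sketch below scores a character crop with two standard GLCM properties via scikit-image; the chosen properties and the combination rule are illustrative assumptions, not the paper's formulation.

```python
# Minimal sketch of a GLCM-based texture clarity cue, in the spirit of
# the Haralick features used by FSR-TIQA. Thresholds and the chosen
# properties are illustrative only.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def texture_clarity(char_region: np.ndarray) -> float:
    """char_region: 2-D uint8 grayscale crop of one character."""
    glcm = graycomatrix(char_region, distances=[1],
                        angles=[0, np.pi / 2], levels=256,
                        symmetric=True, normed=True)
    contrast = graycoprops(glcm, "contrast").mean()
    homogeneity = graycoprops(glcm, "homogeneity").mean()
    # Sharper character strokes tend to raise contrast and lower
    # homogeneity, so combine the two into one clarity score.
    return float(contrast * (1.0 - homogeneity))

crop = (np.random.rand(32, 32) * 255).astype(np.uint8)
print(texture_clarity(crop))
```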

14 pages, 1393 KiB  
Article
RDNet: Rate–Distortion-Based Coding Unit Partition Network for Intra-Prediction
by Chao Yao, Chenming Xu and Meiqin Liu
Electronics 2022, 11(6), 916; https://doi.org/10.3390/electronics11060916 - 15 Mar 2022
Cited by 4 | Viewed by 1912
Abstract
High Efficiency Video Coding (HEVC), jointly developed by the ITU-T VCEG and ISO/IEC MPEG, has become the most widely utilized video coding standard. In HEVC, the quad-tree coding unit (CU) partition is one of the most substantial modules and provides significant coding gains at the cost of a huge coding time. In this paper, a rate–distortion-based coding unit partition network (RDNet) is proposed to make partition decisions based on statistical features. RDNet is composed of a prediction sub-network, used to predict the CU partition modes of the intra-prediction, and a target sub-network, designed to optimize the network parameters by evaluating the rate–distortion cost. To balance prediction accuracy against rate–distortion loss, a parameter-exchanging strategy is applied to control parameter sharing between the two networks. Experimental results show that our model reduces the HEVC encoding time by 55.83% to 71.72% with a BD-BR of only 2.876% to 3.347%, and an ablation study evaluates the ability of our strategy to balance the trade-off between coding accuracy and inference speed.
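For intuition, the sketch below shows a single CNN that predicts a split/no-split decision for a CU, trained with a loss weighted by the rate–distortion cost gap between the two choices. This is a hypothetical stand-in: RDNet's two-sub-network design and parameter-exchanging strategy are not reproduced, and the names and weighting rule are ours.

```python
# Minimal sketch of a CNN split/no-split predictor for CU partitioning,
# with an RD-cost-weighted loss (an illustrative stand-in for RDNet's
# rate-distortion criterion, not the paper's method).
import torch
import torch.nn as nn

class SplitPredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 1),  # logit: probability of splitting the CU
        )

    def forward(self, luma_cu: torch.Tensor) -> torch.Tensor:
        return self.net(luma_cu)

def rd_weighted_loss(logit, split_label, rd_cost_gap):
    # Penalize mistakes more on CUs where the RD-cost gap between
    # splitting and not splitting is large.
    bce = nn.functional.binary_cross_entropy_with_logits(
        logit, split_label, reduction="none")
    return (bce * (1.0 + rd_cost_gap)).mean()

cu = torch.randn(8, 1, 32, 32)            # batch of 32x32 luma CUs
labels = torch.randint(0, 2, (8, 1)).float()
gaps = torch.rand(8, 1)                   # normalized RD-cost gaps
loss = rd_weighted_loss(SplitPredictor()(cu), labels, gaps)
print(loss.item())
```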
