Special Issue "Image and Video Processing and Recognition Based on Artificial Intelligence II"

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Sensing and Imaging".

Deadline for manuscript submissions: 31 August 2022.

Special Issue Editors

Prof. Dr. Kang Ryoung Park
Guest Editor
Division of Electronics and Electrical Engineering, Dongguk University, 30, Pildong-ro 1-gil, Jung-gu, Seoul 04620, Korea
Interests: deep learning; biometrics; image processing
Prof. Dr. Sangyoun Lee
Guest Editor
School of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 120-749, Korea
Interests: human detection and recognition; gesture recognition; face recognition; HEVC
Prof. Dr. Euntai Kim
Guest Editor
School of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 120-749, Korea
Interests: pedestrian and vehicle detection; recognition; vision for advanced driver assistance systems (ADAS); robot vision

Special Issue Information

Dear Colleagues,

Recent developments have led to the intense application of artificial intelligence (AI) techniques to image and video processing and recognition. Although the state-of-the-art technology has matured, its performance is still affected by various environmental conditions and heterogeneous databases. The purpose of this Special Issue is to invite high-quality, state-of-the-art academic papers on challenging issues in the field of AI-based image and video processing and recognition. We solicit original papers reporting completed, unpublished research that is not currently under review by any other conference, magazine, or journal. Topics of interest include, but are not limited to, the following:

  • AI-based image processing, understanding, recognition, compression, and reconstruction;
  • AI-based video processing, understanding, recognition, compression, and reconstruction;
  • Computer vision based on AI;
  • AI-based biometrics;
  • AI-based object detection and tracking;
  • Approaches that combine AI techniques and conventional methods for image and video processing and recognition;
  • Explainable AI (XAI) for image and video processing and recognition;
  • Generative adversarial network (GAN)-based image and video processing and recognition;
  • Approaches that combine AI techniques and blockchain methods for image and video processing and recognition.

Prof. Dr. Kang Ryoung Park
Prof. Dr. Sangyoun Lee
Prof. Dr. Euntai Kim
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Image processing, understanding, recognition, compression, and reconstruction based on AI
  • Video processing, understanding, recognition, compression, and reconstruction based on AI
  • Computer vision based on AI
  • Biometrics based on AI
  • Fusion of AI and conventional methods
  • XAI and GAN
  • Fusion of AI and blockchain methods

Published Papers (3 papers)


Research

Article
An Efficient Approach Using Knowledge Distillation Methods to Stabilize Performance in a Lightweight Top-Down Posture Estimation Network
Sensors 2021, 21(22), 7640; https://doi.org/10.3390/s21227640 - 17 Nov 2021
Abstract
Multi-person pose estimation has been gaining considerable interest due to its use in several real-world applications, such as activity recognition, motion capture, and augmented reality. Although improving the accuracy and speed of multi-person pose estimation has recently been studied, limitations still exist in balancing these two aspects. In this paper, a novel knowledge-distilled lightweight top-down pose network (KDLPN) is proposed that balances computational complexity and accuracy. For the first time in multi-person pose estimation, a network is presented that reduces computational complexity by applying a "Pelee" structure and by shuffling pixels in the dense upsampling convolution layer to reduce the number of channels. Furthermore, to prevent performance degradation caused by the reduced computational complexity, knowledge distillation is applied, with the full pose estimation network serving as the teacher network. The method is evaluated on the MSCOCO dataset. Experimental results demonstrate that KDLPN reduces the parameters required by state-of-the-art methods by 95% with minimal performance degradation. Moreover, the method is compared with other pose estimation methods to substantiate the importance and effectiveness of reducing computational complexity.
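The KDLPN implementation itself is not reproduced here, but the knowledge-distillation idea the abstract relies on (a lightweight student network trained to match the temperature-softened outputs of a larger teacher) can be sketched in plain Python. The temperature value and the example logits below are illustrative, not taken from the paper:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between softened teacher and student distributions.

    The teacher's temperature-softened outputs act as soft targets that
    the lightweight student learns to match during training.
    """
    p = softmax(teacher_logits, temperature)  # soft teacher targets
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [2.0, 1.0, 0.1]
loss_matched = distillation_loss(teacher, teacher)        # 0.0: identical outputs
loss_mismatch = distillation_loss([0.1, 1.0, 2.0], teacher)  # positive
```

A student that reproduces the teacher's distribution incurs zero loss; in practice this term is combined with the ordinary task loss on ground-truth labels.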

Article
Deep-Learning-Based Stress Recognition with Spatial-Temporal Facial Information
Sensors 2021, 21(22), 7498; https://doi.org/10.3390/s21227498 - 11 Nov 2021
Abstract
In recent times, as interest in stress control has increased, many studies on stress recognition have been conducted. Several studies have been based on physiological signals, but the disadvantage of this strategy is that it requires physiological-signal-acquisition devices. Another strategy employs facial-image-based stress-recognition methods, which do not require such devices but predominantly use handcrafted features with low discriminating power. We propose a deep-learning-based stress-recognition method using facial images to address these challenges. Given that deep-learning methods require extensive data, we constructed a large-capacity image database for stress recognition. Furthermore, we used temporal attention, which assigns a high weight to frames that are highly related to stress, as well as spatial attention, which assigns a high weight to regions that are highly related to stress. We also supplemented the network that receives only facial images by adding a second network that takes as input facial landmark information closely related to stress. Experimental results on our newly constructed database indicated that the proposed method outperforms contemporary deep-learning-based recognition methods.
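The temporal-attention mechanism described above can be sketched independently of the paper's network: per-frame feature vectors are pooled with softmax weights derived from scalar relevance scores. In the paper those scores are learned from the frames themselves; in this illustrative sketch they are supplied directly:

```python
import math

def temporal_attention(frame_features, scores):
    """Pool per-frame feature vectors with softmax attention weights.

    `scores` are scalar relevance scores, one per frame; frames with
    higher scores dominate the pooled representation.
    """
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frame_features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, frame_features))
              for d in range(dim)]
    return pooled, weights

# Three frames of 2-D features; the third frame gets the highest score,
# so the pooled vector is pulled toward [4.0, 4.0].
features = [[1.0, 0.0], [0.0, 1.0], [4.0, 4.0]]
pooled, weights = temporal_attention(features, [0.1, 0.1, 3.0])
```

Spatial attention works the same way, except the softmax runs over regions within a frame rather than over frames.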

Article
Restoration of Motion Blurred Image by Modified DeblurGAN for Enhancing the Accuracies of Finger-Vein Recognition
Sensors 2021, 21(14), 4635; https://doi.org/10.3390/s21144635 - 06 Jul 2021
Abstract
Among the many available biometric identification methods, finger-vein recognition has two advantages: it is difficult to counterfeit, as finger veins are located under the skin, and it offers high user convenience, as a non-invasive image-capturing device is used for recognition. However, blurring can occur when acquiring finger-vein images, and such blur falls mainly into three types: skin scattering blur, caused by light scattering in the skin layer; optical blur, caused by lens focus mismatch; and motion blur, caused by finger movement. Images degraded by these kinds of blur can significantly reduce finger-vein recognition performance, so restoration of blurred finger-vein images is necessary. Most previous studies have addressed the restoration of skin-scattering-blurred images, and some have addressed the restoration of optically blurred images, but there has been no research on restoring the motion-blurred finger-vein images that can occur in real environments. To address this problem, this study proposes a new method for improving finger-vein recognition performance by restoring motion-blurred finger-vein images using a modified deblur generative adversarial network (modified DeblurGAN). In experiments on two open databases, the Shandong University homologous multi-modal traits (SDUMLA-HMT) finger-vein database and the Hong Kong Polytechnic University finger-image database version 1, the proposed method demonstrates performance better than that of state-of-the-art methods.
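The modified DeblurGAN itself is a learned model and is not sketched here, but the degradation it inverts is easy to illustrate. A crude model of lateral finger movement is convolution with a uniform horizontal kernel, which smears a sharp vein line across neighboring columns (kernel length and the toy image below are illustrative):

```python
def horizontal_motion_blur(image, length=3):
    """Apply a uniform horizontal motion-blur kernel (a row of 1/length).

    A simple linear model of the blur caused by lateral finger movement
    during capture; deblurring networks learn to invert such degradations.
    Pixels near the border average only over the valid neighbors.
    """
    h, w = len(image), len(image[0])
    half = length // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = n = 0
            for k in range(-half, half + 1):
                if 0 <= x + k < w:
                    acc += image[y][x + k]
                    n += 1
            out[y][x] = acc / n
    return out

# A sharp vertical "vein" of width 1 is smeared evenly over 3 columns.
img = [[0, 0, 1, 0, 0],
       [0, 0, 1, 0, 0]]
blurred = horizontal_motion_blur(img, length=3)
```

Real motion blur also varies in direction and length per capture, which is why a learned restoration model is used rather than a fixed inverse filter.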

Planned Papers

The list below represents planned manuscripts only. Some of these manuscripts have not yet been received by the Editorial Office. Papers submitted to MDPI journals are subject to peer review.

Title: Detection Quality Rate Amplified by a Combination of HOG and CNN for Real-Time Multiple Object Tracking Across Multiple Non-Overlapping Cameras
Authors: Lesole Kalake1, Yanqiu Dong2, Wanggen Wan3, Li Huo4
Affiliation: 1,2,3 School of Communications and Information Engineering, Institute of Smart City, Shanghai University, Shanghai 200444, China
Abstract: Multi-object tracking in video surveillance is subject to illumination variation, blurring, and motion and similarity variations during identification in real-world practice. Previously proposed applications have difficulty learning object appearances and differentiating objects among sundry detections. They rely heavily on local features and tend to lose vital global structured features, such as contour features, which contributes to their inability to accurately detect, classify, or distinguish fooling images. In this paper, we propose a paradigm aimed at eliminating these tracking difficulties by enhancing the detection quality rate through the combination of a deep convolutional neural network (DCNN) and a histogram of oriented gradients (HOG) descriptor. We trained the algorithm on input images of size 80x32, cleaned and converted to binary, to reduce the number of false positives. In testing, we eliminate the background on frames of size (..) and apply morphological operations and a Laplacian of Gaussian (LoG) mixture model after blob extraction. The images then undergo feature extraction and computation with the HOG descriptor to simplify the structural information of the objects in the captured video images. We store the appearance features in an array and pass them to the CNN for further processing. We applied and evaluated our algorithm for real-time multiple object tracking on various city streets using the MAT and EPFL multi-camera pedestrian datasets. The experimental results illustrate that our proposed technique improves the detection rate and data association, and our algorithm outperforms previous work on precision and specificity rates when compared with state-of-the-art methods.
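The HOG descriptor named in this planned paper captures exactly the global structural information (e.g., contours) that the abstract says purely local features miss. Its core step, a per-cell histogram of gradient orientations weighted by gradient magnitude, can be sketched as follows; the cell size, bin count, and toy patch are illustrative, not the authors' configuration:

```python
import math

def gradient_orientation_histogram(patch, bins=9):
    """HOG-style orientation histogram for one cell of a grayscale patch.

    Gradients are taken with central differences; each interior pixel
    votes its gradient magnitude into an orientation bin over [0, 180).
    """
    h, w = len(patch), len(patch[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]  # horizontal gradient
            gy = patch[y + 1][x] - patch[y - 1][x]  # vertical gradient
            mag = math.hypot(gx, gy)
            ang = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned angle
            hist[int(ang / 180.0 * bins) % bins] += mag
    return hist

# A patch containing a single vertical edge: every interior gradient is
# horizontal (angle 0), so all the energy lands in the first bin.
patch = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
hist = gradient_orientation_histogram(patch)
```

A full HOG descriptor tiles the image into such cells, normalizes histograms over overlapping blocks, and concatenates them; the resulting vector is what the abstract proposes feeding alongside the CNN.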
