Artificial Intelligence in Imaging Sensing and Processing

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Intelligent Sensors".

Deadline for manuscript submissions: closed (31 July 2023) | Viewed by 9861

Special Issue Editor


Guest Editor
Department of Computer Science, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
Interests: computer vision and multimedia systems; intelligent control systems; machine learning for signal processing; human–machine interfaces

Special Issue Information

Dear Colleagues,

Many modern intelligent systems and applications embed deep-learning-based image sensing and processing. For instance, smartphones, surveillance cameras, UAVs, autonomous vehicles, remote sensing systems, and medical imaging systems nowadays use advanced deep-learning-based technologies to achieve state-of-the-art performance. These deep-learning-based technologies include, but are not limited to, image/video enhancement, restoration, super-resolution, dynamic range manipulation, 3D depth estimation, 2D/3D object detection and tracking, segmentation, recognition, and scene understanding and reconstruction.

Even though some successful approaches and algorithms have been proposed, more innovative and efficient alternatives based on supervised, unsupervised, semi-supervised, self-supervised, zero-shot, and few-shot learning are expected to make these systems more practical and better generalized to real-world challenges. From an architectural viewpoint, the relative pros and cons of CNN-based and transformer-based methods remain open to discussion. Regarding generative approaches, we expect further discussion of flow-based, GAN-based, and diffusion-based models for image sensing and processing. In addition, the co-design of image sensing and image processing, as studied in computational imaging, is another critical issue that warrants further investigation.

This Special Issue aims to highlight and invite state-of-the-art research papers related to deep-learning-based image processing and computer vision techniques in image sensing. Topics include but are not limited to:

      1. Computer Vision

  • Scene Analysis
  • Camera Calibration
  • Motion Analysis
  • 3D Reconstruction
  • Vision-based Surveillance
  • Vision Interface

      2. Image Processing

  • Document Image Processing
  • Medical Image Processing
  • Remote Image Processing
  • Image/Video Watermarking
  • Super-resolution

      3. Pattern Recognition

  • Face Detection & Recognition
  • Facial Expression Recognition
  • Gesture Recognition
  • Behavior Recognition

      4. Video Processing

  • Video Object Segmentation
  • Video Object Tracking
  • Video Content Analysis
  • Video Indexing and Retrieval
  • Compression and Transmission of Videos
  • Video Adaptation
  • Video Networking

      5. Applications and Systems

  • Industrial Visual Inspection
  • Robotic Vision
  • Intelligent Transportation System
  • Multimedia Communication Network
  • Digital Signal Processing
  • Multimedia SoC
  • Biomedical Application
  • Multimedia Security and Forensics
  • Multimedia in Education

Dr. Ching-Chun Huang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, authors can proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

Computer Vision; Image Processing; Pattern Recognition; Video Processing

Published Papers (7 papers)


Research

20 pages, 2943 KiB  
Article
Image-Based Ship Detection Using Deep Variational Information Bottleneck
by Duc-Dat Ngo, Van-Linh Vo, Tri Nguyen, Manh-Hung Nguyen and My-Ha Le
Sensors 2023, 23(19), 8093; https://doi.org/10.3390/s23198093 - 26 Sep 2023
Viewed by 1283
Abstract
Image-based ship detection is a critical function in maritime security. However, the lack of high-quality training datasets makes it challenging to train a robust supervised deep learning model. Conventional methods use data augmentation to increase the number of training samples, but this approach is not robust because augmentation may not represent complex backgrounds or occlusions well. This paper proposes using an information bottleneck and a reparameterization trick to address the challenge. The information bottleneck learns features that focus only on the object and suppress the background, which helps avoid background variance. In addition, the reparameterization introduces uncertainty during the training phase, which helps learn more robust detectors. Comprehensive experiments show that the proposed method outperforms conventional methods on the SeaShips dataset, especially when the number of training samples is small. The paper also discusses how to integrate the information bottleneck and the reparameterization into well-known object detection frameworks efficiently.
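
For readers who want to see how the information bottleneck and the reparameterization trick fit together mechanically, here is a minimal PyTorch-style sketch; the layer names and sizes are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class BottleneckHead(nn.Module):
    """Minimal variational information bottleneck head (illustrative sketch).

    Maps a backbone feature vector to a Gaussian latent code and samples
    from it with the reparameterization trick, so gradients flow through
    the sampling step during training.
    """
    def __init__(self, feat_dim=256, latent_dim=64):  # assumed dimensions
        super().__init__()
        self.mu = nn.Linear(feat_dim, latent_dim)       # mean of q(z|x)
        self.log_var = nn.Linear(feat_dim, latent_dim)  # log-variance of q(z|x)

    def forward(self, feats):
        mu, log_var = self.mu(feats), self.log_var(feats)
        std = torch.exp(0.5 * log_var)
        z = mu + std * torch.randn_like(std)  # reparameterization trick
        # KL penalty toward a standard normal prior: the "bottleneck" pressure
        # that discourages the latent code from carrying nuisance information.
        kl = -0.5 * torch.mean(1 + log_var - mu.pow(2) - log_var.exp())
        return z, kl
```

In a detector, z would feed the classification and regression heads, and kl would be added to the detection loss with a small weight.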

20 pages, 13187 KiB  
Article
Beyond Human Detection: A Benchmark for Detecting Common Human Posture
by Yongxin Li, You Wu, Xiaoting Chen, Han Chen, Depeng Kong, Haihua Tang and Shuiwang Li
Sensors 2023, 23(19), 8061; https://doi.org/10.3390/s23198061 - 24 Sep 2023
Cited by 2 | Viewed by 1093
Abstract
Human detection is the task of locating all instances of human beings present in an image, and it has a wide range of applications across various fields, including search and rescue, surveillance, and autonomous driving. The rapid advancement of computer vision and deep learning technologies has brought significant improvements in human detection. However, for more advanced applications like healthcare, human–computer interaction, and scene understanding, it is crucial to obtain information beyond just the localization of humans. These applications require a deeper understanding of human behavior and state to enable effective and safe interactions with humans and the environment. This study presents a comprehensive benchmark, the Common Human Postures (CHP) dataset, aimed at promoting a more informative task beyond mere human detection. The benchmark dataset comprises a diverse collection of images featuring individuals in different environments, clothing, and occlusions, performing a wide range of postures and activities, and it aims to advance research on this challenging task by encouraging the design of novel and precise methods specifically for it. The CHP dataset consists of 5250 human images collected from different scenes, annotated with bounding boxes for seven common human poses. Using this well-annotated dataset, we developed two baseline detectors, CHP-YOLOF and CHP-YOLOX, building upon two identity-preserved human posture detectors: IPH-YOLOF and IPH-YOLOX. Extensive experiments demonstrate that these baseline detectors effectively detect human postures on the CHP dataset. By releasing the CHP dataset, we aim to facilitate further research on human pose estimation and attract more researchers to this challenging task.
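
To make the annotation scheme concrete, a single CHP-style label might look like the COCO-format entry below; the abstract does not specify the actual file format or the seven category names, so everything here is a hypothetical illustration.

```python
# Hypothetical COCO-style annotation for one posture bounding box.
# The real CHP format and its seven category names may differ.
annotation = {
    "image_id": 42,
    "category_id": 3,                    # one of seven posture classes
    "bbox": [120.0, 60.0, 85.0, 210.0],  # [x, y, width, height] in pixels
    "iscrowd": 0,
}
# Assumed, illustrative posture vocabulary (not taken from the paper):
categories = ["standing", "walking", "running", "sitting",
              "lying", "bending", "crouching"]
```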

20 pages, 3081 KiB  
Article
Real-Time Video Super-Resolution with Spatio-Temporal Modeling and Redundancy-Aware Inference
by Wenhao Wang, Zhenbing Liu, Haoxiang Lu, Rushi Lan and Zhaoyuan Zhang
Sensors 2023, 23(18), 7880; https://doi.org/10.3390/s23187880 - 14 Sep 2023
Cited by 2 | Viewed by 1454
Abstract
Video super-resolution aims to generate high-resolution frames from low-resolution counterparts. It can be regarded as a specialized application of image super-resolution, serving various purposes such as video display and surveillance. This paper proposes a novel method for real-time video super-resolution that effectively exploits spatial information by utilizing the capabilities of an image super-resolution model and leverages the temporal information inherent in videos. Specifically, the method incorporates a pre-trained image super-resolution network as its foundational framework, allowing it to leverage existing expertise for super-resolution. A fast temporal information aggregation module is presented to further aggregate temporal cues across frames. By using deformable convolution to align the features of neighboring frames, this module takes advantage of inter-frame dependency. In addition, it employs hierarchical fast spatial offset feature extraction and channel attention-based temporal fusion. A redundancy-aware inference algorithm is developed to reduce computational redundancy by reusing intermediate features, achieving real-time inference speed. Extensive experiments on several benchmarks demonstrate that the proposed method reconstructs satisfactory results with strong quantitative performance and visual quality. Its real-time inference capability makes it suitable for real-world deployment.
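
As a rough illustration of the deformable-convolution alignment step described above, the sketch below warps a neighboring frame's features toward the reference frame in PyTorch; the offset-prediction layer and channel sizes are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class NeighborAlign(nn.Module):
    """Align a neighboring frame's features to the reference frame (sketch)."""
    def __init__(self, channels=64, kernel_size=3):  # assumed sizes
        super().__init__()
        # Predict 2 (x, y) sampling offsets per kernel position from the
        # concatenated reference and neighbor features.
        self.offset_pred = nn.Conv2d(2 * channels,
                                     2 * kernel_size * kernel_size,
                                     kernel_size, padding=1)
        self.deform = DeformConv2d(channels, channels, kernel_size, padding=1)

    def forward(self, ref_feat, nbr_feat):
        offsets = self.offset_pred(torch.cat([ref_feat, nbr_feat], dim=1))
        # Sample the neighbor features at the predicted offsets, implicitly
        # compensating for inter-frame motion.
        return self.deform(nbr_feat, offsets)
```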

18 pages, 1602 KiB  
Article
Quality Control of Carbon Look Components via Surface Defect Classification with Deep Neural Networks
by Andrea Silenzi, Vincenzo Castorani, Selene Tomassini, Nicola Falcionelli, Paolo Contardo, Andrea Bonci, Aldo Franco Dragoni and Paolo Sernani
Sensors 2023, 23(17), 7607; https://doi.org/10.3390/s23177607 - 1 Sep 2023
Cited by 2 | Viewed by 1129
Abstract
Many “Industry 4.0” applications rely on data-driven methodologies such as Machine Learning and Deep Learning to enable automatic tasks and implement smart factories. Among these applications, the automatic quality control of manufacturing materials is of utmost importance to achieve precision and standardization in production. In this regard, most of the related literature has focused on combining Deep Learning with Nondestructive Testing techniques, such as Infrared Thermography, requiring dedicated settings to detect and classify defects in composite materials. Instead, the research described in this paper aims to understand whether deep neural networks and transfer learning can be applied to plain images to classify surface defects in carbon look components made with Carbon Fiber Reinforced Polymers used in the automotive sector. To this end, we collected a database of images from a real case study, with 400 images to test binary classification (defect vs. no defect) and 1500 for multiclass classification (no defect vs. recoverable vs. non-recoverable). We developed and tested ten deep neural networks as classifiers, comparing ten different pre-trained CNNs as feature extractors. Specifically, we evaluated VGG16, VGG19, ResNet50 version 2, ResNet101 version 2, ResNet152 version 2, Inception version 3, MobileNet version 2, NASNetMobile, DenseNet121, and Xception, all pre-trained on ImageNet, combined with fully connected layers to act as classifiers. The best classifier, i.e., the network based on DenseNet121, achieved 97% accuracy in classifying components with no defects, recoverable components, and non-recoverable components, demonstrating the viability of the proposed methodology to classify surface defects from images taken with a smartphone in varying conditions, without the need for dedicated settings. The collected images and the source code of the experiments are available in two public, open-access repositories, making the presented research fully reproducible.
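
The transfer-learning recipe the abstract describes, a pre-trained ImageNet backbone feeding a small fully connected classifier, might look roughly like this in Keras; the head sizes, input resolution, and training settings are illustrative guesses rather than the authors' configuration.

```python
import tensorflow as tf

# DenseNet121 as a frozen ImageNet feature extractor (the paper's best
# backbone), topped with a small fully connected head for the three classes.
backbone = tf.keras.applications.DenseNet121(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
backbone.trainable = False  # reuse ImageNet features; train only the head

model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),   # assumed head width
    tf.keras.layers.Dropout(0.5),
    # no defect / recoverable / non-recoverable
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```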

13 pages, 5941 KiB  
Article
Vision-Based Jigsaw Puzzle Solving with a Robotic Arm
by Chang-Hsian Ma, Chien-Liang Lu and Huang-Chia Shih
Sensors 2023, 23(15), 6913; https://doi.org/10.3390/s23156913 - 3 Aug 2023
Cited by 2 | Viewed by 1418
Abstract
This study proposed two algorithms for reconstructing jigsaw puzzles using a color compatibility feature. Two realistic application cases were examined: one involving the original image and one without it. We also calculated the transformation matrix to obtain the real position of each puzzle piece and transmitted the positional information to the robotic arm, which then placed each puzzle piece in its correct position. The algorithms were tested on 35-piece and 70-piece puzzles, achieving an average success rate of 87.1%. Compared with the human visual system, the proposed methods demonstrated enhanced accuracy when handling images with more complex textures.
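
The transformation-matrix step, mapping detected piece positions in the image to workspace coordinates for the arm, is typically done with a planar perspective transform; the OpenCV sketch below uses made-up calibration points and is not the authors' exact procedure.

```python
import cv2
import numpy as np

# Four known image-to-workspace correspondences (made-up values; a real
# setup would measure these during calibration).
img_pts = np.float32([[100, 80], [540, 90], [530, 420], [110, 410]])  # pixels
world_pts = np.float32([[0, 0], [300, 0], [300, 200], [0, 200]])      # mm

H = cv2.getPerspectiveTransform(img_pts, world_pts)

# Map a detected puzzle-piece center from pixels to workspace millimeters.
piece_px = np.float32([[[320, 240]]])
piece_mm = cv2.perspectiveTransform(piece_px, H)
print(piece_mm)  # target position to send to the robotic arm
```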

16 pages, 17630 KiB  
Article
Progressively Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation
by Hong-Yu Lee, Yung-Hui Li, Ting-Hsuan Lee and Muhammad Saqlain Aslam
Sensors 2023, 23(15), 6858; https://doi.org/10.3390/s23156858 - 1 Aug 2023
Cited by 9 | Viewed by 1607
Abstract
Unsupervised image-to-image translation has received considerable attention due to recent remarkable advancements in generative adversarial networks (GANs). In image-to-image translation, state-of-the-art methods use unpaired image data to learn mappings between the source and target domains. However, despite their promising results, existing approaches often fail in challenging conditions, particularly when images contain multiple target instances, when a translation task involves significant changes in shape, or when visual artifacts arise from translating low-level information rather than high-level semantics. To tackle these problems, we propose a novel framework called Progressive Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization (PRO-U-GAT-IT) for the unsupervised image-to-image translation task. In contrast to existing attention-based models that fail to handle geometric transitions between the source and target domains, our model can translate images requiring extensive and holistic changes in shape. Experimental results show the superiority of the proposed approach compared to existing state-of-the-art models on different datasets.
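
For context, adaptive layer-instance normalization (AdaLIN) blends instance and layer normalization with a learnable ratio; the sketch below follows the original U-GAT-IT formulation and is not necessarily how the PRO-U-GAT-IT variant implements it.

```python
import torch
import torch.nn as nn

class AdaLIN(nn.Module):
    """Adaptive Layer-Instance Normalization (sketch after U-GAT-IT)."""
    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.eps = eps
        # Learnable blend ratio between instance norm and layer norm.
        self.rho = nn.Parameter(torch.full((1, num_features, 1, 1), 0.9))

    def forward(self, x, gamma, beta):
        # Instance norm: statistics per sample, per channel.
        in_mean = x.mean(dim=(2, 3), keepdim=True)
        in_var = x.var(dim=(2, 3), keepdim=True)
        x_in = (x - in_mean) / torch.sqrt(in_var + self.eps)
        # Layer norm: statistics per sample across channels and space.
        ln_mean = x.mean(dim=(1, 2, 3), keepdim=True)
        ln_var = x.var(dim=(1, 2, 3), keepdim=True)
        x_ln = (x - ln_mean) / torch.sqrt(ln_var + self.eps)
        rho = self.rho.clamp(0, 1)
        x_hat = rho * x_in + (1 - rho) * x_ln
        # gamma and beta are produced elsewhere (in U-GAT-IT, by an MLP on
        # attention features), giving the normalization its "adaptive" part.
        b = x.size(0)
        return gamma.view(b, -1, 1, 1) * x_hat + beta.view(b, -1, 1, 1)
```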

16 pages, 2632 KiB  
Article
Rethinking Feature Generalization in Vacant Space Detection
by Hung-Nguyen Manh
Sensors 2023, 23(10), 4776; https://doi.org/10.3390/s23104776 - 15 May 2023
Viewed by 1283
Abstract
Vacant space detection is critical in modern parking lots. However, deploying a detection model as a service is not an easy task: when the camera in a new parking lot is set up at a different height or viewing angle from the original parking lot where the training data were collected, the performance of the vacant space detector can degrade. Therefore, in this paper, we propose a method to learn generalized features so that the detector works better in different environments. Specifically, the features are suitable for the vacant space detection task and robust to environmental change. We use a reparameterization process to model the variance introduced by the environment. In addition, a variational information bottleneck is used to ensure that the learned features focus only on the appearance of a car in a specific parking space. Experimental results show that performance on a new parking lot increases significantly even though only data from the source parking lot are used in the training phase.
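
The variational information bottleneck underlying this method (and the ship detection paper above) is commonly written as the trade-off below; this is the standard textbook formulation, not an equation quoted from the paper.

```latex
% Standard variational information bottleneck objective: keep information
% about the label Y while compressing away input nuisances in X.
\max_{\theta} \; I(Z; Y) - \beta \, I(Z; X)
% In practice it is trained through the variational bound
% \mathcal{L} = \mathbb{E}_{q_\theta(z \mid x)}\big[\log p(y \mid z)\big]
%             - \beta \, D_{\mathrm{KL}}\big(q_\theta(z \mid x) \,\|\, p(z)\big).
```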
