Deep Learning in Computer Vision

A special issue of Journal of Imaging (ISSN 2313-433X). This special issue belongs to the section "Computer Vision and Pattern Recognition".

Deadline for manuscript submissions: 15 September 2024 | Viewed by 1931

Special Issue Editors


Dr. Dong Zhang
Guest Editor
Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong 999077, China
Interests: image classification; object detection; semantic segmentation; pose estimation

Dr. Rui Yan
Guest Editor
Department of Computer Science and Technology, Nanjing University, Nanjing 210023, China
Interests: complex human behavior understanding; video-language understanding

Special Issue Information

Dear Colleagues,

The field of computer vision has undergone a significant transformation with the advent of deep learning techniques, enabling the development of innovative applications across various domains. This Special Issue focuses on exploring the latest advancements, methodologies, and applications in this rapidly evolving area.

Deep learning methods, such as convolutional neural networks (CNNs), vision transformers (ViTs), diffusion models, and generative adversarial networks (GANs), have demonstrated remarkable success in tasks such as object recognition, semantic segmentation, and image synthesis. These techniques have paved the way for a myriad of applications, including autonomous vehicles, facial recognition, biomedical image analysis, and video surveillance.

We invite contributions that present cutting-edge research, novel techniques, methods, tools and ideas related to the integration of deep learning in computer vision. Submissions may cover a wide range of topics, including, but not limited to:

  • Advances in deep learning architectures for computer vision tasks;
  • Transfer learning and domain adaptation in computer vision;
  • Deep reinforcement learning for vision-based control;
  • Generative models for image synthesis and manipulation;
  • Application of computer vision technology in biomedical imaging;
  • Applications of deep learning in fields such as remote sensing, robotics and art.

We encourage submissions that propose innovative and scientifically grounded research lines for the future development of deep learning techniques in computer vision. Together, we aim to advance the field of computer vision and its applications in various industries.

Dr. Dong Zhang
Dr. Rui Yan
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Journal of Imaging is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image classification
  • object detection
  • semantic segmentation
  • pose estimation
  • multimedia analysis and retrieval
  • few-shot learning
  • human behavior understanding
  • video language understanding
  • video understanding and analysis

Published Papers (2 papers)


Research

18 pages, 1810 KiB  
Article
Knowledge Distillation in Video-Based Human Action Recognition: An Intuitive Approach to Efficient and Flexible Model Training
by Fernando Camarena, Miguel Gonzalez-Mendoza and Leonardo Chang
J. Imaging 2024, 10(4), 85; https://doi.org/10.3390/jimaging10040085 - 30 Mar 2024
Viewed by 679
Abstract
Training a model to recognize human actions in videos is computationally intensive. While modern strategies employ transfer learning methods to make the process more efficient, they still face challenges regarding flexibility and efficiency. Existing solutions are limited in functionality and rely heavily on pretrained architectures, which can restrict their applicability to diverse scenarios. Our work explores knowledge distillation (KD) for enhancing the training of self-supervised video models in three aspects: improving classification accuracy, accelerating model convergence, and increasing model flexibility under regular and limited-data scenarios. We tested our method on the UCF101 dataset using differently balanced proportions: 100%, 50%, 25%, and 2%. We found that using knowledge distillation to guide the model’s training outperforms traditional training without affecting classification accuracy, while reducing the time to convergence in both standard settings and a data-scarce environment. Additionally, knowledge distillation enables cross-architecture flexibility, allowing model customization for various applications, from resource-limited to high-performance scenarios.
(This article belongs to the Special Issue Deep Learning in Computer Vision)
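
For readers less familiar with the technique, the sketch below illustrates the standard soft-label knowledge distillation loss (a temperature-scaled KL term combined with cross-entropy). It is a minimal PyTorch sketch of the general idea only, not the authors' training pipeline; the temperature and mixing weight are assumed, illustrative values.

```python
# Minimal sketch of soft-label knowledge distillation (illustrative only, not
# the authors' exact method): a student is trained to match a frozen teacher's
# temperature-softened output distribution in addition to the usual label loss.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Combine a KL term on softened outputs with ordinary cross-entropy.

    T (temperature) and alpha (mixing weight) are illustrative values,
    not taken from the paper.
    """
    # Soften both output distributions with temperature T.
    soft_teacher = F.softmax(teacher_logits / T, dim=1)
    soft_student = F.log_softmax(student_logits / T, dim=1)
    # KL divergence between teacher and student, scaled by T^2 as is standard.
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * (T * T)
    # Supervised loss on the hard labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

# Usage inside a training loop (teacher frozen, gradients flow only through
# the student):
#     with torch.no_grad():
#         teacher_logits = teacher(clip)
#     loss = distillation_loss(student(clip), teacher_logits, labels)
```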

15 pages, 980 KiB  
Article
FishSegSSL: A Semi-Supervised Semantic Segmentation Framework for Fish-Eye Images
by Sneha Paul, Zachary Patterson and Nizar Bouguila
J. Imaging 2024, 10(3), 71; https://doi.org/10.3390/jimaging10030071 - 15 Mar 2024
Viewed by 970
Abstract
The application of large field-of-view (FoV) cameras equipped with fish-eye lenses brings notable advantages to various real-world computer vision applications, including autonomous driving. While deep learning has proven successful in conventional computer vision applications using regular perspective images, its potential in fish-eye camera contexts remains largely unexplored due to limited datasets for fully supervised learning. Semi-supervised learning comes as a potential solution to manage this challenge. In this study, we explore and benchmark two popular semi-supervised methods from the perspective image domain for fish-eye image segmentation. We further introduce FishSegSSL, a novel fish-eye image segmentation framework featuring three semi-supervised components: pseudo-label filtering, dynamic confidence thresholding, and robust strong augmentation. Evaluation on the WoodScape dataset, collected from vehicle-mounted fish-eye cameras, demonstrates that our proposed method enhances the model’s performance by up to 10.49% over fully supervised methods using the same amount of labeled data. Our method also improves on existing image segmentation methods by 2.34%. To the best of our knowledge, this is the first work on semi-supervised semantic segmentation of fish-eye images. Additionally, we conduct a comprehensive ablation study and sensitivity analysis to showcase the efficacy of each proposed component.
(This article belongs to the Special Issue Deep Learning in Computer Vision)
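
For orientation, the sketch below illustrates confidence-based pseudo-label filtering for semi-supervised segmentation, in the spirit of the components the abstract names. It uses a fixed confidence threshold for simplicity (the paper proposes a dynamic one), and all names and values are illustrative assumptions rather than the FishSegSSL implementation.

```python
# Illustrative sketch of confidence-based pseudo-label filtering for
# semi-supervised segmentation (not the FishSegSSL implementation): unlabeled
# pixels contribute to the loss only where the teacher's prediction is
# sufficiently confident.
import torch
import torch.nn.functional as F

def pseudo_label_loss(student_logits, teacher_logits, threshold=0.95):
    """Cross-entropy on unlabeled pixels whose teacher confidence exceeds a threshold.

    `threshold` is a fixed illustrative value; a dynamic threshold would
    adapt it during training.
    """
    with torch.no_grad():
        probs = F.softmax(teacher_logits, dim=1)   # (B, C, H, W)
        conf, pseudo = probs.max(dim=1)            # per-pixel confidence and pseudo-label
        mask = conf.ge(threshold)                  # keep only confident pixels
    # Per-pixel cross-entropy against the pseudo-labels, shape (B, H, W).
    loss = F.cross_entropy(student_logits, pseudo, reduction="none")
    # Average over confident pixels only (clamp avoids division by zero).
    return (loss * mask).sum() / mask.sum().clamp(min=1)
```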
