Deep Learning in Image Analysis: Progress and Challenges

A special issue of Journal of Imaging (ISSN 2313-433X).

Deadline for manuscript submissions: 31 December 2024

Special Issue Editor


Prof. Dr. Yanfeng Li
Guest Editor
School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China
Interests: medical image analysis; object detection; person ReID; deep learning; domain generalization

Special Issue Information

Dear Colleagues,

Owing to their considerable advantages, deep-learning-based methods have become the mainstream technology in image analysis, making significant progress and holding vast prospects. They are applied in a wide variety of tasks, such as image classification, object segmentation, object detection, image registration, image fusion, biomedical engineering, and natural language processing. Among the different kinds of deep learning techniques, supervised learning was adopted first; later, unsupervised, semi-supervised, few-shot, one-shot, and zero-shot learning methods received extensive attention. In terms of network architecture, convolutional neural networks, recurrent neural networks, generative networks, attention mechanisms, and transformers have been designed and widely applied.

Which advancements will eventually prove the most productive and innovative in this field?

We invite contributions presenting techniques (methods, tools, ideas, or even market evaluations) that will shape the future roadmap of deep learning, as well as concepts for significantly innovative objectives in image analysis. This Special Issue covers novel supervised, unsupervised, semi-supervised, few-shot, and related deep learning methods for image analysis, including classification, segmentation, detection, registration, domain generalization, and multi-modality fusion. Scientifically grounded innovative and speculative lines of research are welcome for proposal and evaluation.

Prof. Dr. Yanfeng Li
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website; once registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Journal of Imaging is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • image classification
  • image segmentation
  • object detection
  • image registration
  • image fusion
  • multi-modality
  • domain generalization
  • unsupervised learning
  • semi-supervised learning
  • few-shot learning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (4 papers)

Research

22 pages, 18461 KiB  
Article
Learning More May Not Be Better: Knowledge Transferability in Vision-and-Language Tasks
by Tianwei Chen, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima and Hajime Nagahara
J. Imaging 2024, 10(12), 300; https://doi.org/10.3390/jimaging10120300 - 22 Nov 2024
Abstract
Is learning more knowledge always better for vision-and-language models? In this paper, we study knowledge transferability in multi-modal tasks. The current tendency in machine learning is to assume that by joining multiple datasets from different tasks, their overall performance improves. However, we show that not all knowledge transfers well or has a positive impact on related tasks, even when they share a common goal. We conducted an exhaustive analysis based on hundreds of cross-experiments on twelve vision-and-language tasks categorized into four groups. While tasks in the same group are prone to improve each other, results show that this is not always the case. In addition, other factors, such as dataset size or the pre-training stage, may have a great impact on how well the knowledge is transferred.
(This article belongs to the Special Issue Deep Learning in Image Analysis: Progress and Challenges)
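
The pooled-training setting that the study interrogates can be pictured with a minimal PyTorch sketch (not the authors' code): fit one model on a single task and another on the concatenation of two task datasets, then compare behaviour on the original task. The toy tensors, model size, and hyperparameters below are illustrative assumptions, not details from the paper.

```python
# Minimal sketch (not the authors' code): single-task training vs. joint training
# on a pooled dataset, the setting whose benefit the paper questions.
import torch
from torch import nn
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# Two toy "tasks" sharing the same input space (stand-ins for V&L datasets).
task_a = TensorDataset(torch.randn(64, 16), torch.randint(0, 2, (64,)))
task_b = TensorDataset(torch.randn(64, 16), torch.randint(0, 2, (64,)))

def train(dataset, epochs=3):
    """Train a small classifier on the given dataset; details are illustrative only."""
    model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in DataLoader(dataset, batch_size=16, shuffle=True):
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model

single_task_model = train(task_a)                      # task A only
joint_model = train(ConcatDataset([task_a, task_b]))   # tasks A + B pooled
# The paper's cross-experiments compare this kind of pairing at scale, across
# twelve vision-and-language tasks, to see when transfer helps and when it hurts.
```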

20 pages, 12767 KiB  
Article
A Real-Time End-to-End Framework with a Stacked Model Using Ultrasound Video for Cardiac Septal Defect Decision-Making
by Siti Nurmani, Ria Nova, Ade Iriani Sapitri, Muhammad Naufal Rachmatullah, Bambang Tutuko, Firdaus Firdaus, Annisa Darmawahyuni, Anggun Islami, Satria Mandala, Radiyati Umi Partan, Akhiar Wista Arum and Rio Bastian
J. Imaging 2024, 10(11), 280; https://doi.org/10.3390/jimaging10110280 - 3 Nov 2024
Abstract
Echocardiography is the gold standard for the comprehensive diagnosis of cardiac septal defects (CSDs). Currently, echocardiography diagnosis is primarily based on expert observation, which is laborious and time-consuming. With digitization, deep learning (DL) can be used to improve the efficiency of the diagnosis. This study presents a real-time end-to-end framework tailored for pediatric ultrasound video analysis for CSD decision-making. The framework employs an advanced real-time architecture based on You Only Look Once (Yolo) techniques for CSD decision-making with high accuracy. Leveraging the state of the art with the Yolov8l (large) architecture, the proposed model achieves a robust performance in real-time processes. It can be observed that the experiment yielded a mean average precision (mAP) exceeding 89%, indicating the framework’s effectiveness in accurately diagnosing CSDs from ultrasound (US) videos. The Yolov8l model exhibits precise performance in the real-time testing of pediatric patients from Mohammad Hoesin General Hospital in Palembang, Indonesia. Based on the results of the proposed model using 222 US videos, it exhibits 95.86% accuracy, 96.82% sensitivity, and 98.74% specificity. During real-time testing in the hospital, the model exhibits a 97.17% accuracy, 95.80% sensitivity, and 98.15% specificity; only 3 out of the 53 US videos in the real-time process were diagnosed incorrectly. This comprehensive approach holds promise for enhancing clinical decision-making and improving patient outcomes in pediatric cardiology.
(This article belongs to the Special Issue Deep Learning in Image Analysis: Progress and Challenges)
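
The video-level figures quoted in the abstract (accuracy, sensitivity, specificity) follow from a binary confusion matrix over defect/normal decisions. A minimal sketch of that bookkeeping is shown below; the counts are hypothetical placeholders, not the study's data, and the function is not part of the authors' pipeline.

```python
# Minimal sketch (not the authors' pipeline): video-level accuracy, sensitivity,
# and specificity from a binary confusion matrix (defect = positive class).
def screening_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Return accuracy, sensitivity (recall on defect videos), and specificity."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,   # correctly decided videos / all videos
        "sensitivity": tp / (tp + fn),   # defect videos flagged as defect
        "specificity": tn / (tn + fp),   # normal videos flagged as normal
    }

# Hypothetical counts for illustration only (not the 222-video test set):
print(screening_metrics(tp=100, tn=110, fp=2, fn=10))
```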

22 pages, 4136 KiB  
Article
DepthCrackNet: A Deep Learning Model for Automatic Pavement Crack Detection
by Alireza Saberironaghi and Jing Ren
J. Imaging 2024, 10(5), 100; https://doi.org/10.3390/jimaging10050100 - 26 Apr 2024
Cited by 1
Abstract
Detecting cracks in the pavement is a vital component of ensuring road safety. Since manual identification of these cracks can be time-consuming, an automated method is needed to speed up this process. However, creating such a system is challenging due to factors including crack variability, variations in pavement materials, and the occurrence of miscellaneous objects and anomalies on the pavement. Motivated by the latest progress in deep learning applied to computer vision, we propose an effective U-Net-shaped model named DepthCrackNet. Our model employs the Double Convolution Encoder (DCE), composed of a sequence of convolution layers, for robust feature extraction while keeping parameters optimally efficient. We have incorporated the TriInput Multi-Head Spatial Attention (TMSA) module into our model; in this module, each head operates independently, capturing various spatial relationships and boosting the extraction of rich contextual information. Furthermore, DepthCrackNet employs the Spatial Depth Enhancer (SDE) module, specifically designed to augment the feature extraction capabilities of our segmentation model. The performance of the DepthCrackNet was evaluated on two public crack datasets: Crack500 and DeepCrack. In our experimental studies, the network achieved mIoU scores of 77.0% and 83.9% with the Crack500 and DeepCrack datasets, respectively.
(This article belongs to the Special Issue Deep Learning in Image Analysis: Progress and Challenges)
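
The Double Convolution Encoder described in the abstract is, at its core, a stack of paired convolution layers in the familiar U-Net style. The block below is a generic double-convolution sketch under that assumption; kernel sizes, normalization, and channel widths are guesses, not DepthCrackNet's published DCE, and the TMSA and SDE modules are not reproduced here.

```python
# Generic U-Net-style double-convolution block; an assumption-based sketch,
# not the published DCE/TMSA/SDE implementation.
import torch
from torch import nn

class DoubleConvBlock(nn.Module):
    """Two 3x3 conv + batch-norm + ReLU stages, the usual U-Net encoder unit."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x)

# Example: one single-channel 256x256 crack image tile.
features = DoubleConvBlock(in_ch=1, out_ch=32)(torch.randn(1, 1, 256, 256))
print(features.shape)  # torch.Size([1, 32, 256, 256])
```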

Review

15 pages, 627 KiB  
Review
Real-Time Emotion Recognition for Improving the Teaching–Learning Process: A Scoping Review
by Cèlia Llurba and Ramon Palau
J. Imaging 2024, 10(12), 313; https://doi.org/10.3390/jimaging10120313 - 9 Dec 2024
Abstract
Emotion recognition (ER) is gaining popularity in various fields, including education. The benefits of ER in the classroom for educational purposes, such as improving students’ academic performance, are gradually becoming known. Thus, real-time ER is proving to be a valuable tool for teachers as well as for students. However, its feasibility in educational settings requires further exploration. This review offers learning experiences based on real-time ER with students to explore their potential in learning and in improving their academic achievement. The purpose is to present evidence of good implementation and suggestions for their successful application. The content analysis finds that most of the practices lead to significant improvements in terms of educational purposes. Nevertheless, the analysis identifies problems that might block the implementation of these practices in the classroom and in education; among the obstacles identified are the absence of privacy of the students and the support needs of the students. We conclude that artificial intelligence (AI) and ER are potential tools to approach the needs in ordinary classrooms, although reliable automatic recognition is still a challenge for researchers to achieve the best ER feature in real time, given the high input data variability.
(This article belongs to the Special Issue Deep Learning in Image Analysis: Progress and Challenges)
