Deep Learning Technologies and Their Applications in Image Processing, Computer Vision, and Computational Intelligence

A special issue of AI (ISSN 2673-2688).

Deadline for manuscript submissions: 15 May 2026

Special Issue Editors

College of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China
Interests: image/video restoration; image/video coding; machine learning; image segmentation

Special Issue Information

Dear Colleagues,

Deep learning has emerged as a pivotal technology across diverse domains, including image processing, computer vision, natural language processing, speech recognition, and beyond. With rapid advancements in artificial intelligence, deep learning, and high-performance computing, image, vision, and computing technologies have been widely implemented in autonomous driving, medical imaging, smart cities, augmented reality, and other cutting-edge fields.

These technological breakthroughs and expanded applications not only offer novel tools and methodologies for scientific research but also enhance industrial innovation. In the era of intelligence and digitization, deep learning technologies and their applications in image, vision, and computing are accelerating societal progress while providing critical support for future talent cultivation and disciplinary development.

This Special Issue will showcase the latest advances in deep learning, encompassing fundamental technologies and interdisciplinary applications in image processing, computer vision, intelligent computing, and related domains. We invite original research papers and comprehensive literature reviews addressing the aforementioned topics. In particular, extended versions of papers accepted at ICIVC 2025 (https://icivc.org/) and ICDLT 2025 (https://www.icdlt.org/) are highly encouraged.

You may choose our Joint Special Issue in Sensors.

Dr. Honggang Chen
Dr. Chao Ren
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, you can proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. AI is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning models and algorithms
  • machine learning theory and technology
  • image processing theory and applications
  • computer graphics and computational photography
  • computer vision techniques and applications
  • multimedia technology

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (2 papers)


Research

21 pages, 899 KB  
Article
Gated Fusion Networks for Multi-Modal Violence Detection
by Bilal Ahmad, Mustaqeem Khan and Muhammad Sajjad
AI 2025, 6(10), 259; https://doi.org/10.3390/ai6100259 - 3 Oct 2025
Abstract
Public safety and security require an effective monitoring system to detect violence through visual, audio, and motion data. However, current methods often fail to utilize the complementary benefits of visual and auditory modalities, thereby reducing their overall effectiveness. To enhance violence detection, we present a novel multimodal method in this paper that detects motion, audio, and visual information from the input to recognize violence. We designed a framework comprising two specialized components: a gated fusion module and a multi-scale transformer, which enables the efficient detection of violence in multimodal data. To ensure a seamless and effective integration of features, a gated fusion module dynamically adjusts the contribution of each modality. At the same time, a multi-modal transformer utilizes multiple instance learning (MIL) to identify violent behaviors more accurately from input data by capturing complex temporal correlations. Our model fully integrates multi-modal information using these techniques, improving the accuracy of violence detection. In this study, we found that our approach outperformed state-of-the-art methods with an accuracy of 86.85% using the XD-Violence dataset, thereby demonstrating the potential of multi-modal fusion in detecting violence.
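The core idea of gated fusion described in the abstract above — dynamically weighting each modality's contribution before combining them — can be sketched in a few lines. The following is an illustrative toy in plain Python, not the authors' implementation: the per-dimension scalar gate, the fixed weights, and the restriction to two modalities are all simplifying assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_fusion(audio_feat, visual_feat, w_a, w_v, b):
    """Fuse two feature vectors with a per-dimension gate.

    For each dimension i, a gate g_i in (0, 1) decides how much of the
    audio feature (versus the visual feature) passes through:
        g_i     = sigmoid(w_a * a_i + w_v * v_i + b)
        fused_i = g_i * a_i + (1 - g_i) * v_i
    In a trained network, w_a, w_v, and b would be learned parameters.
    """
    fused = []
    for a, v in zip(audio_feat, visual_feat):
        g = sigmoid(w_a * a + w_v * v + b)
        fused.append(g * a + (1.0 - g) * v)
    return fused

# Toy example: a strongly positive gate input leans the fused vector
# toward the audio features; a zero gate input mixes them equally.
audio = [1.0, 0.5, -0.2]
visual = [0.0, 1.0, 0.8]
print(gated_fusion(audio, visual, w_a=2.0, w_v=0.0, b=0.0))
```

Because the gate is computed from the inputs themselves, the mixing ratio adapts sample by sample — the property that lets such a module suppress an uninformative modality (e.g., silent audio) without discarding it entirely.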

15 pages, 4635 KB  
Article
GLNet-YOLO: Multimodal Feature Fusion for Pedestrian Detection
by Yi Zhang, Qing Zhao, Xurui Xie, Yang Shen, Jinhe Ran, Shu Gui, Haiyan Zhang, Xiuhe Li and Zhen Zhang
AI 2025, 6(9), 229; https://doi.org/10.3390/ai6090229 - 12 Sep 2025
Abstract
In the field of modern computer vision, pedestrian detection technology holds significant importance in applications such as intelligent surveillance, autonomous driving, and robot navigation. However, single-modal images struggle to achieve high-precision detection in complex environments. To address this, this study proposes a GLNet-YOLO framework based on cross-modal deep feature fusion, aiming to improve pedestrian detection performance in complex environments by fusing feature information from visible light and infrared images. By extending the YOLOv11 architecture, the framework adopts a dual-branch network structure to process visible light and infrared modal inputs, respectively, and introduces the FM module to realize global feature fusion and enhancement, as well as the DMR module to accomplish local feature separation and interaction. Experimental results show that on the LLVIP dataset, compared to the single-modal YOLOv11 baseline, our fused model improves the mAP@50 by 9.2% over the visible-light-only model and 0.7% over the infrared-only model. This significantly improves the detection accuracy under low-light and complex background conditions and enhances the robustness of the algorithm, and its effectiveness is further verified on the KAIST dataset.
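The dual-branch pattern in the abstract above — a separate backbone per modality whose features are then merged — can be illustrated schematically. This plain-Python toy is not the GLNet-YOLO FM/DMR design: the ReLU-style branch stand-ins and the simple averaging fusion are placeholder assumptions that only show the overall data flow.

```python
def branch(pixels, weight):
    """Stand-in for a per-modality backbone: scale then clip at zero,
    mimicking a convolution followed by a ReLU activation."""
    return [max(0.0, weight * p) for p in pixels]

def fuse(rgb_feat, ir_feat):
    """Element-wise averaging of the two modality streams — a minimal
    placeholder for a learned global fusion module."""
    return [0.5 * (r, t)[0] + 0.5 * (r, t)[1] for r, t in zip(rgb_feat, ir_feat)]

# Each modality is processed independently, then fused once: where the
# visible-light features vanish (e.g., low light), the infrared branch
# still contributes signal to the fused representation.
rgb = branch([0.2, -0.4, 0.9], weight=1.5)   # visible-light branch
ir = branch([0.8, 0.1, -0.3], weight=2.0)    # infrared branch
fused = fuse(rgb, ir)
print(fused)
```

The practical motivation matches the abstract's LLVIP result: when one modality degrades (visible light at night, infrared against warm backgrounds), the fused features remain informative because the other branch is computed independently before fusion.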
