
Applications in Computer Vision and Image Processing


A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 30 October 2025 | Viewed by 9022


Special Issue Editors

Dr. Cai Meng

Guest Editor

Image Processing Center, Beijing University of Aeronautics and Astronautics, Beijing 100191, China
Interests: image guidance; surgical robotics; computer vision

Dr. Hongsheng He

Guest Editor

Department of Computer Science, The University of Alabama, Tuscaloosa, AL, USA
Interests: autonomy; intelligent robotics; machine learning

Special Issue Information

Dear Colleagues,

In recent years, the fields of computer vision and image processing have undergone transformative developments, fueled by advancements in machine learning, deep learning architectures, and computational power. This Special Issue, titled "Applications in Computer Vision and Image Processing", aims to explore the dynamic landscape of these fields, showcasing breakthrough research and innovative applications. We invite contributions that highlight novel methodologies, algorithms, and real-world applications that push the boundaries of what is achievable in analyzing, understanding, and visualizing image data. Papers focusing on areas such as object detection and recognition, image enhancement, 3D reconstruction, medical image analysis, and augmented reality are particularly welcome. This Special Issue endeavors to serve as a repository of knowledge that bridges the gap between theoretical research and practical implementations, making significant impacts across various industries, including medical surgery, automotive, robotics, and more.

We wish this edition to be an exciting collaboration of ideas from both academia and industry, fostering innovation and highlighting the latest advancements in the field. Whether you are a researcher, practitioner, or enthusiast, this Special Issue will provide a comprehensive insight into the ever-evolving world of computer vision and image processing. We look forward to your valuable contributions and are excited to share this knowledge with our readers worldwide. Thank you for being a part of this journey towards advancing technological frontiers.

Dr. Cai Meng
Dr. Hongsheng He
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

big data and computer vision
deep learning in computer vision
computer vision in robotics
computer vision in industry
computer vision in auto driving
visual SLAM
computer vision in remote sensing
target detection, recognition, and tracking
text detection and recognition
3D reconstruction
human–computer interaction
pattern recognition and analysis
image segmentation
image restoration or image inpainting
other topics related to applications of computer vision and image processing

Benefits of Publishing in a Special Issue

Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (6 papers)


Research

29 pages, 5173 KiB

Open AccessArticle

A Quantitative Evaluation of UAV Flight Parameters for SfM-Based 3D Reconstruction of Buildings

by Inho Jo, Yunku Lee, Namhyuk Ham, Juhyung Kim and Jae-Jun Kim

Appl. Sci. 2025, 15(13), 7196; https://doi.org/10.3390/app15137196 - 26 Jun 2025

Viewed by 320

Abstract

This study addresses the critical lack of standardized guidelines for unmanned aerial vehicle (UAV) image acquisition strategies utilizing structure-from-motion (SfM), focusing on 3D building exterior modeling. A comprehensive experimental analysis was conducted to systematically investigate and quantitatively evaluate the effects of various shooting patterns and parameters on SfM reconstruction quality and processing efficiency. This study implemented a systematic experimental framework to test various UAV flight patterns, including circular, surface, and aerial configurations. Under controlled environmental conditions on representative building structures, key variables were manipulated, and all collected data were processed through a consistent SfM pipeline based on the SIFT algorithm. Quantitative evaluation using several analytical methods (multiple regression analysis, the Kruskal–Wallis test, random forest feature importance, principal component analysis with K-means clustering, response surface methodology (RSM), the technique for order of preference by similarity to ideal solution (TOPSIS), and Pareto optimization) revealed that the basic shooting pattern ‘type’ has a statistically significant influence on all major SfM performance metrics (reprojection error, final point count, computation time, reconstruction completeness; Kruskal–Wallis p < 0.001). Additionally, within the patterns, clear parameter sensitivity and complex nonlinear relationships were identified (e.g., overlap variables play a decisive role in determining the point count and completeness of surface patterns, with an adjusted R² ≈ 0.70; for circular patterns, reprojection error and point count are strongly influenced by the interaction between radius and tilt angle, with an adjusted R² ≈ 0.80).
Furthermore, composite pattern analysis using TOPSIS identified combinations that balanced multiple criteria well, and Pareto optimization explicitly quantified the inherent trade-offs between conflicting objectives (e.g., time vs. accuracy, number of points vs. completeness). In conclusion, this study demonstrates that hierarchical strategic approaches are essential for optimizing UAV-SfM data collection. Additionally, it provides important empirical data, a validated methodological framework, and specific quantitative guidelines for standardizing UAV data collection workflows, thereby improving on existing empirical or case-specific approaches.

(This article belongs to the Special Issue Applications in Computer Vision and Image Processing)
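
As a rough illustration of the kind of trade-off analysis the abstract above mentions (e.g., time vs. accuracy), the following minimal sketch extracts a Pareto front from a set of candidate configurations. It is not the authors' implementation: the function name `pareto_front`, the assumption that both objectives are minimized, and the sample values are all hypothetical.

```python
from typing import List, Sequence


def pareto_front(points: Sequence[Sequence[float]]) -> List[int]:
    """Return indices of non-dominated points, assuming every objective is minimized."""
    front = []
    for i, p in enumerate(points):
        # q dominates p if q is no worse in every objective and strictly better in one.
        dominated = any(
            all(qk <= pk for qk, pk in zip(q, p))
            and any(qk < pk for qk, pk in zip(q, p))
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Hypothetical (computation time in s, reprojection error in px) per flight configuration.
configs = [(120.0, 0.9), (300.0, 0.5), (310.0, 0.7), (90.0, 1.4)]
front_indices = pareto_front(configs)
print(front_indices)
```

Here the third configuration is dropped because the second is both faster and more accurate; the remaining configurations each represent a different compromise between the two objectives.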


20 pages, 8769 KiB

Open AccessArticle

Mamba-DQN: Adaptively Tunes Visual SLAM Parameters Based on Historical Observation DQN

by Xubo Ma, Chuhua Huang, Xin Huang and Wangping Wu

Appl. Sci. 2025, 15(6), 2950; https://doi.org/10.3390/app15062950 - 9 Mar 2025

Viewed by 1093

Abstract

The parameter configuration of traditional visual SLAM algorithms usually relies on expert experience and extensive experiments, and the parameters need to be reset as the scene changes, which is a complex and tedious process. To achieve parameter adaptation in visual SLAM, we propose the Mamba-DQN method, which transforms the complex parameter adjustment task into a policy learning task for the agent. In this paper, we select the key parameters of visual SLAM to construct the agent action space. The reward function is constructed based on the absolute trajectory error (ATE), and the Mamba history observer is built within the agent to learn the observation trajectory, aiming to improve the quality of the agent’s decisions. Finally, the proposed method was evaluated on the EuRoC MAV and TUM-VI datasets. The experimental results show that Mamba-DQN not only enhances the positioning accuracy of visual SLAM and demonstrates good real-time performance but also avoids the tedious parameter adjustment process.

(This article belongs to the Special Issue Applications in Computer Vision and Image Processing)
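
To make the idea of an ATE-based reward concrete, here is a minimal sketch. The abstract above does not specify the reward's exact form, so the `reward` function below (improvement in ATE between two steps) is purely an assumption, as are the function names. The sketch also assumes the estimated and ground-truth positions are already time-associated and aligned to a common frame, a step that ATE evaluation normally requires first.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def ate_rmse(estimated: Sequence[Point], ground_truth: Sequence[Point]) -> float:
    """Root-mean-square translational error between pre-aligned trajectories."""
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def reward(previous_ate: float, current_ate: float) -> float:
    """Hypothetical reward: positive when the chosen parameters reduced the ATE."""
    return previous_ate - current_ate


# Hypothetical two-pose trajectories (metres).
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
est = [(0.0, 3.0, 4.0), (1.0, 3.0, 4.0)]
print(ate_rmse(est, gt), reward(previous_ate=5.0, current_ate=3.0))
```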


18 pages, 36707 KiB

Open AccessArticle

High-Precision Image Editing via Dual Attention Control in Diffusion Models Without Fine-Tuning

by Zhiqiang Pan, Yingchun Kuang, Jianmei Lan and Lizhuo Zhang

Appl. Sci. 2025, 15(3), 1079; https://doi.org/10.3390/app15031079 - 22 Jan 2025

Viewed by 2443

Abstract

Existing diffusion models outperform other generative models, such as Generative Adversarial Networks, in image synthesis and editing. However, they struggle to perform high-precision image editing while preserving image details and following editing instructions accurately. To address these challenges, we propose a dual attention control method to achieve high-precision image editing. Our approach includes two key attention control modules: (1) a cross-attention control module, which combines the cross-attention maps of the original and edited images through weighted parameters to ensure that the synthesized edited image retains the structure of the input image; and (2) a self-attention control module, which varies with the editing task and is applied at “coarse” and “fine” layers, since the coarse layers help maintain input image details and the fine layers are better suited for style transformations. Experimental evaluations have demonstrated that our approach achieves excellent results in detail preservation, content consistency, visual realism, and semantic understanding, making it especially suitable for tasks requiring high-precision editing. Specifically, compared to the editing outcomes under no control conditions, the introduction of dual visual attention control has led to an increase of 6.19% in CLIP scores, a reduction of 29.3% in LPIPS, and a decrease of 24.7% in FID. These significant improvements not only validate the effectiveness of the dual attention control but also attest to the method’s substantial flexibility and adaptability across different scenarios. Notably, our approach is a zero-shot solution, requiring no user optimization or fine-tuning, facilitating real-world applications.

(This article belongs to the Special Issue Applications in Computer Vision and Image Processing)
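
The abstract above describes combining the cross-attention maps of the original and edited images through weighted parameters. One simple way to read this is as a per-entry weighted average, sketched below. This is an illustrative assumption, not the paper's actual formulation: the function name `blend_attention`, the single scalar weight `w`, and the tiny example maps are all hypothetical.

```python
from typing import List

AttentionMap = List[List[float]]


def blend_attention(source: AttentionMap, edited: AttentionMap, w: float) -> AttentionMap:
    """Convex combination w * source + (1 - w) * edited of two same-shaped maps."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    if len(source) != len(edited) or any(
        len(s_row) != len(e_row) for s_row, e_row in zip(source, edited)
    ):
        raise ValueError("attention maps must have the same shape")
    return [
        [w * s + (1.0 - w) * e for s, e in zip(s_row, e_row)]
        for s_row, e_row in zip(source, edited)
    ]


# Hypothetical maps: one image location attending over two prompt tokens.
source_map = [[1.0, 0.0]]
edited_map = [[0.0, 1.0]]
blended = blend_attention(source_map, edited_map, w=0.25)
print(blended)
```

Because the result is a convex combination, rows that sum to one in both inputs still sum to one afterwards, so the blended map remains a valid attention distribution. A larger `w` keeps more of the source image's attention and hence more of its layout.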


17 pages, 6879 KiB

Open AccessArticle

Machine Learning Models for Artist Classification of Cultural Heritage Sketches

by Gianina Chirosca, Roxana Rădvan, Silviu Mușat, Matei Pop and Alecsandru Chirosca

Appl. Sci. 2025, 15(1), 212; https://doi.org/10.3390/app15010212 - 30 Dec 2024

Cited by 1 | Viewed by 1325

Abstract

Modern computer vision algorithms allow researchers and art historians to extract artist-characteristic contours from sketches, thus providing accurate input for artwork analysis, for possible assignments and classifications, and for the identification of specific stylistic features. We approach this challenging task with three machine learning algorithms and evaluate their performance on a small collection of images from five distinct artists. These algorithms aim to find the most appropriate artist for a sketch (or a contour of a sketch), with promising results at a high level of confidence (around 92%). Models start from common Faster R-CNN architectures, reinforcement learning, and vector extraction tools. The proposed approach provides a base for future improvements toward a tool that aids artwork evaluators.

(This article belongs to the Special Issue Applications in Computer Vision and Image Processing)


23 pages, 106560 KiB

Open AccessArticle

RLUNet: Overexposure-Content-Recovery-Based Single HDR Image Reconstruction with the Imaging Pipeline Principle

by Yiru Zheng, Wei Wang, Xiao Wang and Xin Yuan

Appl. Sci. 2024, 14(23), 11289; https://doi.org/10.3390/app142311289 - 3 Dec 2024

Viewed by 1667

Abstract

With the popularity of High Dynamic Range (HDR) display technology, consumer demand for HDR images is increasing. Since HDR cameras are expensive, reconstructing HDR images from traditional Low Dynamic Range (LDR) images is crucial. However, existing HDR image reconstruction algorithms often fail to recover fine details and do not adequately address the fundamental principles of the LDR imaging pipeline. To overcome these limitations, the Reversing Lossy UNet (RLUNet) is proposed, aiming to effectively balance dynamic range expansion and recover overexposed areas through a deeper understanding of LDR image pipeline principles. The RLUNet model comprises the Reverse Lossy Network, which is designed according to the LDR–HDR framework and focuses on reconstructing HDR images by recovering overexposed regions, dequantizing, linearizing the mapping, and suppressing compression artifacts. This framework, grounded in the principles of the LDR imaging pipeline, is designed to reverse the lossy operations applied during imaging. Furthermore, the integration of the Texture Filling Module (TFM) block with the Recovery of Overexposed Regions (ROR) module in the RLUNet model enhances the visual performance and detail texture of the overexposed areas in the reconstructed HDR image. The experiments demonstrate that the proposed RLUNet model outperforms various state-of-the-art methods on different test sets.

(This article belongs to the Special Issue Applications in Computer Vision and Image Processing)


19 pages, 1669 KiB

Open AccessArticle

FETrack: Feature-Enhanced Transformer Network for Visual Object Tracking

by Hang Liu, Detian Huang and Mingxin Lin

Appl. Sci. 2024, 14(22), 10589; https://doi.org/10.3390/app142210589 - 17 Nov 2024

Viewed by 1438

Abstract

Visual object tracking is a fundamental task in computer vision, with applications ranging from video surveillance to autonomous driving. Despite recent advances in transformer-based one-stream trackers, unrestricted feature interactions between the template and the search region often introduce background noise into the template, degrading the tracking performance. To address this issue, we propose FETrack, a feature-enhanced transformer-based network for visual object tracking. Specifically, we incorporate an independent template stream in the encoder of the one-stream tracker to acquire the high-quality template features while suppressing the harmful background noise effectively. Then, we employ a sequence-learning-based causal transformer in the decoder to generate the bounding box autoregressively, simplifying the prediction head network. Further, we present a dynamic threshold-based online template-updating strategy and a template-filtering approach to boost tracking robustness and reduce redundant computations. Extensive experiments demonstrate that our FETrack achieves a superior performance over state-of-the-art trackers. Specifically, the proposed FETrack achieves a 75.1% AO on GOT-10k, 81.2% AUC on LaSOT, and 89.3% $P_{norm}$ on TrackingNet.

(This article belongs to the Special Issue Applications in Computer Vision and Image Processing)
