Applications in Computer Vision and Image Processing

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (30 October 2025) | Viewed by 26925

Special Issue Editors

Image Processing Center, Beijing University of Aeronautics and Astronautics, Beijing 100191, China
Interests: image guidance; surgical robotics; computer vision

Guest Editor
Department of Computer Science, The University of Alabama, Tuscaloosa, AL, USA
Interests: autonomy; intelligent robotics; machine learning

Special Issue Information

Dear Colleagues,

In recent years, the fields of computer vision and image processing have undergone transformative developments, fueled by advancements in machine learning, deep learning architectures, and computational power. This Special Issue, titled "Applications in Computer Vision and Image Processing", aims to explore the dynamic landscape of these fields, showcasing breakthrough research and innovative applications. We invite contributions that highlight novel methodologies, algorithms, and real-world applications that push the boundaries of what is achievable in analyzing, understanding, and visualizing image data. Papers focusing on areas such as object detection and recognition, image enhancement, 3D reconstruction, medical image analysis, and augmented reality are particularly welcome. This Special Issue endeavors to serve as a repository of knowledge that bridges the gap between theoretical research and practical implementations, making significant impacts across various industries, including medical surgery, automotive, robotics, and more.

We wish this edition to be an exciting collaboration of ideas from both academia and industry, fostering innovation and highlighting the latest advancements in the field. Whether you are a researcher, practitioner, or enthusiast, this Special Issue will provide a comprehensive insight into the ever-evolving world of computer vision and image processing. We look forward to your valuable contributions and are excited to share this knowledge with our readers worldwide. Thank you for being a part of this journey towards advancing technological frontiers.

Dr. Cai Meng
Dr. Hongsheng He
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • big data and computer vision
  • deep learning in computer vision
  • computer vision in robotics
  • computer vision in industry
  • computer vision in autonomous driving
  • visual SLAM
  • computer vision in remote sensing
  • target detection, recognition, and tracking
  • text detection and recognition
  • 3D reconstruction
  • human–computer interaction
  • pattern recognition and analysis
  • image segmentation
  • image restoration or image inpainting
  • other topics related to applications of computer vision and image processing

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (8 papers)

Research

28 pages, 73215 KB  
Article
Linear-Region-Based Contour Tracking for Edge Images
by Erick Huitrón-Ramírez, Leonel G. Corona-Ramírez and Diego Jiménez-Badillo
Appl. Sci. 2026, 16(1), 509; https://doi.org/10.3390/app16010509 - 4 Jan 2026
Viewed by 1084
Abstract
This work presents the Linear-Region-Based Contour Tracking (LRCT) method for extracting external contours in images, designed to achieve an accurate and efficient description of shapes, particularly useful for archaeological materials with irregular geometries. The approach treats the contour as a discrete signal and analyzes image regions containing edge segments. From these regions, a local linear model is estimated to guide the selection and chaining of representative pixels, yielding a continuous perimeter trajectory. This strategy reduces the amount of data required to describe the contour without compromising shape fidelity. As a case study, the method was applied to images of replicas of archaeological materials exhibiting substantial variations in color and morphology. The results show that the obtained trajectories are comparable in quality to those obtained using classical pipelines based on Canny edge detection followed by Moore tracing, while providing more compact representations well suited for subsequent analyses. Consequently, the method offers an efficient and reproducible alternative for documentation, recording, and morphological comparison, strengthening data-driven approaches in archaeological research.
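The idea of estimating a local linear model from the edge pixels of one region can be illustrated with a minimal sketch. The abstract does not say how the line is estimated, so the total-least-squares fit below is an assumption, and the function name and the (x, y) input format are hypothetical.

```python
import math

def fit_local_line(pixels):
    """Fit a line to the edge pixels of one region by total least squares.

    pixels: iterable of (x, y) coordinates (hypothetical input format).
    Returns (centroid, angle); angle is the line direction in radians,
    within [-pi/2, pi/2]. This is an illustrative stand-in, not the
    estimator used in the paper.
    """
    pts = list(pixels)
    if len(pts) < 2:
        raise ValueError("need at least two pixels to fit a line")
    n = len(pts)
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    # Second-order central moments of the pixel cloud.
    sxx = sum((x - cx) ** 2 for x, _ in pts)
    syy = sum((y - cy) ** 2 for _, y in pts)
    sxy = sum((x - cx) * (y - cy) for x, y in pts)
    # Orientation of the principal axis of the 2x2 scatter matrix.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (cx, cy), angle

if __name__ == "__main__":
    centroid, angle = fit_local_line([(0, 0), (1, 1), (2, 2)])
    print(centroid, math.degrees(angle))
```

The centroid and direction give a compact two-parameter description of the region, which is the kind of data reduction the abstract refers to.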
(This article belongs to the Special Issue Applications in Computer Vision and Image Processing)

15 pages, 1730 KB  
Article
Research on Printed Circuit Board (PCB) Defect Detection Algorithm Based on Convolutional Neural Networks (CNN)
by Zhiduan Ni and Yeonhee Kim
Appl. Sci. 2025, 15(24), 13115; https://doi.org/10.3390/app152413115 - 12 Dec 2025
Viewed by 2427
Abstract
Printed Circuit Board (PCB) defect detection is critical for quality control in electronics manufacturing. Traditional manual inspection and classical Automated Optical Inspection (AOI) methods face challenges in speed, consistency, and flexibility. This paper proposes a CNN-based approach for automatic PCB defect detection using the YOLOv5 model. The method leverages a Convolutional Neural Network to identify various PCB defect types (e.g., open circuits, short circuits, and missing holes) from board images. In this study, a model was trained on a PCB image dataset with detailed annotations. Data augmentation techniques, such as sharpening and noise filtering, were applied to improve robustness. The experimental results showed that the proposed approach could locate and classify multiple defect types on PCBs, with overall detection precision and recall above 90% and 91%, respectively, enabling reliable automated inspection. A brief comparison with the latest YOLOv8 model is also presented, showing that the proposed CNN-based detector offers competitive performance. This study shows that deep learning-based defect detection can improve the PCB inspection efficiency and accuracy significantly, paving the way for intelligent manufacturing and quality assurance in PCB production. From a sensing perspective, we frame the system around an industrial RGB camera and controlled illumination, emphasizing how imaging-sensor choices and settings shape defect visibility and model robustness, and sketching future sensor-fusion directions.
(This article belongs to the Special Issue Applications in Computer Vision and Image Processing)

29 pages, 5173 KB  
Article
A Quantitative Evaluation of UAV Flight Parameters for SfM-Based 3D Reconstruction of Buildings
by Inho Jo, Yunku Lee, Namhyuk Ham, Juhyung Kim and Jae-Jun Kim
Appl. Sci. 2025, 15(13), 7196; https://doi.org/10.3390/app15137196 - 26 Jun 2025
Viewed by 1725
Abstract
This study aims to address the critical lack of standardized guidelines for unmanned aerial vehicle (UAV) image acquisition strategies utilizing structure-from-motion (SfM) by focusing on 3D building exterior modeling. A comprehensive experimental analysis was conducted to systematically investigate and quantitatively evaluate the effects of various shooting patterns and parameters on SfM reconstruction quality and processing efficiency. This study implemented a systematic experimental framework to test various UAV flight patterns, including circular, surface, and aerial configurations. Under controlled environmental conditions on representative building structures, key variables were manipulated, and all collected data were processed through a consistent SfM pipeline based on the SIFT algorithm. Quantitative evaluation results using various analytical methodologies (multiple regression analysis, Kruskal–Wallis test, random forest feature importance, principal component analysis including K-means clustering, response surface methodology (RSM), the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS), and Pareto optimization) revealed that the basic shooting pattern ‘type’ has a statistically significant influence on all major SfM performance metrics (reprojection error, final point count, computation time, reconstruction completeness; Kruskal–Wallis p < 0.001). Additionally, within the patterns, clear parameter sensitivity and complex nonlinear relationships were identified (e.g., overlap variables play a decisive role in determining the point count and completeness of surface patterns, with an adjusted R² ≈ 0.70; the results of circular patterns are strongly influenced by the interaction between radius and tilt angle on reprojection error and point count, with an adjusted R² ≈ 0.80).
Furthermore, composite pattern analysis using TOPSIS identified excellent combinations that balanced multiple criteria, and Pareto optimization explicitly quantified the inherent trade-offs between conflicting objectives (e.g., time vs. accuracy, number of points vs. completeness). In conclusion, this study clearly demonstrates that hierarchical strategic approaches are essential for optimizing UAV-SfM data collection. Additionally, it provides important empirical data, a validated methodological framework, and specific quantitative guidelines for standardizing UAV data collection workflows, thereby improving existing empirical or case-specific approaches.
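For readers unfamiliar with TOPSIS, the ranking step can be sketched as follows. The criteria, weights, and numbers are hypothetical and are not taken from the paper; vector normalization is one common choice and is assumed here.

```python
import math

def topsis(matrix, weights, benefit):
    """Score alternatives with TOPSIS (closeness to the ideal solution).

    matrix:  rows are alternatives, columns are criteria.
    weights: one weight per criterion.
    benefit: one bool per criterion, True if larger values are better.
    Returns one closeness score per alternative in [0, 1]; higher is better.
    """
    n_crit = len(weights)
    # Vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
         for row in matrix]
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)  # distance to the ideal point
        d_neg = math.dist(row, anti)   # distance to the anti-ideal point
        total = d_pos + d_neg
        scores.append(d_neg / total if total else 0.5)
    return scores

if __name__ == "__main__":
    # Hypothetical flight configurations scored on
    # (reprojection error, point count, computation time).
    configs = [[0.5, 1000, 10], [1.0, 500, 20], [0.7, 800, 15]]
    print(topsis(configs, weights=[0.4, 0.4, 0.2],
                 benefit=[False, True, False]))
```

Each score is the relative distance from the worst case, so a configuration that is best on every criterion scores 1 and one that is worst on every criterion scores 0.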
(This article belongs to the Special Issue Applications in Computer Vision and Image Processing)

20 pages, 8769 KB  
Article
Mamba-DQN: Adaptively Tunes Visual SLAM Parameters Based on Historical Observation DQN
by Xubo Ma, Chuhua Huang, Xin Huang and Wangping Wu
Appl. Sci. 2025, 15(6), 2950; https://doi.org/10.3390/app15062950 - 9 Mar 2025
Cited by 1 | Viewed by 3170
Abstract
The parameter configuration of traditional visual SLAM algorithms usually relies on expert experience and extensive experiments, and the parameter configuration needs to be reset as the scene changes, which is a complex and tedious process. To achieve parameter adaptation in visual SLAM, we propose the Mamba-DQN method, which transforms complex parameter adjustment tasks into policy learning assignments for the agent. In this paper, we select the key parameters of visual SLAM to construct the agent action space. The reward function is constructed based on the absolute trajectory error (ATE), and the Mamba history observer is built within the agent to learn the observation trajectory, aiming to improve the quality of the agent’s decisions. Finally, the proposed method was evaluated on the EuRoC MAV and TUM-VI datasets. The experimental results show that Mamba-DQN not only enhances the positioning accuracy of visual SLAM and demonstrates good real-time performance but also avoids the tedious parameter adjustment process.
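An ATE-based reward of the kind described above might look like the following sketch. The abstract does not give the reward formula, so the "error reduction" form is an assumption. The function names are hypothetical, and the trajectories are assumed to be already time-associated and aligned to the ground-truth frame.

```python
import math

def ate_rmse(estimated, ground_truth):
    """Root-mean-square absolute trajectory error.

    Both arguments are equal-length sequences of (x, y, z) positions that
    are assumed to be time-associated and already aligned.
    """
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and equal in length")
    squared = [math.dist(e, g) ** 2 for e, g in zip(estimated, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))

def reward_from_ate(prev_ate, curr_ate):
    """Hypothetical reward: positive when the chosen parameters reduced ATE."""
    return prev_ate - curr_ate

if __name__ == "__main__":
    truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    before = [(0.0, 0.2, 0.0), (1.0, 0.2, 0.0), (2.0, 0.2, 0.0)]
    after = [(0.0, 0.1, 0.0), (1.0, 0.1, 0.0), (2.0, 0.1, 0.0)]
    print(reward_from_ate(ate_rmse(before, truth), ate_rmse(after, truth)))
```

Rewarding the change in error rather than its absolute value is only one possible design; it gives the agent a positive signal whenever a parameter choice improves the trajectory estimate.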
(This article belongs to the Special Issue Applications in Computer Vision and Image Processing)

18 pages, 36707 KB  
Article
High-Precision Image Editing via Dual Attention Control in Diffusion Models Without Fine-Tuning
by Zhiqiang Pan, Yingchun Kuang, Jianmei Lan and Lizhuo Zhang
Appl. Sci. 2025, 15(3), 1079; https://doi.org/10.3390/app15031079 - 22 Jan 2025
Cited by 2 | Viewed by 7420
Abstract
Existing diffusion models outperform other generative models, such as Generative Adversarial Networks, in image synthesis and editing. However, they struggle with high-precision image editing while preserving image details and the accuracy of editing instructions. To address these challenges, we propose a dual attention control method to achieve high-precision image editing. Our approach includes two key attention control modules: (1) a cross-attention control module, which combines the cross-attention maps of the original and edited images through weighted parameters to ensure that the synthesized edited image retains the structure of the input image; and (2) a self-attention control module, which varies with the editing task and is applied at “coarse” and “fine” layers, since the coarse layers help maintain input image details and the fine layers are better suited for style transformations. Experimental evaluations have demonstrated that our approach achieves excellent results in detail preservation, content consistency, visual realism, and semantic understanding, making it especially suitable for tasks requiring high-precision editing. Specifically, compared to the editing outcomes under no control conditions, the introduction of dual visual attention control has led to an increase of 6.19% in CLIP scores, a reduction of 29.3% in LPIPS, and a decrease of 24.7% in FID. These significant improvements not only validate the effectiveness of the dual attention control but also attest to the method’s substantial flexibility and adaptability across different scenarios. Notably, our approach is a zero-shot solution, requiring no user optimization or fine-tuning, facilitating real-world applications.
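One simple way to combine two cross-attention maps through a weight is a convex blend, sketched below. The abstract does not specify the combination rule, so this form, the function name, and the toy maps are assumptions made for illustration.

```python
def blend_attention(orig_map, edit_map, weight):
    """Blend two attention maps element-wise.

    orig_map, edit_map: equal-shaped lists of rows; each row holds the
    attention weights of one image token over the prompt tokens.
    weight: share given to the original map, in [0, 1] (hypothetical knob).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(orig_map) != len(edit_map):
        raise ValueError("maps must have the same number of rows")
    blended = []
    for row_o, row_e in zip(orig_map, edit_map):
        if len(row_o) != len(row_e):
            raise ValueError("rows must have the same length")
        blended.append([weight * o + (1.0 - weight) * e
                        for o, e in zip(row_o, row_e)])
    return blended

if __name__ == "__main__":
    original = [[0.9, 0.1], [0.2, 0.8]]
    edited = [[0.5, 0.5], [0.6, 0.4]]
    print(blend_attention(original, edited, weight=0.7))
```

Because the blend is convex, rows that sum to 1 in both inputs still sum to 1 in the output, so the result remains a valid attention distribution. A larger weight keeps more of the original image's layout.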
(This article belongs to the Special Issue Applications in Computer Vision and Image Processing)

17 pages, 6879 KB  
Article
Machine Learning Models for Artist Classification of Cultural Heritage Sketches
by Gianina Chirosca, Roxana Rădvan, Silviu Mușat, Matei Pop and Alecsandru Chirosca
Appl. Sci. 2025, 15(1), 212; https://doi.org/10.3390/app15010212 - 30 Dec 2024
Cited by 5 | Viewed by 3494
Abstract
Modern computer vision algorithms allow researchers and art historians to extract artist-characteristic contours from sketches, thus providing accurate input for artwork analysis, for possible assignments and classifications, and also for the identification of specific stylistic features. We approach this challenging task with three machine learning algorithms and evaluate their performance on a small collection of images from five distinct artists. These algorithms aim to find the most appropriate artist for a sketch (or a contour of a sketch), with promising results at a high level of confidence (around 92%). The models build on common Faster R-CNN architectures, reinforcement learning, and vector extraction tools. The proposed approach provides a basis for future improvements toward a tool that aids artwork evaluators.
(This article belongs to the Special Issue Applications in Computer Vision and Image Processing)

23 pages, 106560 KB  
Article
RLUNet: Overexposure-Content-Recovery-Based Single HDR Image Reconstruction with the Imaging Pipeline Principle
by Yiru Zheng, Wei Wang, Xiao Wang and Xin Yuan
Appl. Sci. 2024, 14(23), 11289; https://doi.org/10.3390/app142311289 - 3 Dec 2024
Cited by 1 | Viewed by 3121
Abstract
With the popularity of High Dynamic Range (HDR) display technology, consumer demand for HDR images is increasing. Since HDR cameras are expensive, reconstructing HDR images from traditional Low Dynamic Range (LDR) images is crucial. However, existing HDR image reconstruction algorithms often fail to recover fine details and do not adequately address the fundamental principles of the LDR imaging pipeline. To overcome these limitations, the Reversing Lossy UNet (RLUNet) is proposed, aiming to effectively balance dynamic range expansion and the recovery of overexposed areas through a deeper understanding of LDR image pipeline principles. The RLUNet model comprises the Reverse Lossy Network, which is designed according to the LDR–HDR framework and focuses on reconstructing HDR images by recovering overexposed regions, dequantizing, linearizing the mapping, and suppressing compression artifacts. This framework, grounded in the principles of the LDR imaging pipeline, is designed to reverse the lossy operations of that pipeline. Furthermore, the integration of the Texture Filling Module (TFM) block with the Recovery of Overexposed Regions (ROR) module in the RLUNet model enhances the visual performance and detail texture of the overexposed areas in the reconstructed HDR image. The experiments demonstrate that the proposed RLUNet model outperforms various state-of-the-art methods on different test sets.
(This article belongs to the Special Issue Applications in Computer Vision and Image Processing)

19 pages, 1669 KB  
Article
FETrack: Feature-Enhanced Transformer Network for Visual Object Tracking
by Hang Liu, Detian Huang and Mingxin Lin
Appl. Sci. 2024, 14(22), 10589; https://doi.org/10.3390/app142210589 - 17 Nov 2024
Cited by 1 | Viewed by 3087
Abstract
Visual object tracking is a fundamental task in computer vision, with applications ranging from video surveillance to autonomous driving. Despite recent advances in transformer-based one-stream trackers, unrestricted feature interactions between the template and the search region often introduce background noise into the template, degrading the tracking performance. To address this issue, we propose FETrack, a feature-enhanced transformer-based network for visual object tracking. Specifically, we incorporate an independent template stream in the encoder of the one-stream tracker to acquire high-quality template features while effectively suppressing harmful background noise. Then, we employ a sequence-learning-based causal transformer in the decoder to generate the bounding box autoregressively, simplifying the prediction head network. Further, we present a dynamic threshold-based online template-updating strategy and a template-filtering approach to boost tracking robustness and reduce redundant computations. Extensive experiments demonstrate that FETrack achieves superior performance over state-of-the-art trackers. Specifically, the proposed FETrack achieves a 75.1% AO on GOT-10k, 81.2% AUC on LaSOT, and 89.3% Pnorm on TrackingNet.
(This article belongs to the Special Issue Applications in Computer Vision and Image Processing)