Machine Learning Applications in Image Processing and Computer Vision

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "E1: Mathematics and Computer Science".

Deadline for manuscript submissions: 20 June 2026 | Viewed by 5917

Special Issue Editor


Dr. Manuel G. Forero
Guest Editor
Facultad de Ingeniería, Universidad de Ibagué, Ibagué 730002, Colombia
Interests: image processing; machine learning; pattern recognition; microscopy and biomedical image analysis

Special Issue Information

Dear Colleagues,

We are pleased to invite you to submit papers to a Special Issue dedicated to advancing the synergy between mathematical modeling, analytical methods, and machine learning techniques in image processing and computer vision. This Special Issue aims to showcase innovative research that addresses pressing challenges in various domains, such as medical diagnostics, biological imaging, remote sensing, and satellite image analysis, among others.

We seek contributions that propose novel solutions to improve the robustness, accuracy, interpretability, and explainability of algorithms within these fields. Submissions may include, but are not limited to, work that combines mathematical modeling with machine learning to enhance image analysis workflows, develop better noise reduction or feature extraction techniques, or create new approaches for image segmentation, classification, and anomaly detection.

Furthermore, we are especially interested in research that pushes the boundaries of current knowledge, particularly in emerging areas such as reinforcement learning, synthetic image generation, domain adaptation, and transfer learning. We welcome studies that propose frameworks to tackle domain-specific image challenges or that demonstrate the potential of interdisciplinary approaches to overcome limitations in traditional methods.

I hope this Special Issue will inspire further exploration and collaboration, contributing to advancements that have real-world impacts.

Dr. Manuel G. Forero
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • mathematical modeling
  • machine learning
  • image processing
  • computer vision
  • interpretability
  • explainable AI
  • medical imaging
  • synthetic image generation
  • feature extraction
  • anomaly detection

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (7 papers)


Research

24 pages, 2777 KB  
Article
LightSeek-YOLO: A Lightweight Architecture for Real-Time Trapped Victim Detection in Disaster Scenarios
by Xiaowen Tian, Yubi Zheng, Liangqing Huang, Rengui Bi, Yu Chen, Shiqi Wang and Wenkang Su
Mathematics 2025, 13(19), 3231; https://doi.org/10.3390/math13193231 - 9 Oct 2025
Viewed by 424
Abstract
Rapid and accurate detection of trapped victims is vital in disaster rescue operations, yet most existing object detection methods cannot simultaneously deliver high accuracy and fast inference under resource-constrained conditions. To address this limitation, we propose LightSeek-YOLO, a lightweight, real-time victim detection framework for disaster scenarios built upon YOLOv11. Our LightSeek-YOLO integrates three core innovations. First, it employs HGNetV2 as the backbone, whose HGStem and HGBlock modules leverage depthwise separable convolutions to markedly reduce computational cost while preserving feature extraction. Second, it introduces Seek-DS (Seek-DownSampling), a dual-branch downsampling module that preserves key feature extrema through a MaxPool branch while capturing spatial patterns via a progressive convolution branch, thereby effectively mitigating background interference. Third, it incorporates Seek-DH (Seek Detection Head), a lightweight detection head that processes features through a unified pipeline, enhancing scale adaptability while reducing parameter redundancy. Evaluated on the common C2A disaster dataset, LightSeek-YOLO achieves 0.478 AP@small for small-object detection, demonstrating strong robustness in challenging conditions such as rubble and smoke. Moreover, on the COCO dataset, it reaches 0.473 mAP@[0.5:0.95], matching YOLOv8n while achieving superior computational efficiency through 38.2% parameter reduction and 39.5% FLOP reduction, and achieving 571.72 FPS on desktop hardware, with computational efficiency improvements suggesting potential for edge deployment pending validation. Full article
(This article belongs to the Special Issue Machine Learning Applications in Image Processing and Computer Vision)
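The parameter savings that depthwise separable convolutions provide, which this abstract credits for the backbone's reduced cost, can be illustrated with a quick back-of-the-envelope comparison. The layer sizes below are hypothetical, not taken from the paper:

```python
def conv_params(c_in, c_out, k):
    # Standard convolution: one k x k kernel per (input, output) channel pair.
    return c_in * c_out * k * k

def dw_separable_params(c_in, c_out, k):
    # Depthwise separable: one k x k kernel per input channel (depthwise),
    # followed by a 1 x 1 pointwise convolution that mixes channels.
    return c_in * k * k + c_in * c_out

# Hypothetical layer sizes for illustration only.
c_in, c_out, k = 64, 128, 3
std = conv_params(c_in, c_out, k)          # 64*128*9 = 73728
sep = dw_separable_params(c_in, c_out, k)  # 64*9 + 64*128 = 8768
print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
```

For these sizes the separable layer uses roughly 12% of the parameters of a standard convolution, which is the kind of reduction that makes sub-40% parameter and FLOP cuts plausible at the network level.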

30 pages, 769 KB  
Article
Mathematical Generalization of Kolmogorov-Arnold Networks (KAN) and Their Variants
by Fray L. Becerra-Suarez, Ana G. Borrero-Ramírez, Edwin Valencia-Castillo and Manuel G. Forero
Mathematics 2025, 13(19), 3128; https://doi.org/10.3390/math13193128 - 30 Sep 2025
Viewed by 912
Abstract
Neural networks have become a fundamental tool for solving complex problems, from image processing and speech recognition to time series prediction and large-scale data classification. However, traditional neural architectures suffer from interpretability problems due to their opaque representations and lack of explicit interaction between linear and nonlinear transformations. To address these limitations, Kolmogorov–Arnold Networks (KAN) have emerged as a mathematically grounded approach capable of efficiently representing complex nonlinear functions. Based on the principles established by Kolmogorov and Arnold, KAN offer an alternative to traditional architectures, mitigating issues such as overfitting and lack of interpretability. Despite their solid theoretical basis, practical implementations of KAN face challenges, such as optimal function selection and computational efficiency. This paper provides a systematic review that goes beyond previous surveys by consolidating the diverse structural variants of KAN (e.g., Wavelet-KAN, Rational-KAN, MonoKAN, Physics-KAN, Linear Spline KAN, and Orthogonal Polynomial KAN) into a unified framework. In addition, we emphasize their mathematical foundations, compare their advantages and limitations, and discuss their applicability across domains. From this review, three main conclusions can be drawn: (i) spline-based KAN remain the most widely used due to their stability and simplicity, (ii) rational and wavelet-based variants provide greater expressivity but introduce numerical challenges, and (iii) emerging approaches such as Physics-KAN and automatic basis selection open promising directions for scalability and interpretability. These insights provide a benchmark for future research and practical implementations of KAN. Full article
(This article belongs to the Special Issue Machine Learning Applications in Image Processing and Computer Vision)
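For context, the Kolmogorov–Arnold representation theorem these networks build on states that any continuous function $f$ on $[0,1]^n$ can be written as a finite superposition of continuous univariate functions:

```latex
f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \varphi_{q,p}(x_p) \right)
```

The KAN variants the survey catalogs differ chiefly in how the univariate functions $\varphi_{q,p}$ and $\Phi_q$ are parameterized: B-splines, wavelets, rational functions, or orthogonal polynomials.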

33 pages, 14767 KB  
Article
Night-to-Day Image Translation with Road Light Attention Training for Traffic Information Detection
by Ye-Jin Lee, Young-Ho Go, Seung-Hwan Lee, Dong-Min Son and Sung-Hak Lee
Mathematics 2025, 13(18), 2998; https://doi.org/10.3390/math13182998 - 16 Sep 2025
Viewed by 597
Abstract
Generative adversarial network (GAN)-based deep learning methods are useful for improving object visibility in nighttime driving environments, but they often fail to preserve critical road information such as traffic light colors and vehicle lighting. This paper proposes a method to address this by utilizing both unpaired and four-channel paired training modules. The unpaired module performs the primary night-to-day conversion, while the paired module, enhanced with a fourth channel, focuses on preserving road details. Our key contribution is an inverse road light attention (RLA) map, which acts as this fourth channel to explicitly guide the network’s learning. This map also facilitates a final cross-blending process, synthesizing the results from both modules to maximize their respective advantages. Experimental results demonstrate that our approach more accurately preserves lane markings and traffic light colors. Furthermore, quantitative analysis confirms that our method achieves superior performance across eight no-reference image quality metrics compared to existing techniques. Full article
(This article belongs to the Special Issue Machine Learning Applications in Image Processing and Computer Vision)
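Feeding an attention map to the network as a fourth input channel, as the paired module does, amounts to a simple channel concatenation. The sketch below uses a hypothetical luminance-based stand-in for the learned inverse RLA map; the paper's actual map is derived differently:

```python
import numpy as np

def add_attention_channel(rgb, attention):
    # rgb: (H, W, 3) float image; attention: (H, W) map in [0, 1].
    # Stack the map as a fourth input channel for four-channel paired training.
    return np.concatenate([rgb, attention[..., None]], axis=-1)

# Hypothetical stand-in: bright regions (lights) get high luminance, so the
# *inverse* map emphasizes everything except light sources.
rgb = np.random.rand(8, 8, 3)
luminance = rgb.mean(axis=-1)
inverse_map = 1.0 - luminance
x = add_attention_channel(rgb, inverse_map)
print(x.shape)  # (8, 8, 4)
```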

28 pages, 3089 KB  
Article
A Taxonomy and Theoretical Analysis of Collapse Phenomena in Unsupervised Representation Learning
by Donghyeon Kim, Chae-Bong Sohn, Do-Yup Kim and Dae-Yeol Kim
Mathematics 2025, 13(18), 2986; https://doi.org/10.3390/math13182986 - 16 Sep 2025
Viewed by 736
Abstract
Unsupervised representation learning has emerged as a promising paradigm in machine learning, owing to its capacity to extract semantically meaningful features from unlabeled data. Despite recent progress, however, such methods remain vulnerable to collapse phenomena, wherein the expressiveness and diversity of learned representations are severely degraded. This phenomenon poses significant challenges to both model performance and generalizability. This paper presents a systematic investigation into two distinct forms of collapse: complete collapse and dimensional collapse. Complete collapse typically arises in non-contrastive frameworks, where all learned representations converge to trivial constants, thereby rendering the learned feature space non-informative. While contrastive learning has been introduced as a principled remedy, recent empirical findings indicate that it fails to prevent collapse entirely. In particular, contrastive methods are still susceptible to dimensional collapse, where representations are confined to a narrow subspace, thus restricting both the information content and effective dimensionality. To address these concerns, we conduct a comprehensive literature analysis encompassing theoretical definitions, underlying causes, and mitigation strategies for each collapse type. We further categorize recent approaches to collapse prevention, including feature decorrelation techniques, eigenvalue distribution regularization, and batch-level statistical constraints, and assess their effectiveness through a comparative framework. This work aims to establish a unified conceptual foundation for understanding collapse in unsupervised learning and to guide the design of more robust representation learning algorithms. Full article
(This article belongs to the Special Issue Machine Learning Applications in Image Processing and Computer Vision)
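Dimensional collapse of the kind the paper analyzes is commonly diagnosed by inspecting the eigenvalue spectrum of the embedding covariance. A minimal sketch using an entropy-based effective rank; the function name and the synthetic data are illustrative, not from the paper:

```python
import numpy as np

def effective_rank(features):
    # features: (N, D) matrix of N embeddings.
    # A sharp drop-off in the covariance eigen-spectrum signals dimensional
    # collapse (representations confined to a low-dimensional subspace).
    centered = features - features.mean(axis=0)
    cov = centered.T @ centered / len(features)
    eig = np.linalg.eigvalsh(cov)
    p = eig / eig.sum()
    p = p[p > 0]
    # Entropy-based effective rank: exp of the spectral entropy.
    return float(np.exp(-(p * np.log(p)).sum()))

rng = np.random.default_rng(0)
healthy = rng.normal(size=(1000, 32))                              # full-rank embeddings
collapsed = rng.normal(size=(1000, 2)) @ rng.normal(size=(2, 32))  # rank-2 subspace
print(effective_rank(healthy), effective_rank(collapsed))
```

The healthy embeddings score near their ambient dimension of 32, while the collapsed ones score near 2 despite living in the same 32-dimensional space.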

23 pages, 6105 KB  
Article
YUV Color Model-Based Adaptive Pansharpening with Lanczos Interpolation and Spectral Weights
by Shavkat Fazilov, Ozod Yusupov, Erali Eshonqulov, Khabiba Abdieva and Ziyodullo Malikov
Mathematics 2025, 13(17), 2868; https://doi.org/10.3390/math13172868 - 5 Sep 2025
Viewed by 469
Abstract
Pansharpening is a method of image fusion that combines a panchromatic (PAN) image with high spatial resolution and multispectral (MS) images which possess different spectral characteristics and are frequently obtained from satellite sensors. Despite the development of numerous pansharpening methods in recent years, a key challenge continues to be the maintenance of both spatial details and spectral accuracy in the combined image. To tackle this challenge, we introduce a new approach that enhances the component substitution-based Adaptive IHS method by integrating the YUV color model along with weighting coefficients influenced by the multispectral data. In our proposed approach, the conventional IHS color model is substituted with the YUV model to enhance spectral consistency. Additionally, Lanczos interpolation is used to upscale the MS image to match the spatial resolution of the PAN image. Each channel of the MS image is fused using adaptive weights derived from the influence of multispectral data, leading to the final pansharpened image. Based on the findings from experiments conducted on the PairMax and PanCollection datasets, our proposed method exhibited superior spectral and spatial performance when compared to several existing pansharpening techniques. Full article
(This article belongs to the Special Issue Machine Learning Applications in Image Processing and Computer Vision)
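Component-substitution pansharpening of the kind the paper extends can be sketched generically: move the upscaled MS image into YUV, swap the luma channel for a statistics-matched PAN, and invert the transform. This sketch deliberately omits the paper's adaptive spectral weights and Lanczos upscaling, so treat it as a baseline, not the proposed method:

```python
import numpy as np

# Standard BT.601 RGB -> YUV matrix (Y is the first row).
RGB2YUV = np.array([[0.299,    0.587,    0.114],
                    [-0.14713, -0.28886, 0.436],
                    [0.615,    -0.51499, -0.10001]])

def component_substitution(ms_up, pan):
    # ms_up: (H, W, 3) MS image already upscaled to PAN resolution;
    # pan:   (H, W) panchromatic image.
    yuv = ms_up @ RGB2YUV.T
    y = yuv[..., 0]
    # Match PAN's mean/std to the luma before substituting, to limit
    # spectral distortion.
    pan_matched = (pan - pan.mean()) / (pan.std() + 1e-8) * y.std() + y.mean()
    yuv[..., 0] = pan_matched
    return yuv @ np.linalg.inv(RGB2YUV).T

ms = np.random.rand(16, 16, 3)
pan = np.random.rand(16, 16)
fused = component_substitution(ms, pan)
print(fused.shape)  # (16, 16, 3)
```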

16 pages, 6137 KB  
Article
DMET: Dynamic Mask-Enhanced Transformer for Generalizable Deep Image Denoising
by Tong Zhu, Anqi Li, Yuan-Gen Wang, Wenkang Su and Donghua Jiang
Mathematics 2025, 13(13), 2167; https://doi.org/10.3390/math13132167 - 2 Jul 2025
Viewed by 677
Abstract
Different types of noise are inevitably introduced by devices during image acquisition and transmission processes. Therefore, image denoising remains a crucial challenge in computer vision. Deep learning, especially recent Transformer-based architectures, has demonstrated remarkable performance for image denoising tasks. However, due to its data-driven nature, deep learning can easily overfit the training data, leading to a lack of generalization ability. In order to address this issue, we present a novel Dynamic Mask-Enhanced Transformer (DMET) to improve the generalization capacity of denoising networks. Specifically, a texture-guided adaptive masking mechanism is introduced to simulate possible noise in practical applications. Then, we apply a masked hierarchical attention block to mitigate information loss and leverage global statistics, which combines shifted window multi-head self-attention with channel attention. Additionally, an attention mask is applied during training to reduce discrepancies between training and testing. Extensive experiments demonstrate that our approach achieves better generalization performance than state-of-the-art deep learning models and can be directly applied to real-world scenarios. Full article
(This article belongs to the Special Issue Machine Learning Applications in Image Processing and Computer Vision)
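A texture-guided masking step like the one the abstract describes can be sketched with local variance as a crude texture proxy. The paper's actual adaptive rule is not reproduced here, so the function below is an illustrative stand-in:

```python
import numpy as np

def texture_guided_mask(img, patch=4, ratio=0.3):
    # Illustrative sketch: rank patches by local variance (a crude texture
    # proxy) and mask the top `ratio` fraction, so the denoiser must
    # reconstruct the most textured regions.
    h, w = img.shape
    ph, pw = h // patch, w // patch
    patches = img[:ph * patch, :pw * patch].reshape(ph, patch, pw, patch)
    var = patches.var(axis=(1, 3))                  # (ph, pw) texture scores
    k = int(ratio * ph * pw)
    thresh = np.sort(var.ravel())[-k] if k else np.inf
    mask = var >= thresh                            # True = masked patch
    # Expand the patch-level mask back to pixel resolution.
    return np.kron(mask, np.ones((patch, patch), dtype=bool))

img = np.random.rand(16, 16)
m = texture_guided_mask(img, patch=4, ratio=0.25)
print(m.shape, m.mean())  # (16, 16), fraction of masked pixels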

21 pages, 4055 KB  
Article
Modified Whale Optimization Algorithm for Multiclass Skin Cancer Classification
by Abdul Majid, Masad A. Alrasheedi, Abdulmajeed Atiah Alharbi, Jeza Allohibi and Seung-Won Lee
Mathematics 2025, 13(6), 929; https://doi.org/10.3390/math13060929 - 11 Mar 2025
Cited by 4 | Viewed by 1422
Abstract
Skin cancer is a major global health concern and one of the deadliest forms of cancer. Early and accurate detection significantly increases the chances of survival. However, traditional visual inspection methods are time-consuming and prone to errors due to artifacts and noise in dermoscopic images. To address these challenges, this paper proposes an innovative deep learning-based framework that integrates an ensemble of two pre-trained convolutional neural networks (CNNs), SqueezeNet and InceptionResNet-V2, combined with an improved Whale Optimization Algorithm (WOA) for feature selection. The deep features extracted from both models are fused to create a comprehensive feature set, which is then optimized using the proposed enhanced WOA that employs a quadratic decay function for dynamic parameter tuning and an advanced mutation mechanism to prevent premature convergence. The optimized features are fed into machine learning classifiers to achieve robust classification performance. The effectiveness of the framework is evaluated on two benchmark datasets, PH2 and Med-Node, achieving state-of-the-art classification accuracies of 95.48% and 98.59%, respectively. Comparative analysis with existing optimization algorithms and skin cancer classification approaches demonstrates the superiority of the proposed method in terms of accuracy, robustness, and computational efficiency. Our method outperforms the genetic algorithm (GA), Particle Swarm Optimization (PSO), and the slime mould algorithm (SMA), as well as deep learning-based skin cancer classification models, which have reported accuracies of 87% to 94% in previous studies. A more effective feature selection methodology improves accuracy and reduces computational overhead while maintaining robust performance. Our enhanced deep learning ensemble and feature selection technique can improve early-stage skin cancer diagnosis, as shown by these data. Full article
(This article belongs to the Special Issue Machine Learning Applications in Image Processing and Computer Vision)
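The canonical WOA shrinks its control parameter a linearly from 2 to 0 over the run; a quadratic decay like the one the abstract mentions keeps a (and hence exploration) higher for longer before converging. The exact schedule below is illustrative, not the paper's formula:

```python
def linear_a(t, T):
    # Canonical WOA schedule: a decays linearly from 2 to 0.
    return 2.0 * (1.0 - t / T)

def quadratic_a(t, T):
    # One plausible quadratic decay (illustrative): slow early decay keeps
    # the search in exploration mode longer, then drops off quickly.
    return 2.0 * (1.0 - (t / T) ** 2)

T = 100
for t in (0, 50, 100):
    print(t, linear_a(t, T), quadratic_a(t, T))
```

At mid-run (t = 50) the quadratic schedule still has a = 1.5 versus 1.0 for the linear one, which is the mechanism usually credited with delaying premature convergence.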
