
Application of Artificial Intelligence in Image Processing

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 November 2025 | Viewed by 7305

Special Issue Editor


Dr. Qian Jiang
Guest Editor
School of Software, Yunnan University, Kunming 650000, China
Interests: deep learning; image processing; fuzzy sets; information fusion

Special Issue Information

Dear Colleagues,

The field of image processing has long been at the forefront of leveraging artificial intelligence (AI) to revolutionize how we analyze, interpret, and utilize visual data. As we continue to push the boundaries of what is possible with AI, the integration of advanced algorithms and machine learning models has opened up new horizons in the realm of image analysis, leading to innovative applications across various industries. This has been facilitated by the rapid advancements in computational power, the availability of large annotated datasets, and the development of sophisticated AI techniques such as deep learning, computer vision, and pattern recognition.

The adoption of AI in image processing has triggered a wave of innovation that spans from medical imaging and diagnostics to security surveillance, autonomous vehicles, and digital media. The ability of AI to recognize patterns, classify images, and detect anomalies has far-reaching implications for enhancing efficiency, accuracy, and speed in image analysis tasks. In the face of these technological advancements, there is a growing need for research that explores the potential of AI in image processing, addresses the challenges associated with its implementation, and identifies new opportunities for innovation. The integration of AI with image processing technologies not only transforms the way we process visual information but also raises important questions about data privacy, ethical considerations, and the future of work in related fields.

This Special Issue on "Application of Artificial Intelligence in Image Processing" invites submissions that delve into the latest research and development regarding the application of AI to image processing. We welcome contributions that cover a wide range of topics, including (but not limited to) the following:

  • Utilizing AI for image recognition and classification;
  • Deep learning applications in medical imaging and diagnostics;
  • AI-driven techniques for object detection and segmentation in images;
  • Computer vision systems for security and surveillance;
  • AI in the enhancement and restoration of digital media;
  • Ethical considerations and challenges in AI-driven image processing;
  • Integration of AI with Internet of Things (IoT) for real-time image analysis;
  • Applications of AI in autonomous vehicle imaging systems;
  • AI and the future of image forensics and authentication;
  • Case studies in AI application for environmental monitoring and agriculture;
  • Exploring the role of AI in creative industries for image generation;
  • Sustainable AI practices in image processing;
  • Customer experience enhancement through AI-powered image personalization.

We encourage researchers, academics, and industry professionals to share their insights, findings, and innovative applications of AI in image processing. This Special Issue aims to provide a comprehensive overview of the current landscape and future directions in the field, offering a platform for the exchange of knowledge and the inspiration of new ideas and solutions.

Dr. Qian Jiang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence
  • image processing
  • computer vision
  • machine learning
  • image recognition
  • image classification
  • medical imaging
  • object detection
  • image generation
  • application of AI-driven image processing

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (9 papers)


Research

27 pages, 10496 KiB  
Article
A Convolutional Neural Network as a Potential Tool for Camouflage Assessment
by Erik Van der Burg, Alexander Toet, Paola Perone and Maarten A. Hogervorst
Appl. Sci. 2025, 15(9), 5066; https://doi.org/10.3390/app15095066 - 2 May 2025
Viewed by 295
Abstract
Camouflage effectiveness is traditionally evaluated through human visual search and detection experiments, which are time-consuming and resource-intensive. To address this, we explored whether a pre-trained convolutional neural network (YOLOv4-tiny) can provide an automated, image-based measure of camouflage effectiveness that aligns with human perception. We conducted behavioral experiments to obtain human detection performance metrics, such as search time and target conspicuity, and compared these to the classification probabilities output by the YOLO model when detecting camouflaged individuals in rural and urban scenes. YOLO's classification probability was adopted as a proxy for detectability, allowing direct comparison with human observer performance. We found a strong overall correspondence between YOLO-predicted camouflage effectiveness and human detection results. However, discrepancies emerged at close distances, where YOLO's performance was particularly sensitive to high-contrast, shape-breaking elements of the camouflage pattern. CNNs such as YOLO have significant potential for assessing camouflage effectiveness in a wide range of applications, such as evaluating or optimizing one's signature and predicting optimal hiding locations in a given environment. Still, further research is required to fully establish YOLO's limitations and applicability for this purpose in real time.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
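To make the proxy idea concrete, here is a minimal Python sketch that scores an image by the highest "person" confidence returned by a pre-trained YOLOv4-tiny through OpenCV's DNN module, so that a lower score suggests more effective camouflage. This is not the authors' pipeline; the weight/config paths, class id, and confidence threshold are placeholders.

```python
import cv2
import numpy as np

# Load a pre-trained YOLOv4-tiny from Darknet files (paths are placeholders).
net = cv2.dnn.readNetFromDarknet("yolov4-tiny.cfg", "yolov4-tiny.weights")
model = cv2.dnn_DetectionModel(net)
model.setInputParams(size=(416, 416), scale=1 / 255.0, swapRB=True)

def detectability(image_path: str, person_class_id: int = 0) -> float:
    """Return the highest 'person' confidence; lower suggests better camouflage."""
    img = cv2.imread(image_path)
    class_ids, confidences, _boxes = model.detect(img, confThreshold=0.01)
    scores = [float(conf)
              for cid, conf in zip(np.ravel(class_ids), np.ravel(confidences))
              if cid == person_class_id]
    return max(scores, default=0.0)
```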

17 pages, 5133 KiB  
Article
A Real-Time DAO-YOLO Model for Electric Power Operation Violation Recognition
by Xiaoliang Qian, Yang Li, Xinyu Ding, Longxiang Luo, Jinchao Guo, Wei Wang and Peixu Xing
Appl. Sci. 2025, 15(8), 4492; https://doi.org/10.3390/app15084492 - 18 Apr 2025
Viewed by 247
Abstract
Electric power operation violation recognition (EPOVR) is essential for personnel safety and is achieved by detecting key objects in electric power operation scenarios. Recent methods usually use the YOLOv8 model for EPOVR; however, YOLOv8 still has four problems that need to be addressed. First, its feature representation of irregularly shaped objects is not strong enough. Second, its feature representation is not strong enough to precisely detect multi-scale objects. Third, its localization accuracy is not ideal. Fourth, many violation categories in electric power operation are not covered by existing datasets. To address the first problem, a deformable C2f (DC2f) module is proposed, which combines deformable convolutions and depthwise separable convolutions. For the second problem, an adaptive multi-scale feature enhancement (AMFE) module is proposed, which integrates multi-scale depthwise separable convolutions, adaptive convolutions, and a channel attention mechanism to optimize multi-scale feature representation while minimizing the number of parameters. For the third problem, an optimized complete intersection over union (OCIoU) loss is proposed for bounding box localization. Finally, a novel dataset named EPOVR-v1.0 is introduced to evaluate object detection models for EPOVR. Ablation studies validate the effectiveness of the DC2f module, AMFE module, OCIoU loss, and their combinations. Compared with the baseline YOLOv8 model, mAP@0.5 and mAP@0.5–0.95 improve by 3.2% and 4.4%, while SDAP@0.5 and SDAP@0.5–0.95 drop by 0.34 and 0.019, respectively; the number of parameters and GFLOPs also decrease slightly. Comparison with seven YOLO models shows that our DAO-YOLO model achieves the highest detection accuracy while sustaining real-time object detection for EPOVR.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
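The abstract does not specify the DC2f internals beyond "deformable convolutions and depthwise separable convolutions"; the PyTorch sketch below shows one plausible way to combine the two and is purely illustrative, not the authors' module.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableDWBlock(nn.Module):
    """Illustrative block combining a deformable conv with a depthwise-
    separable conv; NOT the paper's DC2f, just one plausible arrangement."""
    def __init__(self, channels: int):
        super().__init__()
        # Predict 2 offsets (x, y) per 3x3 kernel position -> 18 channels.
        self.offset = nn.Conv2d(channels, 2 * 3 * 3, kernel_size=3, padding=1)
        self.deform = DeformConv2d(channels, channels, kernel_size=3, padding=1)
        # Depthwise (per-channel) conv followed by a pointwise 1x1 mix.
        self.dw = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.pw = nn.Conv2d(channels, channels, 1)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.act(self.deform(x, self.offset(x)))
        y = self.act(self.pw(self.dw(y)))
        return x + y  # residual connection
```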

23 pages, 22938 KiB  
Article
Conditional GAN-Based Two-Stage ISP Tuning Method: A Reconstruction–Enhancement Proxy Framework
by Pengfei Zhan and Jiongyao Ye
Appl. Sci. 2025, 15(6), 3371; https://doi.org/10.3390/app15063371 - 19 Mar 2025
Viewed by 300
Abstract
Image signal processing (ISP), a critical component of camera imaging, has traditionally relied on experience-driven parameter tuning, an approach that suffers from inefficiency, fidelity issues, and conflicts with visual enhancement objectives. This paper introduces ReEn-GAN, a staged ISP proxy tuning framework. ReEn-GAN decouples the ISP process into two distinct stages: reconstruction (physical signal recovery) and enhancement (visual quality and color optimization). By employing distinct network architectures and loss functions tailored to each objective, the two-stage proxy can effectively optimize both the reconstruction and enhancement modules within the ISP pipeline. Compared to tuning with an end-to-end proxy network, the proposed proxy more effectively extracts hierarchical information from the ISP pipeline, thereby mitigating the significant changes in image color and texture that often result from parameter adjustments in an end-to-end proxy model. We conduct experiments on image denoising and object detection tuning tasks and compare the performance of the two types of proxies. The results demonstrate that the proposed method outperforms end-to-end proxy methods on public datasets (SIDD, KITTI) and achieves over 21% improvement in performance metrics compared to hand-tuning methods.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
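As a rough illustration of the two-stage decoupling described above, the sketch below trains a reconstruction proxy with a fidelity (L1) loss and an enhancement proxy with an added adversarial term. The tiny stand-in networks, the patch critic, and the 0.01 loss weight are assumptions for illustration, not the ReEn-GAN architecture.

```python
import torch
import torch.nn as nn

def tiny_net(in_ch: int = 3, out_ch: int = 3) -> nn.Module:
    # Stand-in for a real proxy network (e.g., a U-Net); illustrative only.
    return nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, out_ch, 3, padding=1))

recon_proxy = tiny_net()    # stage 1: physical signal recovery
enhance_proxy = tiny_net()  # stage 2: visual quality / color optimization
disc = nn.Sequential(       # small patch critic for the adversarial term
    nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(16, 1, 4, stride=2, padding=1))

l1 = nn.L1Loss()
bce = nn.BCEWithLogitsLoss()

def stage1_loss(raw: torch.Tensor, recon_target: torch.Tensor) -> torch.Tensor:
    # Fidelity-only objective for the reconstruction stage.
    return l1(recon_proxy(raw), recon_target)

def stage2_loss(recon_out: torch.Tensor, enhance_target: torch.Tensor) -> torch.Tensor:
    # Adversarial + fidelity objective for the enhancement stage; detach()
    # keeps the gradients of the two stages decoupled.
    fake = enhance_proxy(recon_out.detach())
    logits = disc(fake)
    return l1(fake, enhance_target) + 0.01 * bce(logits, torch.ones_like(logits))
```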

16 pages, 3603 KiB  
Article
Improvement of a Subpixel Convolutional Neural Network for a Super-Resolution Image
by Muhammed Fatih Ağalday and Ahmet Çinar
Appl. Sci. 2025, 15(5), 2459; https://doi.org/10.3390/app15052459 - 25 Feb 2025
Viewed by 588
Abstract
Super-resolution is an image restoration technique that reconstructs high-resolution content from low-resolution images. It is particularly useful where low-resolution content needs to be enhanced, with applications in areas such as face recognition, medical imaging, and satellite imaging. Deep neural network models for single-image super-resolution are quite successful in terms of computational performance. In many such models, low-resolution images are first upscaled with methods such as bicubic interpolation; because the super-resolution process is then performed in the high-resolution space, it adds memory cost and computational complexity. In our proposed model, the low-resolution image is instead given directly as input to a convolutional neural network, reducing computational complexity, and a subpixel convolution layer learns a series of filters that map low-resolution feature maps to a high-resolution image. We add convolution layers to the efficient subpixel convolutional neural network (ESPCN) model and, to prevent gradient loss, pass each layer's feature information forward to the next layer. The resulting R-ESPCN model proposed in this paper remodels ESPCN to reduce the time required to perform real-time super-resolution on images. The results show that our method significantly improves accuracy and demonstrates the applicability of deep learning methods in the field of image data processing.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
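For readers unfamiliar with subpixel convolution, the sketch below shows an ESPCN-style network with an added skip connection that passes features forward, loosely mirroring the R-ESPCN idea described above. The layer sizes and activation choices are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class RESPCNSketch(nn.Module):
    """Rough ESPCN-style model with an extra skip connection; illustrative only."""
    def __init__(self, scale: int = 3, channels: int = 1):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(channels, 64, 5, padding=2), nn.Tanh())
        self.body = nn.Sequential(nn.Conv2d(64, 64, 3, padding=1), nn.Tanh())
        # Produce scale**2 sub-pixel channels per output channel...
        self.tail = nn.Conv2d(64, channels * scale ** 2, 3, padding=1)
        # ...then rearrange them into a (H*scale, W*scale) image.
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.head(x)
        # Skip connection: carry the current features forward to mitigate
        # gradient loss, as the abstract describes.
        h = h + self.body(h)
        return self.shuffle(self.tail(h))
```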

15 pages, 3184 KiB  
Article
A Lightweight Single-Image Super-Resolution Method Based on the Parallel Connection of Convolution and Swin Transformer Blocks
by Tengyun Jing, Cuiyin Liu and Yuanshuai Chen
Appl. Sci. 2025, 15(4), 1806; https://doi.org/10.3390/app15041806 - 10 Feb 2025
Viewed by 708
Abstract
In recent years, with the development of deep learning technologies, Vision Transformers combined with Convolutional Neural Networks (CNNs) have made significant progress in single-image super-resolution (SISR). However, existing methods still face issues such as incomplete high-frequency information reconstruction, training instability caused by residual connections, and insufficient cross-window information exchange. To address these problems and better leverage both local and global information, this paper proposes a super-resolution reconstruction network based on the Parallel Connection of Convolution and Swin Transformer Block (PCCSTB) to model the local and global features of an image. Specifically, through a parallel structure of channel-feature-enhanced convolution and a Swin Transformer, the network extracts, enhances, and fuses local and global information, and a dedicated fusion module integrates the global and local features extracted by the CNN. The experimental results show that the proposed network effectively balances SR performance and network complexity, achieving good results in the lightweight SR domain. For instance, in the 4× super-resolution experiment on the Urban100 dataset, the network reaches an inference speed of 55 frames per second under the same device conditions, more than seven times as fast as the state-of-the-art Shifted Window-based Image Restoration (SwinIR) network. Moreover, its Peak Signal-to-Noise Ratio (PSNR) exceeds SwinIR's by 0.29 dB at the 4× scale on the Set5 dataset, indicating that the network performs high-resolution image reconstruction efficiently.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
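The sketch below illustrates the parallel local/global pattern: a convolution branch and a self-attention branch run side by side and are fused by a 1×1 convolution. A plain multi-head attention layer stands in for the Swin Transformer block, so this is a structural sketch only, not the PCCSTB design.

```python
import torch
import torch.nn as nn

class ParallelConvAttnBlock(nn.Module):
    """Parallel local (conv) / global (attention) block with 1x1 fusion.
    MultiheadAttention is a stand-in for the paper's Swin Transformer branch."""
    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.local = nn.Sequential(nn.Conv2d(dim, dim, 3, padding=1), nn.GELU())
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fuse = nn.Conv2d(2 * dim, dim, 1)  # fuse local + global features

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = self.local(x)
        tokens = x.flatten(2).transpose(1, 2)            # (B, H*W, C)
        global_, _ = self.attn(tokens, tokens, tokens)   # global self-attention
        global_ = global_.transpose(1, 2).reshape(b, c, h, w)
        return x + self.fuse(torch.cat([local, global_], dim=1))
```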

16 pages, 1820 KiB  
Article
GAN-Based Map Generation Technique of Aerial Image Using Residual Blocks and Canny Edge Detector
by Jongwook Si and Sungyoung Kim
Appl. Sci. 2024, 14(23), 10963; https://doi.org/10.3390/app142310963 - 26 Nov 2024
Cited by 1 | Viewed by 952
Abstract
As the significance of meticulous and precise map creation grows in modern Geographic Information Systems (GISs), urban planning, disaster response, and other domains, the need for sophisticated map generation technology has become increasingly evident. In response to this demand, this paper puts forward a technique based on Generative Adversarial Networks (GANs) for converting aerial imagery into high-quality maps. The proposed method, comprising a generator and a discriminator, introduces two strategies to overcome existing challenges: a Canny edge detector and residual blocks. The proposed loss function enhances the generator's performance by assigning greater weight to edge regions using the Canny edge map and eliminating superfluous information. This approach improves the visual quality of the generated maps and ensures that fine details are captured accurately. The experimental results demonstrate that the method generates maps of superior visual quality, achieving outstanding performance compared with existing methodologies, and show that the proposed technology has significant potential for practical application in a range of real-world scenarios.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
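The edge-weighted loss idea can be sketched in a few lines: an L1 term whose per-pixel weights are raised on the target's Canny edge map. The thresholds and weight factor below are illustrative assumptions, not the paper's values.

```python
import cv2
import numpy as np
import torch

def edge_weighted_l1(fake: torch.Tensor, real: torch.Tensor,
                     real_bgr: np.ndarray, edge_weight: float = 2.0) -> torch.Tensor:
    """L1 loss that up-weights pixels on the target's Canny edge map.
    `fake`/`real` are (C, H, W) CPU tensors; `real_bgr` is the same target
    as a uint8 OpenCV image. Thresholds/weight are illustrative."""
    edges = cv2.Canny(real_bgr, 100, 200).astype(np.float32) / 255.0
    weights = 1.0 + (edge_weight - 1.0) * torch.from_numpy(edges)  # (H, W)
    return (weights * (fake - real).abs()).mean()  # broadcasts over channels
```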

15 pages, 2189 KiB  
Article
Entropy-Based Ensemble of Convolutional Neural Networks for Clothes Texture Pattern Recognition
by Reham Al-Majed and Muhammad Hussain
Appl. Sci. 2024, 14(22), 10730; https://doi.org/10.3390/app142210730 - 20 Nov 2024
Cited by 1 | Viewed by 897
Abstract
Automatic clothes pattern recognition is important for assisting visually impaired people and for real-world applications such as e-commerce and personal fashion recommendation systems, and it has attracted increasing interest from researchers. It is a challenging texture classification problem in that even images of the same texture class exhibit a high degree of intraclass variation; moreover, images of clothes patterns may be taken in unconstrained illumination environments. Machine learning methods proposed for this problem mostly rely on handcrafted features and traditional classification methods, while prior deep learning attempts have yielded poor recognition performance. We propose a deep learning method based on an ensemble of convolutional neural networks that requires no feature engineering while extracting robust local and global features of clothes patterns. The ensemble classifier employs a pre-trained ResNet50 with a non-local (NL) block, a squeeze-and-excitation (SE) block, and a coordinate attention (CA) block as base learners. To fuse the individual decisions of the base learners, we introduce a simple and effective fusion technique based on entropy voting, which incorporates the uncertainty in each base learner's decision. We validate the proposed method on benchmark clothes pattern datasets with six categories: solid, striped, checkered, dotted, zigzag, and floral. The method achieves promising results with limited computational and data resources: 98.18% accuracy on the GoogleClothingDataset and 96.03% on the CCYN dataset.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
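One plausible reading of entropy voting is to weight each base learner's class probabilities inversely to their entropy, so that confident learners dominate the fused decision. The sketch below implements that reading and is an assumption for illustration, not the paper's exact fusion rule.

```python
import numpy as np

def entropy_vote(probabilities: list) -> np.ndarray:
    """Fuse base-learner class-probability vectors, weighting each learner
    inversely to the entropy (uncertainty) of its prediction."""
    fused = np.zeros_like(probabilities[0])
    for p in probabilities:
        entropy = -np.sum(p * np.log(p + 1e-12))  # low entropy = confident
        fused += p / (entropy + 1e-12)
    return fused / fused.sum()

# Example: the confident learner dominates the fused vote.
p1 = np.array([0.90, 0.05, 0.05])  # confident base learner
p2 = np.array([0.40, 0.30, 0.30])  # uncertain base learner
print(entropy_vote([p1, p2]))
```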

19 pages, 829 KiB  
Article
A New Image Oversampling Method Based on Influence Functions and Weights
by Jun Ye, Shoulei Lu and Jiawei Chen
Appl. Sci. 2024, 14(22), 10553; https://doi.org/10.3390/app142210553 - 15 Nov 2024
Viewed by 822
Abstract
Although imbalanced data have been studied for many years, data imbalance remains a major problem in machine learning and artificial intelligence, and the development of deep learning has further expanded its impact, so studying imbalanced data classification is of practical significance. We propose an image oversampling algorithm based on the influence function and sample weights. Our scheme not only synthesizes high-quality minority-class samples but also preserves the original features and information of minority-class images. To address the lack of visually reasonable features when SMOTE synthesizes images directly, we modify a pre-trained model by removing its pooling and fully connected layers and use its convolutional layers to extract the important features of each image. We then apply SMOTE interpolation to these extracted features and feed the synthesized features into a DCGAN generator, which maps them into the high-dimensional image space to produce realistic images. To verify that our scheme synthesizes high-quality images and thus improves classification accuracy, we conduct experiments on the processed CIFAR10, CIFAR100, and ImageNet-LT datasets.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
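The feature-space interpolation step can be sketched as classic SMOTE applied to CNN feature vectors rather than raw pixels; the synthesized vectors would then be handed to a generator (e.g., a DCGAN) to be mapped back to image space, as the abstract describes. The neighbor count and seed below are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def smote_in_feature_space(feats: np.ndarray, n_new: int, k: int = 5,
                           seed: int = 0) -> np.ndarray:
    """SMOTE-style interpolation between minority-class feature vectors.
    `feats` has shape (n_samples, n_features); requires n_samples > k."""
    rng = np.random.default_rng(seed)
    neighbors = NearestNeighbors(n_neighbors=k + 1).fit(feats)
    _, idx = neighbors.kneighbors(feats)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(feats))
        j = idx[i][rng.integers(1, k + 1)]  # random neighbor (skip self at 0)
        lam = rng.random()
        # New sample on the line segment between feats[i] and feats[j].
        synthetic.append(feats[i] + lam * (feats[j] - feats[i]))
    return np.stack(synthetic)
```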

26 pages, 4018 KiB  
Article
A MediaPipe Holistic Behavior Classification Model as a Potential Model for Predicting Aggressive Behavior in Individuals with Dementia
by Ioannis Galanakis, Rigas Filippos Soldatos, Nikitas Karanikolas, Athanasios Voulodimos, Ioannis Voyiatzis and Maria Samarakou
Appl. Sci. 2024, 14(22), 10266; https://doi.org/10.3390/app142210266 - 7 Nov 2024
Cited by 1 | Viewed by 1572
Abstract
This paper introduces a classification model that detects and classifies argumentative behaviors between two individuals using a machine learning application based on the MediaPipe Holistic model. The approach distinguishes two classes of behavior between two individuals, argumentative and non-argumentative, corresponding to verbal argumentative behavior. Using a dataset of hand gesture, body stance, and facial expression landmarks extracted from video frames, three classification models were trained and evaluated. The results indicate that the Random Forest classifier outperformed the other two, classifying argumentative behaviors with 68.07% accuracy and non-argumentative behaviors with 94.18% accuracy, respectively. There is thus future scope for advancing this classification model into a prediction model, with the aim of predicting aggressive behavior in patients with dementia before its onset.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
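A minimal sketch of the landmark-extraction step, assuming the standard MediaPipe Python solutions API: each frame is flattened into a vector of (x, y, z) landmark coordinates that a downstream classifier such as a Random Forest can consume. The zero-padding caveat is an implementation detail the abstract does not specify.

```python
import cv2
import mediapipe as mp

holistic = mp.solutions.holistic.Holistic(static_image_mode=True)

def frame_to_landmark_features(frame_bgr) -> list:
    """Flatten pose, face, and hand landmarks from one frame into a feature
    vector for a downstream classifier (e.g., a Random Forest)."""
    results = holistic.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    features = []
    for lm_set in (results.pose_landmarks, results.face_landmarks,
                   results.left_hand_landmarks, results.right_hand_landmarks):
        if lm_set is not None:
            features += [v for lm in lm_set.landmark for v in (lm.x, lm.y, lm.z)]
        # NOTE: in practice, missing landmark sets must be zero-padded so
        # that every frame yields an equal-length feature vector.
    return features
```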