
Application of Artificial Intelligence in Image Processing

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 November 2025 | Viewed by 7305

Special Issue Editor


Dr. Qian Jiang
Guest Editor
School of Software, Yunnan University, Kunming 650000, China
Interests: deep learning; image processing; fuzzy sets; information fusion

Special Issue Information

Dear Colleagues,

The field of image processing has long been at the forefront of leveraging artificial intelligence (AI) to revolutionize how we analyze, interpret, and utilize visual data. As we continue to push the boundaries of what is possible with AI, the integration of advanced algorithms and machine learning models has opened up new horizons in the realm of image analysis, leading to innovative applications across various industries. This has been facilitated by the rapid advancements in computational power, the availability of large annotated datasets, and the development of sophisticated AI techniques such as deep learning, computer vision, and pattern recognition.

The adoption of AI in image processing has triggered a wave of innovation that spans from medical imaging and diagnostics to security surveillance, autonomous vehicles, and digital media. The ability of AI to recognize patterns, classify images, and detect anomalies has far-reaching implications for enhancing efficiency, accuracy, and speed in image analysis tasks. In the face of these technological advancements, there is a growing need for research that explores the potential of AI in image processing, addresses the challenges associated with its implementation, and identifies new opportunities for innovation. The integration of AI with image processing technologies not only transforms the way we process visual information but also raises important questions about data privacy, ethical considerations, and the future of work in related fields.

This Special Issue on "Application of Artificial Intelligence in Image Processing" invites submissions that delve into the latest research and development regarding the application of AI to image processing. We welcome contributions that cover a wide range of topics, including (but not limited to) the following:

  • Utilizing AI for image recognition and classification;
  • Deep learning applications in medical imaging and diagnostics;
  • AI-driven techniques for object detection and segmentation in images;
  • Computer vision systems for security and surveillance;
  • AI in the enhancement and restoration of digital media;
  • Ethical considerations and challenges in AI-driven image processing;
  • Integration of AI with Internet of Things (IoT) for real-time image analysis;
  • Applications of AI in autonomous vehicle imaging systems;
  • AI and the future of image forensics and authentication;
  • Case studies in AI application for environmental monitoring and agriculture;
  • Exploring the role of AI in creative industries for image generation;
  • Sustainable AI practices in image processing;
  • Customer experience enhancement through AI-powered image personalization.

We encourage researchers, academics, and industry professionals to share their insights, findings, and innovative applications of AI in image processing. This Special Issue aims to provide a comprehensive overview of the current landscape and future directions in the field, offering a platform for the exchange of knowledge and the inspiration of new ideas and solutions.

Dr. Qian Jiang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence
  • image processing
  • computer vision
  • machine learning
  • image recognition
  • image classification
  • medical imaging
  • object detection
  • image generation
  • application of AI-driven image processing

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (9 papers)


Research

27 pages, 10496 KiB  
Article
A Convolutional Neural Network as a Potential Tool for Camouflage Assessment
by Erik Van der Burg, Alexander Toet, Paola Perone and Maarten A. Hogervorst
Appl. Sci. 2025, 15(9), 5066; https://doi.org/10.3390/app15095066 - 2 May 2025
Viewed by 295
Abstract
Camouflage effectiveness is traditionally evaluated through human visual search and detection experiments, which are time-consuming and resource-intensive. To address this, we explored whether a pre-trained convolutional neural network (YOLOv4-tiny) can provide an automated, image-based measure of camouflage effectiveness that aligns with human perception. We conducted behavioral experiments to obtain human detection performance metrics, such as search time and target conspicuity, and compared these to the classification probabilities output by the YOLO model when detecting camouflaged individuals in rural and urban scenes. YOLO's classification probability was adopted as a proxy for detectability, allowing direct comparison with human observer performance. We found a strong overall correspondence between YOLO-predicted camouflage effectiveness and human detection results. However, discrepancies emerged at close distances, where YOLO's performance was particularly sensitive to high-contrast, shape-breaking elements of the camouflage pattern. CNNs such as YOLO have significant potential for assessing camouflage effectiveness in a wide range of applications, such as evaluating or optimizing one's signature and predicting optimal hiding locations in a given environment. Still, further research is required to fully establish YOLO's limitations and applicability for this purpose in real time.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
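To make the proxy idea concrete, here is a minimal Python sketch that scores an image by the highest "person" confidence returned by a pre-trained YOLOv4-tiny through OpenCV's DNN module, so that a lower score suggests more effective camouflage. This is not the authors' pipeline; the weight/config paths, class id, and confidence threshold are placeholders.

```python
import cv2
import numpy as np

# Load a pre-trained YOLOv4-tiny from Darknet files (paths are placeholders).
net = cv2.dnn.readNetFromDarknet("yolov4-tiny.cfg", "yolov4-tiny.weights")
model = cv2.dnn_DetectionModel(net)
model.setInputParams(size=(416, 416), scale=1 / 255.0, swapRB=True)

def detectability(image_path: str, person_class_id: int = 0) -> float:
    """Return the highest 'person' confidence; lower suggests better camouflage."""
    img = cv2.imread(image_path)
    class_ids, confidences, _boxes = model.detect(img, confThreshold=0.01)
    scores = [float(conf)
              for cid, conf in zip(np.ravel(class_ids), np.ravel(confidences))
              if cid == person_class_id]
    return max(scores, default=0.0)
```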

17 pages, 5133 KiB  
Article
A Real-Time DAO-YOLO Model for Electric Power Operation Violation Recognition
by Xiaoliang Qian, Yang Li, Xinyu Ding, Longxiang Luo, Jinchao Guo, Wei Wang and Peixu Xing
Appl. Sci. 2025, 15(8), 4492; https://doi.org/10.3390/app15084492 - 18 Apr 2025
Viewed by 247
Abstract
Electric power operation violation recognition (EPOVR) is essential for personnel safety and is achieved by detecting key objects in electric power operation scenarios. Recent methods usually use the YOLOv8 model for EPOVR; however, YOLOv8 still has four problems that need to be addressed. First, its feature representation of irregularly shaped objects is not strong enough. Second, its feature representation is not strong enough to precisely detect multi-scale objects. Third, its localization accuracy is not ideal. Fourth, many violation categories in electric power operation are not covered by existing datasets. To address the first problem, a deformable C2f (DC2f) module is proposed, which combines deformable convolutions and depthwise separable convolutions. For the second problem, an adaptive multi-scale feature enhancement (AMFE) module is proposed, which integrates multi-scale depthwise separable convolutions, adaptive convolutions, and a channel attention mechanism to optimize multi-scale feature representation while minimizing the number of parameters. For the third problem, an optimized complete intersection over union (OCIoU) loss is proposed for bounding box localization. Finally, a novel dataset named EPOVR-v1.0 is introduced to evaluate object detection models for EPOVR. Ablation studies validate the effectiveness of the DC2f module, AMFE module, OCIoU loss, and their combinations. Compared with the baseline YOLOv8 model, mAP@0.5 and mAP@0.5–0.95 improve by 3.2% and 4.4%, while SDAP@0.5 and SDAP@0.5–0.95 drop by 0.34 and 0.019, respectively; the number of parameters and GFLOPs also decrease slightly. Comparison with seven YOLO models shows that our DAO-YOLO model achieves the highest detection accuracy while sustaining real-time object detection for EPOVR.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
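The abstract does not specify the DC2f internals beyond "deformable convolutions and depthwise separable convolutions"; the PyTorch sketch below shows one plausible way to combine the two and is purely illustrative, not the authors' module.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableDWBlock(nn.Module):
    """Illustrative block combining a deformable conv with a depthwise-
    separable conv; NOT the paper's DC2f, just one plausible arrangement."""
    def __init__(self, channels: int):
        super().__init__()
        # Predict 2 offsets (x, y) per 3x3 kernel position -> 18 channels.
        self.offset = nn.Conv2d(channels, 2 * 3 * 3, kernel_size=3, padding=1)
        self.deform = DeformConv2d(channels, channels, kernel_size=3, padding=1)
        # Depthwise (per-channel) conv followed by a pointwise 1x1 mix.
        self.dw = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.pw = nn.Conv2d(channels, channels, 1)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.act(self.deform(x, self.offset(x)))
        y = self.act(self.pw(self.dw(y)))
        return x + y  # residual connection
```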

23 pages, 22938 KiB  
Article
Conditional GAN-Based Two-Stage ISP Tuning Method: A Reconstruction–Enhancement Proxy Framework
by Pengfei Zhan and Jiongyao Ye
Appl. Sci. 2025, 15(6), 3371; https://doi.org/10.3390/app15063371 - 19 Mar 2025
Viewed by 300
Abstract
Image signal processing (ISP), a critical component of camera imaging, has traditionally relied on experience-driven parameter tuning, an approach that suffers from inefficiency, fidelity issues, and conflicts with visual enhancement objectives. This paper introduces ReEn-GAN, a staged ISP proxy tuning framework. ReEn-GAN decouples the ISP process into two distinct stages: reconstruction (physical signal recovery) and enhancement (visual quality and color optimization). By employing distinct network architectures and loss functions tailored to each objective, the two-stage proxy can effectively optimize both the reconstruction and enhancement modules within the ISP pipeline. Compared to tuning with an end-to-end proxy network, the proposed proxy more effectively extracts hierarchical information from the ISP pipeline, thereby mitigating the significant changes in image color and texture that often result from parameter adjustments in an end-to-end proxy model. We conduct experiments on image denoising and object detection tuning tasks and compare the performance of the two types of proxies. The results demonstrate that the proposed method outperforms end-to-end proxy methods on public datasets (SIDD, KITTI) and achieves over 21% improvement in performance metrics compared to hand-tuning methods.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
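As a rough illustration of the two-stage decoupling described above, the sketch below trains a reconstruction proxy with a fidelity (L1) loss and an enhancement proxy with an added adversarial term. The tiny stand-in networks, the patch critic, and the 0.01 loss weight are assumptions for illustration, not the ReEn-GAN architecture.

```python
import torch
import torch.nn as nn

def tiny_net(in_ch: int = 3, out_ch: int = 3) -> nn.Module:
    # Stand-in for a real proxy network (e.g., a U-Net); illustrative only.
    return nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, out_ch, 3, padding=1))

recon_proxy = tiny_net()    # stage 1: physical signal recovery
enhance_proxy = tiny_net()  # stage 2: visual quality / color optimization
disc = nn.Sequential(       # small patch critic for the adversarial term
    nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(16, 1, 4, stride=2, padding=1))

l1 = nn.L1Loss()
bce = nn.BCEWithLogitsLoss()

def stage1_loss(raw: torch.Tensor, recon_target: torch.Tensor) -> torch.Tensor:
    # Fidelity-only objective for the reconstruction stage.
    return l1(recon_proxy(raw), recon_target)

def stage2_loss(recon_out: torch.Tensor, enhance_target: torch.Tensor) -> torch.Tensor:
    # Adversarial + fidelity objective for the enhancement stage; detach()
    # keeps the gradients of the two stages decoupled.
    fake = enhance_proxy(recon_out.detach())
    logits = disc(fake)
    return l1(fake, enhance_target) + 0.01 * bce(logits, torch.ones_like(logits))
```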

16 pages, 3603 KiB  
Article
Improvement of a Subpixel Convolutional Neural Network for a Super-Resolution Image
by Muhammed Fatih Ağalday and Ahmet Çinar
Appl. Sci. 2025, 15(5), 2459; https://doi.org/10.3390/app15052459 - 25 Feb 2025
Viewed by 588
Abstract
Super-resolution is an image restoration technique that reconstructs high-resolution content from low-resolution images. It is particularly useful where low-resolution content needs to be enhanced, with applications in areas such as face recognition, medical imaging, and satellite imaging. Deep neural network models for single-image super-resolution are quite successful in terms of computational performance. In many such models, low-resolution images are first upscaled with methods such as bicubic interpolation; because the super-resolution process is then performed in the high-resolution space, it adds memory cost and computational complexity. In our proposed model, the low-resolution image is instead given directly as input to a convolutional neural network, reducing computational complexity, and a subpixel convolution layer learns a series of filters that map low-resolution feature maps to a high-resolution image. We add convolution layers to the efficient subpixel convolutional neural network (ESPCN) model and, to prevent gradient loss, pass each layer's feature information forward to the next layer. The resulting R-ESPCN model proposed in this paper remodels ESPCN to reduce the time required to perform real-time super-resolution on images. The results show that our method significantly improves accuracy and demonstrates the applicability of deep learning methods in the field of image data processing.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
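For readers unfamiliar with subpixel convolution, the sketch below shows an ESPCN-style network with an added skip connection that passes features forward, loosely mirroring the R-ESPCN idea described above. The layer sizes and activation choices are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class RESPCNSketch(nn.Module):
    """Rough ESPCN-style model with an extra skip connection; illustrative only."""
    def __init__(self, scale: int = 3, channels: int = 1):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(channels, 64, 5, padding=2), nn.Tanh())
        self.body = nn.Sequential(nn.Conv2d(64, 64, 3, padding=1), nn.Tanh())
        # Produce scale**2 sub-pixel channels per output channel...
        self.tail = nn.Conv2d(64, channels * scale ** 2, 3, padding=1)
        # ...then rearrange them into a (H*scale, W*scale) image.
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.head(x)
        # Skip connection: carry the current features forward to mitigate
        # gradient loss, as the abstract describes.
        h = h + self.body(h)
        return self.shuffle(self.tail(h))
```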

15 pages, 3184 KiB  
Article
A Lightweight Single-Image Super-Resolution Method Based on the Parallel Connection of Convolution and Swin Transformer Blocks
by Tengyun Jing, Cuiyin Liu and Yuanshuai Chen
Appl. Sci. 2025, 15(4), 1806; https://doi.org/10.3390/app15041806 - 10 Feb 2025
Viewed by 708
Abstract
In recent years, with the development of deep learning technologies, Vision Transformers combined with Convolutional Neural Networks (CNNs) have made significant progress in single-image super-resolution (SISR). However, existing methods still face issues such as incomplete high-frequency information reconstruction, training instability caused by residual connections, and insufficient cross-window information exchange. To address these problems and better leverage both local and global information, this paper proposes a super-resolution reconstruction network based on the Parallel Connection of Convolution and Swin Transformer Block (PCCSTB) to model the local and global features of an image. Specifically, through a parallel structure of channel-feature-enhanced convolution and a Swin Transformer, the network extracts, enhances, and fuses local and global information, and a dedicated fusion module integrates the global and local features extracted by the CNN. The experimental results show that the proposed network effectively balances SR performance and network complexity, achieving good results in the lightweight SR domain. For instance, in the 4× super-resolution experiment on the Urban100 dataset, the network reaches an inference speed of 55 frames per second under the same device conditions, more than seven times as fast as the state-of-the-art Shifted Window-based Image Restoration (SwinIR) network. Moreover, its Peak Signal-to-Noise Ratio (PSNR) exceeds SwinIR's by 0.29 dB at the 4× scale on the Set5 dataset, indicating that the network performs high-resolution image reconstruction efficiently.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
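The sketch below illustrates the parallel local/global pattern: a convolution branch and a self-attention branch run side by side and are fused by a 1×1 convolution. A plain multi-head attention layer stands in for the Swin Transformer block, so this is a structural sketch only, not the PCCSTB design.

```python
import torch
import torch.nn as nn

class ParallelConvAttnBlock(nn.Module):
    """Parallel local (conv) / global (attention) block with 1x1 fusion.
    MultiheadAttention is a stand-in for the paper's Swin Transformer branch."""
    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.local = nn.Sequential(nn.Conv2d(dim, dim, 3, padding=1), nn.GELU())
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fuse = nn.Conv2d(2 * dim, dim, 1)  # fuse local + global features

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = self.local(x)
        tokens = x.flatten(2).transpose(1, 2)            # (B, H*W, C)
        global_, _ = self.attn(tokens, tokens, tokens)   # global self-attention
        global_ = global_.transpose(1, 2).reshape(b, c, h, w)
        return x + self.fuse(torch.cat([local, global_], dim=1))
```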

16 pages, 1820 KiB  
Article
GAN-Based Map Generation Technique of Aerial Image Using Residual Blocks and Canny Edge Detector
by Jongwook Si and Sungyoung Kim
Appl. Sci. 2024, 14(23), 10963; https://doi.org/10.3390/app142310963 - 26 Nov 2024
Cited by 1 | Viewed by 952
Abstract
As the significance of meticulous and precise map creation grows in modern Geographic Information Systems (GISs), urban planning, disaster response, and other domains, the need for sophisticated map generation technology has become increasingly evident. In response to this demand, this paper puts forward a technique based on Generative Adversarial Networks (GANs) for converting aerial imagery into high-quality maps. The proposed method, comprising a generator and a discriminator, introduces two strategies to overcome existing challenges: a Canny edge detector and residual blocks. The proposed loss function enhances the generator's performance by assigning greater weight to edge regions using the Canny edge map and eliminating superfluous information. This approach improves the visual quality of the generated maps and ensures that fine details are captured accurately. The experimental results demonstrate that the method generates maps of superior visual quality, achieving outstanding performance compared with existing methodologies, and show that the proposed technology has significant potential for practical application in a range of real-world scenarios.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
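The edge-weighted loss idea can be sketched in a few lines: an L1 term whose per-pixel weights are raised on the target's Canny edge map. The thresholds and weight factor below are illustrative assumptions, not the paper's values.

```python
import cv2
import numpy as np
import torch

def edge_weighted_l1(fake: torch.Tensor, real: torch.Tensor,
                     real_bgr: np.ndarray, edge_weight: float = 2.0) -> torch.Tensor:
    """L1 loss that up-weights pixels on the target's Canny edge map.
    `fake`/`real` are (C, H, W) CPU tensors; `real_bgr` is the same target
    as a uint8 OpenCV image. Thresholds/weight are illustrative."""
    edges = cv2.Canny(real_bgr, 100, 200).astype(np.float32) / 255.0
    weights = 1.0 + (edge_weight - 1.0) * torch.from_numpy(edges)  # (H, W)
    return (weights * (fake - real).abs()).mean()  # broadcasts over channels
```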

15 pages, 2189 KiB  
Article
Entropy-Based Ensemble of Convolutional Neural Networks for Clothes Texture Pattern Recognition
by Reham Al-Majed and Muhammad Hussain
Appl. Sci. 2024, 14(22), 10730; https://doi.org/10.3390/app142210730 - 20 Nov 2024
Cited by 1 | Viewed by 897
Abstract
Automatic clothes pattern recognition is important for assisting visually impaired people and for real-world applications such as e-commerce and personal fashion recommendation systems, and it has attracted increasing interest from researchers. It is a challenging texture classification problem in that even images of the same texture class exhibit a high degree of intraclass variation; moreover, images of clothes patterns may be taken in unconstrained illumination environments. Machine learning methods proposed for this problem mostly rely on handcrafted features and traditional classification methods, while prior deep learning attempts have yielded poor recognition performance. We propose a deep learning method based on an ensemble of convolutional neural networks that requires no feature engineering while extracting robust local and global features of clothes patterns. The ensemble classifier employs a pre-trained ResNet50 with a non-local (NL) block, a squeeze-and-excitation (SE) block, and a coordinate attention (CA) block as base learners. To fuse the individual decisions of the base learners, we introduce a simple and effective fusion technique based on entropy voting, which incorporates the uncertainty in each base learner's decision. We validate the proposed method on benchmark clothes pattern datasets with six categories: solid, striped, checkered, dotted, zigzag, and floral. The method achieves promising results with limited computational and data resources: 98.18% accuracy on the GoogleClothingDataset and 96.03% on the CCYN dataset.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
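One plausible reading of entropy voting is to weight each base learner's class probabilities inversely to their entropy, so that confident learners dominate the fused decision. The sketch below implements that reading and is an assumption for illustration, not the paper's exact fusion rule.

```python
import numpy as np

def entropy_vote(probabilities: list) -> np.ndarray:
    """Fuse base-learner class-probability vectors, weighting each learner
    inversely to the entropy (uncertainty) of its prediction."""
    fused = np.zeros_like(probabilities[0])
    for p in probabilities:
        entropy = -np.sum(p * np.log(p + 1e-12))  # low entropy = confident
        fused += p / (entropy + 1e-12)
    return fused / fused.sum()

# Example: the confident learner dominates the fused vote.
p1 = np.array([0.90, 0.05, 0.05])  # confident base learner
p2 = np.array([0.40, 0.30, 0.30])  # uncertain base learner
print(entropy_vote([p1, p2]))
```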

19 pages, 829 KiB  
Article
A New Image Oversampling Method Based on Influence Functions and Weights
by Jun Ye, Shoulei Lu and Jiawei Chen
Appl. Sci. 2024, 14(22), 10553; https://doi.org/10.3390/app142210553 - 15 Nov 2024
Viewed by 822
Abstract
Although imbalanced data have been studied for many years, data imbalance remains a major problem in machine learning and artificial intelligence, and the development of deep learning has further expanded its impact, so studying imbalanced data classification is of practical significance. We propose an image oversampling algorithm based on the influence function and sample weights. Our scheme not only synthesizes high-quality minority-class samples but also preserves the original features and information of minority-class images. To address the lack of visually reasonable features when SMOTE synthesizes images directly, we modify a pre-trained model by removing its pooling and fully connected layers and use its convolutional layers to extract the important features of each image. We then apply SMOTE interpolation to these extracted features and feed the synthesized features into a DCGAN generator, which maps them into the high-dimensional image space to produce realistic images. To verify that our scheme synthesizes high-quality images and thus improves classification accuracy, we conduct experiments on the processed CIFAR10, CIFAR100, and ImageNet-LT datasets.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
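The feature-space interpolation step can be sketched as classic SMOTE applied to CNN feature vectors rather than raw pixels; the synthesized vectors would then be handed to a generator (e.g., a DCGAN) to be mapped back to image space, as the abstract describes. The neighbor count and seed below are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def smote_in_feature_space(feats: np.ndarray, n_new: int, k: int = 5,
                           seed: int = 0) -> np.ndarray:
    """SMOTE-style interpolation between minority-class feature vectors.
    `feats` has shape (n_samples, n_features); requires n_samples > k."""
    rng = np.random.default_rng(seed)
    neighbors = NearestNeighbors(n_neighbors=k + 1).fit(feats)
    _, idx = neighbors.kneighbors(feats)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(feats))
        j = idx[i][rng.integers(1, k + 1)]  # random neighbor (skip self at 0)
        lam = rng.random()
        # New sample on the line segment between feats[i] and feats[j].
        synthetic.append(feats[i] + lam * (feats[j] - feats[i]))
    return np.stack(synthetic)
```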

26 pages, 4018 KiB  
Article
A MediaPipe Holistic Behavior Classification Model as a Potential Model for Predicting Aggressive Behavior in Individuals with Dementia
by Ioannis Galanakis, Rigas Filippos Soldatos, Nikitas Karanikolas, Athanasios Voulodimos, Ioannis Voyiatzis and Maria Samarakou
Appl. Sci. 2024, 14(22), 10266; https://doi.org/10.3390/app142210266 - 7 Nov 2024
Cited by 1 | Viewed by 1572
Abstract
This paper introduces a classification model that detects and classifies argumentative behaviors between two individuals using a machine learning application based on the MediaPipe Holistic model. The approach distinguishes two classes of behavior between two individuals, argumentative and non-argumentative, corresponding to verbal argumentative behavior. Using a dataset of hand gesture, body stance, and facial expression landmarks extracted from video frames, three classification models were trained and evaluated. The results indicate that the Random Forest classifier outperformed the other two, classifying argumentative behaviors with 68.07% accuracy and non-argumentative behaviors with 94.18% accuracy, respectively. There is thus future scope for advancing this classification model into a prediction model, with the aim of predicting aggressive behavior in patients with dementia before its onset.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
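A minimal sketch of the landmark-extraction step, assuming the standard MediaPipe Python solutions API: each frame is flattened into a vector of (x, y, z) landmark coordinates that a downstream classifier such as a Random Forest can consume. The zero-padding caveat is an implementation detail the abstract does not specify.

```python
import cv2
import mediapipe as mp

holistic = mp.solutions.holistic.Holistic(static_image_mode=True)

def frame_to_landmark_features(frame_bgr) -> list:
    """Flatten pose, face, and hand landmarks from one frame into a feature
    vector for a downstream classifier (e.g., a Random Forest)."""
    results = holistic.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    features = []
    for lm_set in (results.pose_landmarks, results.face_landmarks,
                   results.left_hand_landmarks, results.right_hand_landmarks):
        if lm_set is not None:
            features += [v for lm in lm_set.landmark for v in (lm.x, lm.y, lm.z)]
        # NOTE: in practice, missing landmark sets must be zero-padded so
        # that every frame yields an equal-length feature vector.
    return features
```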