AI-Based Image Processing: 2nd Edition

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 30 April 2024

Special Issue Editors

Dr. Jianping Luo
Guangdong Key Laboratory of Intelligent Information Processing, College of Electronics and Information Engineering, Shenzhen University, Shenzhen 518118, China
Interests: multi-objective evolutionary algorithms; evolutionary multi-objective optimization; Pareto fronts

Dr. Hongying Meng
Department of Electronic and Electrical Engineering, Brunel University London, London UB8 3PH, UK
Interests: artificial intelligence; machine learning

Special Issue Information

Dear Colleagues,

With the rapid development of artificial intelligence, a wide range of AI algorithms have been proposed, and intelligent applications are playing an increasingly important role in industrial production and social life. Nevertheless, applying AI to medical image processing, image processing for intelligent transportation systems, satellite image processing, face recognition, object recognition, and related areas remains challenging.

We are therefore organizing a Special Issue on “AI-Based Image Processing: 2nd Edition”, which focuses on tackling these pressing problems, and more specifically on the topics listed below:

  • Image processing algorithms;
  • Image analytics;
  • Medical image processing;
  • Biomedical image analysis;
  • Image generation;
  • Image restoration and enhancement;
  • Image compression;
  • Edge detection;
  • Image segmentation;
  • Semantic segmentation;
  • Image classification;
  • Image inpainting;
  • Image captioning;
  • Feature detection and extraction;
  • Content-based image retrieval;
  • Optical character recognition;
  • Face recognition;
  • Emotion recognition;
  • Gesture recognition;
  • Object recognition and tracking.

Dr. Jianping Luo
Dr. Hongying Meng
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image processing
  • feature detection and extraction
  • object recognition and tracking

Published Papers (8 papers)


Research

17 pages, 6014 KiB  
Article
Rectification for Stitched Images with Deformable Meshes and Residual Networks
by Yingbo Fan, Shanjun Mao, Mei Li, Zheng Wu, Jitong Kang and Ben Li
Appl. Sci. 2024, 14(7), 2821; https://doi.org/10.3390/app14072821 - 27 Mar 2024
Abstract
Image stitching is an important technique in digital image processing, but stitched images are often irregular in shape, and traditional cropping or completion methods usually cause substantial information loss. This paper therefore proposes an image rectification method based on deformable meshes and a residual network. The method aims to minimize information loss both at the edges of the stitched image and inside the image, and it can select the most suitable mesh shape for residual-network regression for each image. Its loss function combines a global term and a local term, minimizing the loss of image information within each grid cell and over the global target. The proposed method not only greatly reduces the information loss caused by irregular shapes after stitching, but also adapts to images with various rigid structures. Validation on the DIR-D dataset shows that the method outperforms state-of-the-art image rectification methods.
(This article belongs to the Special Issue AI-Based Image Processing: 2nd Edition)

12 pages, 3775 KiB  
Article
Multi-Identity Recognition of Darknet Vendors Based on Metric Learning
by Yilei Wang, Yuelin Hu, Wenliang Xu and Futai Zou
Appl. Sci. 2024, 14(4), 1619; https://doi.org/10.3390/app14041619 - 17 Feb 2024
Abstract
Dark web vendor identification can be seen as an authorship-aliasing problem: determining whether accounts on different markets belong to the same real-world vendor, in order to locate cybercriminals involved in dark web market transactions. Existing open-source datasets for dark web marketplaces are outdated and cannot simulate real-world situations, and data labeling is difficult, suffering from inaccurate labels and limited cross-market coverage. Identifying vendors’ multiple identities on the dark web involves a large number of categories but few samples per category, making traditional multiclass classification models difficult to apply. To address these issues, this paper proposes a metric learning-based method for dark web vendor identification, collecting product data from 21 currently active English dark web marketplaces and using a multi-dimensional feature extraction method based on product titles, descriptions, and images. Combining pseudo-labeling with manual labeling improves labeling accuracy compared with previous methods. The proposed method uses a Siamese neural network with metric learning to learn the similarity between vendors and thereby recognize vendors’ multiple identities, achieving an average F1-score of 0.889 and an accuracy of 97.535% on the constructed dataset. The contributions of this paper are a method for collecting and labeling dark web marketplace data, and a way to overcome the limitations of traditional multiclass classifiers so as to effectively recognize vendors’ multiple identities.
(This article belongs to the Special Issue AI-Based Image Processing: 2nd Edition)
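The Siamese idea in the abstract above, two inputs passed through one shared embedding and trained so that same-vendor pairs land close together while different-vendor pairs are pushed apart, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the linear embedding, cosine similarity, and contrastive loss below are generic stand-ins for whatever networks and losses the paper actually uses.

```python
import numpy as np

def embed(x, w):
    """Toy embedding branch shared by both inputs (the 'Siamese' part):
    one linear layer followed by L2 normalization."""
    v = x @ w
    return v / np.linalg.norm(v)

def siamese_similarity(x1, x2, w):
    """Cosine similarity between two feature vectors passed through the
    same shared embedding; near 1.0 suggests the same vendor."""
    return float(embed(x1, w) @ embed(x2, w))

def contrastive_loss(x1, x2, same, w, margin=0.5):
    """Contrastive loss: pull same-vendor pairs together, push
    different-vendor pairs at least `margin` apart in embedding distance."""
    d = np.linalg.norm(embed(x1, w) - embed(x2, w))
    return d ** 2 if same else max(0.0, margin - d) ** 2
```

Training would minimize `contrastive_loss` over labeled pairs; at inference, a similarity threshold decides whether two accounts share a vendor.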

20 pages, 24378 KiB  
Article
Diffusion-Denoising Process with Gated U-Net for High-Quality Document Binarization
by Sangkwon Han, Seungbin Ji and Jongtae Rhee
Appl. Sci. 2023, 13(20), 11141; https://doi.org/10.3390/app132011141 - 10 Oct 2023
Abstract
The binarization of degraded documents represents a crucial preprocessing task for various document analyses, including optical character recognition and historical document analysis. Various convolutional neural network models and generative models have been used for document binarization. However, these models often struggle to deliver generalized performance on noise types the model has not encountered during training and may have difficulty extracting intricate text strokes. We herein propose a novel approach to address these challenges by introducing the use of the latent diffusion model, a well-known high-quality image-generation model, into the realm of document binarization for the first time. By leveraging an iterative diffusion-denoising process within the latent space, our approach excels at producing high-quality, clean, binarized images and demonstrates excellent generalization using both data distribution and time steps during training. Furthermore, we enhance our model’s ability to preserve text strokes by incorporating a gated U-Net into the backbone network. The gated convolution mechanism allows the model to focus on the text region by combining gating values and features, facilitating the extraction of intricate text strokes. To maximize the effectiveness of our proposed model, we use a combination of the latent diffusion model loss and pixel-level loss, which aligns with the model’s structure. The experimental results on the Handwritten Document Image Binarization Contest and Document Image Binarization Contest benchmark datasets showcase the superior performance of our proposed model compared to existing methods.
(This article belongs to the Special Issue AI-Based Image Processing: 2nd Edition)
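The gated convolution mechanism described above, combining gating values and features so the network can focus on text regions, reduces at its core to multiplying one convolution branch by the sigmoid of a parallel branch. A minimal single-channel NumPy sketch follows; it is illustrative only, since the paper's gated U-Net layers are multi-channel with learned kernels.

```python
import numpy as np

def conv2d(x, k):
    """Naive 'valid'-mode 2-D cross-correlation of a single-channel
    image x with kernel k (stand-in for a learned conv layer)."""
    kh, kw = k.shape
    h, w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def gated_conv(x, k_feat, k_gate):
    """Gated convolution: one branch computes features, a parallel branch
    computes a soft spatial mask in (0, 1); their elementwise product lets
    the network attenuate uninformative (non-text) regions."""
    feat = conv2d(x, k_feat)
    gate = 1.0 / (1.0 + np.exp(-conv2d(x, k_gate)))  # sigmoid gating values
    return feat * gate
```

Because the gate is strictly between 0 and 1, the output magnitude at every pixel is bounded by the ungated feature response; the gate branch learns where to pass features through.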

21 pages, 5457 KiB  
Article
Research on Students’ Action Behavior Recognition Method Based on Classroom Time-Series Images
by Zhaoyu Shou, Mingbang Yan, Hui Wen, Jinghua Liu, Jianwen Mo and Huibing Zhang
Appl. Sci. 2023, 13(18), 10426; https://doi.org/10.3390/app131810426 - 18 Sep 2023
Abstract
Students’ action behavior is an important part of classroom teaching evaluation. This paper proposes a method for recognizing students’ action behaviors from classroom time-series images: action behaviors are detected in classroom teaching videos, and from the detection results the action behavior sequence of each student during the teaching time of a knowledge point is obtained and analyzed. First, we propose an improved Asynchronous Interaction Aggregation (AIA) network for student action behavior detection: adding a Multi-scale Temporal Attention (MsTA) module and a Multi-scale Channel Spatial Attention (MsCSA) module to the fast and slow pathways, respectively, of SlowFast, the backbone of the improved AIA network, improves the accuracy of action behavior recognition. Second, the Equalized Focal Loss function is introduced to mitigate the category imbalance in the student action behavior dataset. Experimental results on this dataset show that the proposed method can detect different student action behaviors in the classroom and achieves better detection performance than the original AIA network. Finally, based on the recognition results, the seat number is used as an index to obtain each student’s action behavior sequence during the teaching time of a knowledge point, and the student’s performance in this period is analyzed.
(This article belongs to the Special Issue AI-Based Image Processing: 2nd Edition)

18 pages, 5975 KiB  
Article
Intelligent Simulation Technology Based on RCS Imaging
by Jiaxing Hao, Xuetian Wang, Sen Yang and Hongmin Gao
Appl. Sci. 2023, 13(18), 10119; https://doi.org/10.3390/app131810119 - 08 Sep 2023
Abstract
Target simulation of airplanes is an important research topic, and striking the right balance between high performance and low cost is particularly important. To reconcile realistic target simulation with controllable cost, scientifically specifying the performance parameters of the target simulation is the key to achieving high performance. This paper proposes an intelligent simulation technology based on RCS imaging that combines a 60° variation corner reflector with a Luneberg lens reflector, designed to simulate several important RCS characteristics of an aircraft. The different RCS images are automatically shifted to the corresponding gear position to achieve the simulation, at low cost and with good performance, and the system can be used for radar target-search training.
(This article belongs to the Special Issue AI-Based Image Processing: 2nd Edition)

22 pages, 4219 KiB  
Article
MixerNet-SAGA: A Novel Deep Learning Architecture for Superior Road Extraction in High-Resolution Remote Sensing Imagery
by Wei Wu, Chao Ren, Anchao Yin and Xudong Zhang
Appl. Sci. 2023, 13(18), 10067; https://doi.org/10.3390/app131810067 - 06 Sep 2023
Cited by 1
Abstract
In this study, we address the limitations of current deep learning models in road extraction tasks from remote sensing imagery. We introduce MixerNet-SAGA, a novel deep learning model that incorporates the strengths of U-Net, integrates a ConvMixer block for enhanced feature extraction, and includes a Scaled Attention Gate (SAG) for augmented spatial attention. Experimental validation on the Massachusetts road dataset and the DeepGlobe road dataset demonstrates that MixerNet-SAGA achieves a 10% improvement in precision, 8% in recall, and 12% in IoU compared to leading models such as U-Net, ResNet, and SDUNet. Furthermore, our model excels in computational efficiency, being 20% faster, and has a smaller model size. Notably, MixerNet-SAGA shows exceptional robustness against challenges such as same-spectrum–different-object and different-spectrum–same-object phenomena. Ablation studies further reveal the critical roles of the ConvMixer block and SAG. Despite its strengths, the model’s scalability to extremely large datasets remains an area for future investigation. Collectively, MixerNet-SAGA offers an efficient and accurate solution for road extraction in remote sensing imagery and presents significant potential for broader applications.
(This article belongs to the Special Issue AI-Based Image Processing: 2nd Edition)

17 pages, 5445 KiB  
Article
RepNet: A Lightweight Human Pose Regression Network Based on Re-Parameterization
by Xinjing Zhang and Qixun Zhou
Appl. Sci. 2023, 13(16), 9475; https://doi.org/10.3390/app13169475 - 21 Aug 2023
Abstract
Human pose estimation, as a foundation of advanced computer vision, has broad application prospects. In existing studies, high-capacity heatmap-based models achieve accurate recognition results but encounter many difficulties in real-world scenarios. To solve this problem, we propose a lightweight pose regression algorithm (RepNet) that introduces a re-parameterizable network structure, fuses multi-level features, and incorporates the idea of residual likelihood estimation. A well-designed convolutional architecture is used for training; by re-parameterizing each level, the network model is simplified and the computation time and efficiency of the detection task are optimized. Prediction performance is further improved by the maximum-likelihood output and the invertible transformation of the underlying distribution learned by a flow-based generative model. RepNet achieves a recognition accuracy of 66.1 AP on the COCO dataset, at a computational speed of 15 ms on GPU and 40 ms on CPU. This resolves the tension between prediction accuracy and computational complexity and contributes to research on lightweight pose estimation.
(This article belongs to the Special Issue AI-Based Image Processing: 2nd Edition)

19 pages, 35842 KiB  
Article
Real-Time Intelligent Detection System for Illegal Wearing of On-Site Power Construction Worker Based on Edge-YOLO and Low-Cost Edge Devices
by Rong Chang, Bangyuan Li, Junpeng Dang, Chuanxu Yang, Anning Pan and Yang Yang
Appl. Sci. 2023, 13(14), 8287; https://doi.org/10.3390/app13148287 - 18 Jul 2023
Abstract
Ensuring personal safety and preventing accidents are critical aspects of power construction safety supervision. However, current monitoring methods are inefficient and unreliable, as most rely on manual monitoring and transmission, resulting in slow detection and delayed warnings of violations. To overcome these challenges, we propose an intelligent detection system that accurately identifies instances of illegal wearing by power construction workers in real time. Firstly, we integrated the squeeze-and-excitation (SE) module into our convolutional neural network to enhance detection accuracy. This module effectively prioritizes informative features while suppressing less relevant ones, improving overall performance. Secondly, we present an embedded real-time detection system that utilizes Jetson Xavier NX and Edge-YOLO, which promptly detects instances of illegal wearing behavior and alerts power construction workers. To ensure a lightweight implementation, we design appropriate detection heads based on target size and distribution, reducing model parameters while enhancing detection speed and minimizing accuracy loss. Additionally, we employed data augmentation to enhance the system’s robustness. Our experimental results demonstrate that our improved Edge-YOLO model achieves high detection precision and recall rates of 0.964 and 0.966, respectively, with a frame rate of 35.36 frames per second when deployed on Jetson Xavier NX. Edge-YOLO therefore proves to be an ideal choice for intelligent real-time detection systems, providing superior accuracy and speed compared to the original YOLOv5s model and other models in the YOLO series for safety monitoring at construction sites.
(This article belongs to the Special Issue AI-Based Image Processing: 2nd Edition)
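The squeeze-and-excitation (SE) module mentioned in the abstract above follows a standard recipe: globally average-pool each channel into a descriptor, pass the descriptor vector through a small bottleneck MLP ending in a sigmoid, and rescale the channels by the resulting weights. The NumPy sketch below is illustrative only and not the authors' code; real SE blocks sit inside a trained network with learned weights and batched tensors.

```python
import numpy as np

def squeeze_excite(x, w1, b1, w2, b2):
    """Squeeze-and-Excitation on one feature map.

    x  : feature map of shape (C, H, W)
    w1 : (C, C//r) and w2 : (C//r, C) bottleneck weights (reduction ratio r),
         with biases b1, b2 -- in practice these are learned parameters.
    """
    # Squeeze: global average pool each channel to a single descriptor
    z = x.mean(axis=(1, 2))                       # (C,)
    # Excite: bottleneck MLP, ReLU then sigmoid -> per-channel weights
    h = np.maximum(z @ w1 + b1, 0.0)              # (C//r,)
    s = 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))      # (C,) weights in (0, 1)
    # Scale: reweight each channel, emphasizing informative ones
    return x * s[:, None, None]
```

Since every channel weight lies strictly in (0, 1), the module can only attenuate channels relative to the input, which is how it suppresses less relevant features.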
