Artificial Intelligence (AI) for Image Processing

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: closed (28 February 2023) | Viewed by 25602

Special Issue Editor


Dr. Mohammed A. A. Al-qaness
Guest Editor
School of Physics and Electronic Information Engineering, Zhejiang Normal University, Jinhua 321004, China
Interests: artificial intelligence; swarm intelligence; deep learning; data science; remote sensing; hyperspectral image processing

Special Issue Information

Dear Colleagues,

We are pleased to launch this Special Issue on “Artificial Intelligence (AI) for Image Processing”. AI methods are widely adopted in academia and industry for a range of applications, including image processing tasks such as image segmentation, classification, and recognition. AI encompasses several families of techniques, including machine learning, metaheuristic optimization algorithms (such as swarm intelligence (SI) and bio-inspired algorithms), and knowledge-based and expert systems.

This Special Issue provides a forum for the publication of articles describing the use of classical and modern artificial intelligence methods in image processing applications.

The main aim of this Special Issue is to gather recent high-quality contributions on advanced image processing and analysis applications, including medical images, remote sensing images, galaxy images, and others. We invite our colleagues to contribute original research papers as well as review papers focusing on the application of artificial intelligence methods, including traditional machine learning, advanced deep learning approaches, metaheuristic optimization algorithms, and other AI-based methods, to image processing problems.

The topics of this Special Issue include (but are not limited to) the following:

  • Image classification and recognition
  • Machine learning for image processing
  • Metaheuristic optimization algorithms for image processing
  • Remote sensing image classification
  • Medical image classification
  • Neural computing for image processing
  • Evolutionary algorithms for image processing

Dr. Mohammed A. A. Al-qaness
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website and completing the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers are published continuously in the journal (as soon as accepted) and listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for the submission of manuscripts is available on the Instructions for Authors page. Electronics is an international, peer-reviewed, open access, semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Image processing
  • Deep learning
  • Metaheuristic algorithms
  • Swarm intelligence
  • Medical image processing
  • Image segmentation

Published Papers (8 papers)


Research

18 pages, 7077 KiB  
Article
ASA-DRNet: An Improved Deeplabv3+ Framework for SAR Image Segmentation
by Siyuan Chen, Xueyun Wei and Wei Zheng
Electronics 2023, 12(6), 1300; https://doi.org/10.3390/electronics12061300 - 8 Mar 2023
Cited by 1 | Viewed by 1661
Abstract
Pollution caused by oil spills does irreversible harm to marine ecosystems. Synthetic Aperture Radar (SAR) has emerged as a crucial means of finding maritime oil spills, and accurately distinguishing oil spill areas from other types of areas is a critical step in detecting them. Owing to its capacity to extract multiscale features and its distinctive decoder, the Deeplabv3+ framework has developed into an excellent deep learning model in the field of image segmentation. However, in some SAR images, the segmentation of oil film edges lacks clarity and small areas are segmented incorrectly. To solve these problems, an improved network, named ASA-DRNet, is proposed. Firstly, a new structure combining an axial self-attention module with ResNet-18 is proposed as the backbone of the DeepLabv3+ encoder. Secondly, the atrous spatial pyramid pooling (ASPP) module is optimized to improve the network's capacity to extract multiscale features and to increase the speed of model computation; finally, low-level features of different resolutions are merged to enhance the network's ability to extract edge information. Experiments show that ASA-DRNet obtains better results than other neural network models.
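For readers unfamiliar with axial attention, the following minimal PyTorch sketch illustrates the idea behind the backbone described above: self-attention is applied along one spatial axis at a time (height, then width), which is far cheaper than full 2-D self-attention. The single-head design and all names here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class AxialSelfAttention(nn.Module):
    """Single-head self-attention applied along one spatial axis."""
    def __init__(self, channels: int):
        super().__init__()
        self.qkv = nn.Conv2d(channels, channels * 3, kernel_size=1)
        self.scale = channels ** -0.5

    def forward(self, x: torch.Tensor, axis: int) -> torch.Tensor:
        # x: (B, C, H, W); attend along H (axis=2) or W (axis=3)
        q, k, v = self.qkv(x).chunk(3, dim=1)
        if axis == 3:  # move the attended axis into position 2
            q, k, v = (t.transpose(2, 3) for t in (q, k, v))
        q = q.permute(0, 3, 2, 1)                     # (B, S, L, C)
        k = k.permute(0, 3, 1, 2)                     # (B, S, C, L)
        v = v.permute(0, 3, 2, 1)                     # (B, S, L, C)
        attn = torch.softmax(q @ k * self.scale, -1)  # (B, S, L, L)
        out = (attn @ v).permute(0, 3, 2, 1)          # back to (B, C, L, S)
        if axis == 3:
            out = out.transpose(2, 3)
        return x + out  # residual connection

# Height-axis then width-axis attention over a ResNet-18 feature map
feat = torch.randn(1, 64, 32, 32)
block = AxialSelfAttention(64)
out = block(block(feat, axis=2), axis=3)
```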

14 pages, 1214 KiB  
Article
Deep-Learning-Based Sequence Causal Long-Term Recurrent Convolutional Network for Data Fusion Using Video Data
by DaeHyeon Jeon and Min-Suk Kim
Electronics 2023, 12(5), 1115; https://doi.org/10.3390/electronics12051115 - 24 Feb 2023
Viewed by 1588
Abstract
The purpose of AI-based schemes in intelligent systems is to advance and optimize system performance. Most intelligent systems adopt sequential data types derived from such systems; real-time video data, for example, are continuously updated as a sequence to make the predictions needed for efficient system performance. Deep-learning-based network architectures such as long short-term memory (LSTM), data fusion, two-stream networks, and temporal convolutional networks (TCN) are generally used for sequence data fusion to enhance robust system efficiency. In this paper, we propose a deep-learning-based neural network architecture for non-fixed-length data that uses both a causal convolutional neural network (CNN) and a long-term recurrent convolutional network (LRCN). Causal CNNs and LRCNs use convolutional layers for feature extraction, so both architectures can process sequential data such as time series or video in a variety of applications; they extract features from the input sequence to reduce its dimensionality and capture the important information, and they learn hierarchical representations for effective sequence processing tasks. We also adopt the concept of the series compact convolutional recurrent neural network (SCCRNN), a neural network architecture for processing sequential data that compactly combines convolutional and recurrent layers, reducing the number of parameters and memory usage while maintaining high accuracy. The architecture is well suited to continuously incoming sequential video data and brings together the advantages of LSTM-based and CNN-based networks. To verify this method, we evaluated a sequence learning model, with the network parameters and memory required in real environments, on the UCF-101 dataset, an action recognition dataset of realistic action videos collected from YouTube with 101 action categories. The results show that the proposed sequence causal long-term recurrent convolutional network (SCLRCN) provides a performance improvement of approximately 12% or more compared with the existing models (LRCN and TCN).
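The causal convolution at the heart of the causal-CNN branch can be sketched in a few lines of PyTorch: padding only on the left guarantees that the output at time t depends solely on inputs at times up to t. The layer sizes below are arbitrary assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    """1-D convolution that only sees past and present time steps."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation  # left padding preserves causality
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)

    def forward(self, x):
        # x: (batch, channels, time); pad the past, never the future
        return self.conv(F.pad(x, (self.pad, 0)))

x = torch.randn(2, 16, 100)                      # 16 features over 100 steps
y = CausalConv1d(16, 32, kernel_size=3, dilation=2)(x)
print(y.shape)                                   # torch.Size([2, 32, 100])
```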

14 pages, 17709 KiB  
Article
Rice Disease Identification Method Based on Attention Mechanism and Deep Dense Network
by Minlan Jiang, Changguang Feng, Xiaosheng Fang, Qi Huang, Changjiang Zhang and Xiaowei Shi
Electronics 2023, 12(3), 508; https://doi.org/10.3390/electronics12030508 - 18 Jan 2023
Cited by 11 | Viewed by 2037
Abstract
It is of great practical significance to quickly, accurately, and effectively identify rice diseases, which affect rice yield. This paper proposes a rice disease identification method based on an improved DenseNet. The method uses DenseNet as the benchmark model and applies the squeeze-and-excitation channel attention mechanism to strengthen favorable features while suppressing unfavorable ones. Depthwise separable convolutions are then introduced to replace some standard convolutions in the dense network, improving parameter utilization and training speed. Using the AdaBound algorithm, combined with an adaptive optimization method, the parameter adjustment time is reduced. In experiments on five rice disease datasets, the average classification accuracy of the proposed method is 99.4%, which is 13.8 percentage points higher than that of the original model. Compared with other existing recognition methods, such as ResNet, VGG, and Vision Transformer, the proposed method achieves higher recognition accuracy, realizes effective classification of rice disease images, and provides a new approach for the development of crop disease identification technology and smart agriculture.
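The two building blocks named in the abstract, squeeze-and-excitation and depthwise separable convolution, are standard components and can be sketched as follows in PyTorch; the channel counts and reduction ratio are common defaults, not values taken from the paper.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: reweight channels using global context."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))   # squeeze to (B, C), then excite
        return x * w[:, :, None, None]    # per-channel rescaling

def depthwise_separable(in_ch, out_ch):
    """A per-channel 3x3 conv followed by a 1x1 pointwise conv."""
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch),  # depthwise
        nn.Conv2d(in_ch, out_ch, 1))                          # pointwise

x = torch.randn(1, 64, 56, 56)
y = SEBlock(64)(depthwise_separable(64, 64)(x))
```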

19 pages, 14182 KiB  
Article
RGB-Based Triple-Dual-Path Recurrent Network for Underwater Image Dehazing
by Fayadh Alenezi
Electronics 2022, 11(18), 2894; https://doi.org/10.3390/electronics11182894 - 13 Sep 2022
Cited by 5 | Viewed by 1250
Abstract
In this paper, we present a powerful underwater image dehazing technique that exploits two image characteristics: RGB color channels and image features. Each color channel is decomposed into two units based on pixel similarities via k-means clustering. This markedly improves the adaptability and identification of similar pixels, reducing pixels with weak correlation and leaving only those with higher correlation. We use an infinite impulse response (IIR) filter in a triple-dual, parallel interaction structure to suppress hazed pixels via pixel comparison and amplification, increasing the visibility of even very minor features. This improves the visual perception of the final image, and thus its overall usefulness and quality. A softmax-weighted fusion is finally used to fuse the output color channel features into the final image. This preserves color, leaving the output of our proposed method very true to the original scene; it is accomplished through adaptive learning based on the confidence levels of the pixel contribution variation in each color channel during subsequent fusions. The proposed technique outperforms existing methods both visually and objectively in several rigorous tests.
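The first stage described above, splitting each color channel into two units by pixel similarity, can be approximated with k-means (k = 2) applied per channel. The use of OpenCV and scikit-learn, and the file name, are our assumptions for illustration, not the paper's implementation.

```python
import numpy as np
import cv2
from sklearn.cluster import KMeans

img = cv2.imread("underwater.png")             # hypothetical input image (BGR)
units = []
for c in range(3):                             # one clustering per color channel
    pixels = img[:, :, c].reshape(-1, 1).astype(np.float32)
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(pixels)
    mask = labels.reshape(img.shape[:2])       # 0/1 map of similar pixels
    units.append((img[:, :, c] * (mask == 0),  # first pixel group
                  img[:, :, c] * (mask == 1))) # second pixel group
```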

20 pages, 4514 KiB  
Article
Detection of Diabetic Retinopathy in Retinal Fundus Images Using CNN Classification Models
by Al-Omaisi Asia, Cheng-Zhang Zhu, Sara A. Althubiti, Dalal Al-Alimi, Ya-Long Xiao, Ping-Bo Ouyang and Mohammed A. A. Al-Qaness
Electronics 2022, 11(17), 2740; https://doi.org/10.3390/electronics11172740 - 31 Aug 2022
Cited by 22 | Viewed by 8981
Abstract
Diabetes is a widespread disease that can lead to diabetic retinopathy, macular edema, and other visible microvascular complications in the retina of the human eye. This study attempts to detect diabetic retinopathy (DR), which has been a main cause of blindness over the last decade. Timely or early treatment is necessary to prevent some DR complications and to control blood glucose. DR is very difficult to detect with time-consuming manual diagnosis because of its diversity and complexity. This work utilizes a deep learning application, the convolutional neural network (CNN), on fundus photographs to distinguish the stages of DR. The image dataset in this study was obtained from Xiangya No. 2 Hospital Ophthalmology (XHO), Changsha, China; it is very large and its labels are unbalanced. Thus, this study first addresses these dataset problems with preprocessing, regularization, and augmentation steps that enlarge and prepare the XHO image dataset for training and improve performance. It then leverages the power of CNNs with different structures, namely ResNet-101, ResNet-50, and VggNet-16, to detect DR on the XHO dataset. ResNet-101 achieved the maximum accuracy of 0.9888, with a training loss of 0.3499 and a testing loss of 0.9882. ResNet-101 was then assessed on 1787 photos from the HRF, STARE, DIARETDB0, and XHO databases, achieving an average accuracy of 0.97, which is higher than in prior efforts. The results prove that the CNN model (ResNet-101) achieves better accuracy than ResNet-50 and VggNet-16 in DR image classification.
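Adapting an ImageNet-pretrained ResNet-101 to DR grading amounts to swapping the classifier head, as in the torchvision sketch below. Five classes is our assumption based on common DR grading scales; the abstract does not state the exact class count.

```python
import torch
import torch.nn as nn
from torchvision import models

num_classes = 5  # assumed number of DR stages
model = models.resnet101(weights=models.ResNet101_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new classifier head

logits = model(torch.randn(1, 3, 224, 224))  # one preprocessed fundus image
print(logits.shape)                          # torch.Size([1, 5])
```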

15 pages, 3014 KiB  
Article
Segmentation of Spectral Plant Images Using Generative Adversary Network Techniques
by Sanjay Kumar, Sahil Kansal, Monagi H. Alkinani, Ahmed Elaraby, Saksham Garg, Shanthi Natarajan and Vishnu Sharma
Electronics 2022, 11(16), 2611; https://doi.org/10.3390/electronics11162611 - 20 Aug 2022
Cited by 2 | Viewed by 1542
Abstract
Spectral image analysis of complex analytical systems is commonly performed in analytical chemistry. During spectral image analysis, signals associated with the key analytes present in an image scene are extracted. Accordingly, the first step in spectral image analysis is to segment the image in order to extract the applicable signals for analysis. Traditional image segmentation methods in chemometrics make it difficult to extract the relevant signals: none of these approaches incorporate the contextual information present in an image scene, so classification is limited to thresholds or pixels only. An image-translation pixel-to-pixel (p2p) method for segmenting spectral images using a generative adversarial network (GAN) is presented in this paper. The p2p GAN comprises two neural models; through the generation and discrimination processes, the model learns to segment spectral images precisely. For evaluation of the results, partial least-squares discriminant analysis was used to classify the images based on thresholds and pixels. The experimental results show that the GAN-based p2p segmentation performs best, with an overall accuracy of 0.98 ± 0.06. These outcomes demonstrate the effectiveness of image processing techniques that use deep learning to enhance spectral image processing.
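The p2p objective typically follows the pix2pix recipe: the generator is trained to fool a discriminator that sees (input, segmentation) pairs, plus an L1 term keeping the output close to the ground truth. The sketch below assumes that recipe, including the conventional L1 weight of 100; `D` and the tensors are placeholders, not the paper's code.

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()

def generator_loss(D, spectral_img, fake_seg, real_seg, lam=100.0):
    """Adversarial term on (input, generated) pairs plus a weighted L1 term."""
    pred_fake = D(torch.cat([spectral_img, fake_seg], dim=1))
    return bce(pred_fake, torch.ones_like(pred_fake)) + lam * l1(fake_seg, real_seg)

def discriminator_loss(D, spectral_img, fake_seg, real_seg):
    """Real pairs should score 1, generated pairs 0."""
    pred_real = D(torch.cat([spectral_img, real_seg], dim=1))
    pred_fake = D(torch.cat([spectral_img, fake_seg.detach()], dim=1))
    return (bce(pred_real, torch.ones_like(pred_real))
            + bce(pred_fake, torch.zeros_like(pred_fake)))
```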

19 pages, 12444 KiB  
Article
Detection of COVID-19 from Deep Breathing Sounds Using Sound Spectrum with Image Augmentation and Deep Learning Techniques
by Olusola O. Abayomi-Alli, Robertas Damaševičius, Aaqif Afzaal Abbasi and Rytis Maskeliūnas
Electronics 2022, 11(16), 2520; https://doi.org/10.3390/electronics11162520 - 11 Aug 2022
Cited by 7 | Viewed by 2449
Abstract
The COVID-19 pandemic is one of the most disruptive outbreaks of the 21st century, considering its impact on our freedoms and social lifestyles. Several methods have been used to monitor and diagnose this virus, including RT-PCR tests and chest CT/CXR scans. Recent studies have employed various crowdsourced sound data types, such as coughing, breathing, and sneezing, for the detection of COVID-19. However, the application of artificial intelligence methods and machine learning algorithms to these sound datasets still suffers from limitations, such as poor test performance due to an increase in misclassified data, limited datasets resulting in the overfitting of deep learning methods, the high computational cost of some augmentation models, and the varying quality of feature-extracted images resulting in poor reliability. We propose a simple yet effective deep learning model, called DeepShufNet, for COVID-19 detection. A data augmentation method based on color transformation and noise addition was used to generate synthetic image datasets from sound data. The synthetic datasets were evaluated using two feature extraction approaches, namely Mel spectrograms and GFCC. The performance of the proposed DeepShufNet model was evaluated on the deep breathing COSWARA dataset, showing improved performance with a lower misclassification rate for the minority class. The proposed model achieved an accuracy, precision, recall, specificity, and F-score of 90.1%, 77.1%, 62.7%, 95.98%, and 69.1%, respectively, for positive COVID-19 detection using the Mel COCOA-2 augmented training datasets. The proposed model showed improved performance compared with some state-of-the-art methods.
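Converting a breathing recording into a Mel-spectrogram image, the feature extraction step named above, takes only a few lines with librosa; the file name, sampling rate, and number of Mel bands below are common defaults assumed for illustration.

```python
import numpy as np
import librosa

y, sr = librosa.load("breathing.wav", sr=22050)              # hypothetical recording
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
mel_db = librosa.power_to_db(mel, ref=np.max)                # log-scaled CNN input
```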

18 pages, 2697 KiB  
Article
Cinematographic Shot Classification with Deep Ensemble Learning
by Bartolomeo Vacchetti and Tania Cerquitelli
Electronics 2022, 11(10), 1570; https://doi.org/10.3390/electronics11101570 - 13 May 2022
Cited by 5 | Viewed by 2079
Abstract
Cinematographic shot classification assigns a category to each shot on the basis of either the field size or the movement performed by the camera. In this work, we focus on the camera field of view, which is determined by the portion of the subject and of the environment shown in the frame. Automating this task can help freelancers and studios in the visual creative field in their daily activities. Our study considers eight classes of film shots: long shot, medium shot, full figure, American shot, half figure, half torso, close-up, and extreme close-up. Cinematographic shot classification is a complex task, so we combined state-of-the-art techniques to address it. Specifically, we fine-tuned three separate VGG-16 models and combined their predictions using the stacking ensemble technique to obtain better performance. Experimental results demonstrate the effectiveness of the proposed approach, which achieves 77% accuracy without relying on data augmentation. We also evaluated our approach in terms of F1 score, precision, and recall, and the confusion matrices show that most misclassified samples belong to a neighboring class.
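Stacking, as used above, feeds the class probabilities of the three fine-tuned VGG-16 models into a meta-learner. The sketch below uses random placeholders for the base-model outputs and a logistic-regression meta-learner, which is one common choice; the paper's actual meta-model is not specified in the abstract.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# probs_i: (n_samples, 8) softmax outputs of each fine-tuned VGG-16
probs = [np.random.rand(100, 8) for _ in range(3)]  # placeholder predictions
stacked = np.hstack(probs)                          # (n_samples, 24) meta-features
labels = np.random.randint(0, 8, size=100)          # placeholder shot labels

meta = LogisticRegression(max_iter=1000).fit(stacked, labels)
shot_class = meta.predict(stacked[:1])              # predicted shot category
```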
