Artificial Intelligence (AI) for Image Processing

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: closed (28 February 2023) | Viewed by 25602

Special Issue Editor


Dr. Mohammed A. A. Al-qaness
Guest Editor
School of Physics and Electronic Information Engineering, Zhejiang Normal University, Jinhua 321004, China
Interests: artificial intelligence; swarm intelligence; deep learning; data science; remote sensing; hyperspectral image processing

Special Issue Information

Dear Colleagues,

We are pleased to launch this Special Issue on “Artificial Intelligence (AI) for Image Processing”. AI methods are widely adopted in academia and industry for a range of applications, including image processing tasks such as image segmentation, classification, and recognition. AI encompasses several families of techniques, including machine learning, metaheuristic optimization algorithms (such as swarm intelligence (SI) and bio-inspired algorithms), and knowledge-based and expert systems.

This Special Issue provides a forum for the publication of articles describing the use of classical and modern artificial intelligence methods in image processing applications.

The main aim of this Special Issue is to gather recent high-quality contributions on advanced image processing and analysis applications, including medical images, remote sensing images, galaxy images, and others. We invite our colleagues to contribute original research papers as well as review papers focusing on the application of artificial intelligence methods, including traditional machine learning, advanced deep learning approaches, metaheuristic optimization algorithms, and other AI-based methods, to image processing problems.

The topics of this Special Issue include (but are not limited to) the following:

  • Image classification and recognition
  • Machine learning for image processing
  • Metaheuristic optimization algorithms for image processing
  • Remote sensing image classification
  • Medical image classification
  • Neural computing for image processing
  • Evolutionary algorithms for image processing

Dr. Mohammed A. A. Al-qaness
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website and completing the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers are published continuously in the journal (as soon as accepted) and listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for the submission of manuscripts is available on the Instructions for Authors page. Electronics is an international, peer-reviewed, open access, semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Image processing
  • Deep learning
  • Metaheuristic algorithms
  • Swarm intelligence
  • Medical image processing
  • Image segmentation

Published Papers (8 papers)


Research

18 pages, 7077 KiB  
Article
ASA-DRNet: An Improved Deeplabv3+ Framework for SAR Image Segmentation
by Siyuan Chen, Xueyun Wei and Wei Zheng
Electronics 2023, 12(6), 1300; https://doi.org/10.3390/electronics12061300 - 8 Mar 2023
Cited by 1 | Viewed by 1661
Abstract
Pollution caused by oil spills does irreversible harm to marine ecosystems. Synthetic Aperture Radar (SAR) has emerged as a crucial means of finding maritime oil spills, and accurately distinguishing oil spill areas from other types of areas is a critical step in detecting them. Owing to its capacity to extract multiscale features and its distinctive decoder, the Deeplabv3+ framework has developed into an excellent deep learning model in the field of image segmentation. However, in some SAR images, the segmentation of oil film edges lacks clarity and small areas are segmented incorrectly. To solve these problems, an improved network, named ASA-DRNet, is proposed. Firstly, a new structure combining an axial self-attention module with ResNet-18 is proposed as the backbone of the DeepLabv3+ encoder. Secondly, the atrous spatial pyramid pooling (ASPP) module is optimized to improve the network's capacity to extract multiscale features and to increase the speed of model computation; finally, low-level features of different resolutions are merged to enhance the network's ability to extract edge information. Experiments show that ASA-DRNet obtains better results than other neural network models.
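For readers unfamiliar with axial attention, the following minimal PyTorch sketch illustrates the idea behind the backbone described above: self-attention is applied along one spatial axis at a time (height, then width), which is far cheaper than full 2-D self-attention. The single-head design and all names here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class AxialSelfAttention(nn.Module):
    """Single-head self-attention applied along one spatial axis."""
    def __init__(self, channels: int):
        super().__init__()
        self.qkv = nn.Conv2d(channels, channels * 3, kernel_size=1)
        self.scale = channels ** -0.5

    def forward(self, x: torch.Tensor, axis: int) -> torch.Tensor:
        # x: (B, C, H, W); attend along H (axis=2) or W (axis=3)
        q, k, v = self.qkv(x).chunk(3, dim=1)
        if axis == 3:  # move the attended axis into position 2
            q, k, v = (t.transpose(2, 3) for t in (q, k, v))
        q = q.permute(0, 3, 2, 1)                     # (B, S, L, C)
        k = k.permute(0, 3, 1, 2)                     # (B, S, C, L)
        v = v.permute(0, 3, 2, 1)                     # (B, S, L, C)
        attn = torch.softmax(q @ k * self.scale, -1)  # (B, S, L, L)
        out = (attn @ v).permute(0, 3, 2, 1)          # back to (B, C, L, S)
        if axis == 3:
            out = out.transpose(2, 3)
        return x + out  # residual connection

# Height-axis then width-axis attention over a ResNet-18 feature map
feat = torch.randn(1, 64, 32, 32)
block = AxialSelfAttention(64)
out = block(block(feat, axis=2), axis=3)
```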

14 pages, 1214 KiB  
Article
Deep-Learning-Based Sequence Causal Long-Term Recurrent Convolutional Network for Data Fusion Using Video Data
by DaeHyeon Jeon and Min-Suk Kim
Electronics 2023, 12(5), 1115; https://doi.org/10.3390/electronics12051115 - 24 Feb 2023
Viewed by 1588
Abstract
The purpose of AI-based schemes in intelligent systems is to advance and optimize system performance. Most intelligent systems adopt sequential data types derived from such systems; real-time video data, for example, are continuously updated as a sequence to make the predictions needed for efficient system performance. Deep-learning-based network architectures such as long short-term memory (LSTM), data fusion, two-stream networks, and temporal convolutional networks (TCN) are generally used for sequence data fusion to enhance robust system efficiency. In this paper, we propose a deep-learning-based neural network architecture for non-fixed-length data that uses both a causal convolutional neural network (CNN) and a long-term recurrent convolutional network (LRCN). Causal CNNs and LRCNs use convolutional layers for feature extraction, so both architectures can process sequential data such as time series or video in a variety of applications; they extract features from the input sequence to reduce its dimensionality and capture the important information, and they learn hierarchical representations for effective sequence processing tasks. We also adopt the concept of the series compact convolutional recurrent neural network (SCCRNN), a neural network architecture for processing sequential data that compactly combines convolutional and recurrent layers, reducing the number of parameters and memory usage while maintaining high accuracy. The architecture is well suited to continuously incoming sequential video data and brings together the advantages of LSTM-based and CNN-based networks. To verify this method, we evaluated a sequence learning model, with the network parameters and memory required in real environments, on the UCF-101 dataset, an action recognition dataset of realistic action videos collected from YouTube with 101 action categories. The results show that the proposed sequence causal long-term recurrent convolutional network (SCLRCN) provides a performance improvement of approximately 12% or more compared with the existing models (LRCN and TCN).
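The causal convolution at the heart of the causal-CNN branch can be sketched in a few lines of PyTorch: padding only on the left guarantees that the output at time t depends solely on inputs at times up to t. The layer sizes below are arbitrary assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    """1-D convolution that only sees past and present time steps."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation  # left padding preserves causality
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)

    def forward(self, x):
        # x: (batch, channels, time); pad the past, never the future
        return self.conv(F.pad(x, (self.pad, 0)))

x = torch.randn(2, 16, 100)                      # 16 features over 100 steps
y = CausalConv1d(16, 32, kernel_size=3, dilation=2)(x)
print(y.shape)                                   # torch.Size([2, 32, 100])
```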

14 pages, 17709 KiB  
Article
Rice Disease Identification Method Based on Attention Mechanism and Deep Dense Network
by Minlan Jiang, Changguang Feng, Xiaosheng Fang, Qi Huang, Changjiang Zhang and Xiaowei Shi
Electronics 2023, 12(3), 508; https://doi.org/10.3390/electronics12030508 - 18 Jan 2023
Cited by 11 | Viewed by 2037
Abstract
It is of great practical significance to quickly, accurately, and effectively identify rice diseases, which affect rice yield. This paper proposes a rice disease identification method based on an improved DenseNet. The method uses DenseNet as the benchmark model and applies the squeeze-and-excitation channel attention mechanism to strengthen favorable features while suppressing unfavorable ones. Depthwise separable convolutions are then introduced to replace some standard convolutions in the dense network, improving parameter utilization and training speed. Using the AdaBound algorithm, combined with an adaptive optimization method, the parameter adjustment time is reduced. In experiments on five rice disease datasets, the average classification accuracy of the proposed method is 99.4%, which is 13.8 percentage points higher than that of the original model. Compared with other existing recognition methods, such as ResNet, VGG, and Vision Transformer, the proposed method achieves higher recognition accuracy, realizes effective classification of rice disease images, and provides a new approach for the development of crop disease identification technology and smart agriculture.
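The two building blocks named in the abstract, squeeze-and-excitation and depthwise separable convolution, are standard components and can be sketched as follows in PyTorch; the channel counts and reduction ratio are common defaults, not values taken from the paper.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: reweight channels using global context."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))   # squeeze to (B, C), then excite
        return x * w[:, :, None, None]    # per-channel rescaling

def depthwise_separable(in_ch, out_ch):
    """A per-channel 3x3 conv followed by a 1x1 pointwise conv."""
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch),  # depthwise
        nn.Conv2d(in_ch, out_ch, 1))                          # pointwise

x = torch.randn(1, 64, 56, 56)
y = SEBlock(64)(depthwise_separable(64, 64)(x))
```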

19 pages, 14182 KiB  
Article
RGB-Based Triple-Dual-Path Recurrent Network for Underwater Image Dehazing
by Fayadh Alenezi
Electronics 2022, 11(18), 2894; https://doi.org/10.3390/electronics11182894 - 13 Sep 2022
Cited by 5 | Viewed by 1250
Abstract
In this paper, we present a powerful underwater image dehazing technique that exploits two image characteristics: RGB color channels and image features. Each color channel is decomposed into two units based on pixel similarities via k-means clustering. This markedly improves the adaptability and identification of similar pixels, reducing pixels with weak correlation and leaving only those with higher correlation. We use an infinite impulse response (IIR) filter in a triple-dual, parallel interaction structure to suppress hazed pixels via pixel comparison and amplification, increasing the visibility of even very minor features. This improves the visual perception of the final image, and thus its overall usefulness and quality. A softmax-weighted fusion is finally used to fuse the output color channel features into the final image. This preserves color, leaving the output of our proposed method very true to the original scene; it is accomplished through adaptive learning based on the confidence levels of the pixel contribution variation in each color channel during subsequent fusions. The proposed technique outperforms existing methods both visually and objectively in several rigorous tests.
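The first stage described above, splitting each color channel into two units by pixel similarity, can be approximated with k-means (k = 2) applied per channel. The use of OpenCV and scikit-learn, and the file name, are our assumptions for illustration, not the paper's implementation.

```python
import numpy as np
import cv2
from sklearn.cluster import KMeans

img = cv2.imread("underwater.png")             # hypothetical input image (BGR)
units = []
for c in range(3):                             # one clustering per color channel
    pixels = img[:, :, c].reshape(-1, 1).astype(np.float32)
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(pixels)
    mask = labels.reshape(img.shape[:2])       # 0/1 map of similar pixels
    units.append((img[:, :, c] * (mask == 0),  # first pixel group
                  img[:, :, c] * (mask == 1))) # second pixel group
```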

20 pages, 4514 KiB  
Article
Detection of Diabetic Retinopathy in Retinal Fundus Images Using CNN Classification Models
by Al-Omaisi Asia, Cheng-Zhang Zhu, Sara A. Althubiti, Dalal Al-Alimi, Ya-Long Xiao, Ping-Bo Ouyang and Mohammed A. A. Al-Qaness
Electronics 2022, 11(17), 2740; https://doi.org/10.3390/electronics11172740 - 31 Aug 2022
Cited by 22 | Viewed by 8981
Abstract
Diabetes is a widespread disease that can lead to diabetic retinopathy, macular edema, and other visible microvascular complications in the retina of the human eye. This study attempts to detect diabetic retinopathy (DR), which has been a main cause of blindness over the last decade. Timely or early treatment is necessary to prevent some DR complications and to control blood glucose. DR is very difficult to detect with time-consuming manual diagnosis because of its diversity and complexity. This work utilizes a deep learning application, the convolutional neural network (CNN), on fundus photographs to distinguish the stages of DR. The image dataset in this study was obtained from Xiangya No. 2 Hospital Ophthalmology (XHO), Changsha, China; it is very large and its labels are unbalanced. Thus, this study first addresses these dataset problems with preprocessing, regularization, and augmentation steps that enlarge and prepare the XHO image dataset for training and improve performance. It then leverages the power of CNNs with different structures, namely ResNet-101, ResNet-50, and VggNet-16, to detect DR on the XHO dataset. ResNet-101 achieved the maximum accuracy of 0.9888, with a training loss of 0.3499 and a testing loss of 0.9882. ResNet-101 was then assessed on 1787 photos from the HRF, STARE, DIARETDB0, and XHO databases, achieving an average accuracy of 0.97, which is higher than in prior efforts. The results prove that the CNN model (ResNet-101) achieves better accuracy than ResNet-50 and VggNet-16 in DR image classification.
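Adapting an ImageNet-pretrained ResNet-101 to DR grading amounts to swapping the classifier head, as in the torchvision sketch below. Five classes is our assumption based on common DR grading scales; the abstract does not state the exact class count.

```python
import torch
import torch.nn as nn
from torchvision import models

num_classes = 5  # assumed number of DR stages
model = models.resnet101(weights=models.ResNet101_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new classifier head

logits = model(torch.randn(1, 3, 224, 224))  # one preprocessed fundus image
print(logits.shape)                          # torch.Size([1, 5])
```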

15 pages, 3014 KiB  
Article
Segmentation of Spectral Plant Images Using Generative Adversary Network Techniques
by Sanjay Kumar, Sahil Kansal, Monagi H. Alkinani, Ahmed Elaraby, Saksham Garg, Shanthi Natarajan and Vishnu Sharma
Electronics 2022, 11(16), 2611; https://doi.org/10.3390/electronics11162611 - 20 Aug 2022
Cited by 2 | Viewed by 1542
Abstract
Spectral image analysis of complex analytical systems is commonly performed in analytical chemistry. During spectral image analysis, signals associated with the key analytes present in an image scene are extracted. Accordingly, the first step in spectral image analysis is to segment the image in order to extract the applicable signals for analysis. Traditional image segmentation methods in chemometrics make it difficult to extract the relevant signals: none of these approaches incorporate the contextual information present in an image scene, so classification is limited to thresholds or pixels only. An image-translation pixel-to-pixel (p2p) method for segmenting spectral images using a generative adversarial network (GAN) is presented in this paper. The p2p GAN comprises two neural models; through the generation and discrimination processes, the model learns to segment spectral images precisely. For evaluation of the results, partial least-squares discriminant analysis was used to classify the images based on thresholds and pixels. The experimental results show that the GAN-based p2p segmentation performs best, with an overall accuracy of 0.98 ± 0.06. These outcomes demonstrate the effectiveness of image processing techniques that use deep learning to enhance spectral image processing.
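The p2p objective typically follows the pix2pix recipe: the generator is trained to fool a discriminator that sees (input, segmentation) pairs, plus an L1 term keeping the output close to the ground truth. The sketch below assumes that recipe, including the conventional L1 weight of 100; `D` and the tensors are placeholders, not the paper's code.

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()

def generator_loss(D, spectral_img, fake_seg, real_seg, lam=100.0):
    """Adversarial term on (input, generated) pairs plus a weighted L1 term."""
    pred_fake = D(torch.cat([spectral_img, fake_seg], dim=1))
    return bce(pred_fake, torch.ones_like(pred_fake)) + lam * l1(fake_seg, real_seg)

def discriminator_loss(D, spectral_img, fake_seg, real_seg):
    """Real pairs should score 1, generated pairs 0."""
    pred_real = D(torch.cat([spectral_img, real_seg], dim=1))
    pred_fake = D(torch.cat([spectral_img, fake_seg.detach()], dim=1))
    return (bce(pred_real, torch.ones_like(pred_real))
            + bce(pred_fake, torch.zeros_like(pred_fake)))
```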

19 pages, 12444 KiB  
Article
Detection of COVID-19 from Deep Breathing Sounds Using Sound Spectrum with Image Augmentation and Deep Learning Techniques
by Olusola O. Abayomi-Alli, Robertas Damaševičius, Aaqif Afzaal Abbasi and Rytis Maskeliūnas
Electronics 2022, 11(16), 2520; https://doi.org/10.3390/electronics11162520 - 11 Aug 2022
Cited by 7 | Viewed by 2449
Abstract
The COVID-19 pandemic is one of the most disruptive outbreaks of the 21st century, considering its impact on our freedoms and social lifestyles. Several methods have been used to monitor and diagnose this virus, including RT-PCR tests and chest CT/CXR scans. Recent studies have employed various crowdsourced sound data types, such as coughing, breathing, and sneezing, for the detection of COVID-19. However, the application of artificial intelligence methods and machine learning algorithms to these sound datasets still suffers from limitations, such as poor test performance due to an increase in misclassified data, limited datasets resulting in the overfitting of deep learning methods, the high computational cost of some augmentation models, and the varying quality of feature-extracted images resulting in poor reliability. We propose a simple yet effective deep learning model, called DeepShufNet, for COVID-19 detection. A data augmentation method based on color transformation and noise addition was used to generate synthetic image datasets from sound data. The synthetic datasets were evaluated using two feature extraction approaches, namely Mel spectrograms and GFCC. The performance of the proposed DeepShufNet model was evaluated on the deep breathing COSWARA dataset, showing improved performance with a lower misclassification rate for the minority class. The proposed model achieved an accuracy, precision, recall, specificity, and F-score of 90.1%, 77.1%, 62.7%, 95.98%, and 69.1%, respectively, for positive COVID-19 detection using the Mel COCOA-2 augmented training datasets. The proposed model showed improved performance compared with some state-of-the-art methods.
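Converting a breathing recording into a Mel-spectrogram image, the feature extraction step named above, takes only a few lines with librosa; the file name, sampling rate, and number of Mel bands below are common defaults assumed for illustration.

```python
import numpy as np
import librosa

y, sr = librosa.load("breathing.wav", sr=22050)              # hypothetical recording
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
mel_db = librosa.power_to_db(mel, ref=np.max)                # log-scaled CNN input
```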

18 pages, 2697 KiB  
Article
Cinematographic Shot Classification with Deep Ensemble Learning
by Bartolomeo Vacchetti and Tania Cerquitelli
Electronics 2022, 11(10), 1570; https://doi.org/10.3390/electronics11101570 - 13 May 2022
Cited by 5 | Viewed by 2079
Abstract
Cinematographic shot classification assigns a category to each shot on the basis of either the field size or the movement performed by the camera. In this work, we focus on the camera field of view, which is determined by the portion of the subject and of the environment shown in the frame. Automating this task can help freelancers and studios in the visual creative field in their daily activities. Our study considers eight classes of film shots: long shot, medium shot, full figure, American shot, half figure, half torso, close-up, and extreme close-up. Cinematographic shot classification is a complex task, so we combined state-of-the-art techniques to address it. Specifically, we fine-tuned three separate VGG-16 models and combined their predictions using the stacking ensemble technique to obtain better performance. Experimental results demonstrate the effectiveness of the proposed approach, which achieves 77% accuracy without relying on data augmentation. We also evaluated our approach in terms of F1 score, precision, and recall, and the confusion matrices show that most misclassified samples belong to a neighboring class.
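Stacking, as used above, feeds the class probabilities of the three fine-tuned VGG-16 models into a meta-learner. The sketch below uses random placeholders for the base-model outputs and a logistic-regression meta-learner, which is one common choice; the paper's actual meta-model is not specified in the abstract.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# probs_i: (n_samples, 8) softmax outputs of each fine-tuned VGG-16
probs = [np.random.rand(100, 8) for _ in range(3)]  # placeholder predictions
stacked = np.hstack(probs)                          # (n_samples, 24) meta-features
labels = np.random.randint(0, 8, size=100)          # placeholder shot labels

meta = LogisticRegression(max_iter=1000).fit(stacked, labels)
shot_class = meta.predict(stacked[:1])              # predicted shot category
```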
