Special Issue "Deep Learning for Signal Processing Applications"

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 30 October 2021.

Special Issue Editor

Prof. Dr. Valentina E. Balas
Guest Editor
Aurel Vlaicu University of Arad, Bd. Revolutiei 77, Arad 310130, Romania
Interests: intelligent systems; soft computing; fuzzy control; modeling and simulation; biometrics

Special Issue Information

Dear Colleagues,

In recent years, deep learning has emerged as one of the most effective learning techniques in the broader area of artificial intelligence, especially for image and video analysis. Deep learning techniques have been used extensively in computer vision and, more recently, in video analysis; scientists and research scholars in industry and academia have developed effective solutions to various image- and video-related problems using deep learning algorithms. The prime reason for the growing popularity of deep learning is that it achieves higher recognition accuracy than earlier methods, and excellent results have been obtained in image- and video-related classification and segmentation. While substantial progress has been made in medical image analysis with deep learning, many issues remain and new problems continue to emerge, such as high-accuracy deep learning for medical imaging (particularly MRI), the limited availability of datasets for classification tasks, problems caused by imbalanced datasets, disease detection from medical images, image registration, and computer-aided diagnosis. Beyond medical imaging, deep learning can also be applied to other problems, such as image inpainting, sound classification, voice assistants and augmented intelligence, and high-resolution medical image reconstruction. Video analysis has also become more tractable with the introduction of deep learning; previously, it was a challenging task, as video is a data-intensive medium with large variations and complexities. Thanks to deep learning, multimedia researchers can now develop higher-performing techniques to analyse video content.

This Special Issue aims to provide comprehensive coverage of cutting-edge research and state-of-the-art methods in deep learning applications, especially for images and videos. Authors are invited to submit papers on topics including, but not limited to, the following:

  • Image classification and segmentation by deep learning techniques
  • Object detection, image reconstruction, image super-resolution, and image synthesis by deep learning techniques
  • Cancer imaging using deep learning techniques
  • Deep learning in gastrointestinal endoscopy
  • Tumour detection using deep learning
  • Deep learning for image analysis using multimodality fusion
  • Image quality recognition methods inspired by deep learning
  • Advanced deep learning methods in computer vision with 3D data
  • Deep learning models for multiple object tracking (MOT)
  • Novel applications of deep learning in video classification frameworks
  • Deep learning techniques for video semantic segmentation
  • Applications of deep learning in video and image forensics
  • Video summarization using deep learning
  • Human action recognition using deep learning
  • Applications of deep learning in satellite imagery
  • Aerospace, defense, and communications
  • Industrial automation
  • Automotive

Prof. Dr. Valentina E. Balas
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2000 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (5 papers)


Research

Article
Implementation of Pavement Defect Detection System on Edge Computing Platform
Appl. Sci. 2021, 11(8), 3725; https://doi.org/10.3390/app11083725 - 20 Apr 2021
Abstract
Road surfaces in Taiwan, as in other developed countries, often experience structural failures, such as patches, bumps, longitudinal and lateral cracking, and potholes, which cause discomfort and pose direct safety risks to motorists. To minimize damage to vehicles from pavement defects and to support subsequent ride-comfort improvement strategies, in this study, we developed a pavement defect detection system using a deep learning perception scheme for implementation on Xilinx Edge AI platforms. To increase the detection distance and accuracy, two cameras with fields of view of 70° and 30°, respectively, were used to capture the front view of the car, and the YOLOv3 (You Only Look Once, version 3) model was employed to recognize pavement defects such as potholes, cracks, manhole covers, patches, and bumps. In addition, to improve the continuity of pavement defect recognition, a tracking-via-detection strategy was employed, which first detects pavement defects in each frame and then associates them across frames using the Kalman filter method. With this strategy, the average detection accuracy for the pothole category reached 71%, with a miss rate of about 29%. To confirm the effectiveness of the proposed detection strategy, experiments were conducted on an established Taiwan pavement defect image dataset (TPDID), the first dataset for Taiwan pavement defects. Moreover, different AI methods were used to detect the pavement defects for quantitative comparative analysis. Finally, a field-programmable gate array (FPGA)-based edge computing platform was used as an embedded system to implement the proposed YOLOv3-based pavement defect detection system; the execution speed reached 27.8 FPS while maintaining the accuracy of the original model.
(This article belongs to the Special Issue Deep Learning for Signal Processing Applications)
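The tracking-via-detection step described in the abstract can be sketched in a few lines: a constant-velocity Kalman filter predicts each defect's position in the next frame and is corrected by the associated detection. This is a minimal illustrative sketch; the paper's exact state model, noise parameters, and association logic are not given in the abstract.

```python
import numpy as np

class KalmanTrack:
    """Tracks a defect's (x, y) centre with a constant-velocity model."""
    def __init__(self, x, y):
        self.state = np.array([x, y, 0.0, 0.0])   # [x, y, vx, vy]
        self.P = np.eye(4)                        # state covariance
        self.F = np.array([[1, 0, 1, 0],          # state transition
                           [0, 1, 0, 1],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], float)
        self.H = np.array([[1, 0, 0, 0],          # measurement model
                           [0, 1, 0, 0]], float)
        self.Q = 0.01 * np.eye(4)                 # process noise (assumed)
        self.R = 0.1 * np.eye(2)                  # measurement noise (assumed)

    def predict(self):
        self.state = self.F @ self.state
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.state[:2]

    def update(self, z):
        y = np.asarray(z, float) - self.H @ self.state  # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)        # Kalman gain
        self.state = self.state + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

# Associate per-frame detections to the track via its predicted position.
track = KalmanTrack(10.0, 20.0)
for detection in [(11.0, 21.0), (12.1, 22.0), (13.0, 23.2)]:
    predicted = track.predict()   # a real system would match the nearest box
    track.update(detection)
print(np.round(track.state[:2], 1))
```

Because the filter carries a velocity estimate, a defect missed by the detector in one frame can still be propagated forward, which is what keeps recognition continuous across frames.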

Article
A Deep Neural Network Model for Speaker Identification
Appl. Sci. 2021, 11(8), 3603; https://doi.org/10.3390/app11083603 - 16 Apr 2021
Abstract
Speaker identification is a classification task that aims to identify a subject from given time-series sequential data. Since the speech signal is a continuous one-dimensional time series, most current research methods are based on convolutional neural networks (CNNs) or recurrent neural networks (RNNs). These methods perform well in many tasks, but there has been little attempt to combine the two network models for the speaker identification task. Because a speech signal can be represented as a spectrogram, a voiceprint contains spatial features of the voice spectrum, and CNNs are effective for spatial feature extraction (i.e., modeling spectral correlations in acoustic features). At the same time, the speech signal is a time series, and deep RNNs can represent long utterances better than shallow networks. Considering the advantage of the gated recurrent unit (GRU) over the traditional RNN in processing sequence data, we use stacked GRU layers in our model for frame-level feature extraction. In this paper, we propose a deep neural network (DNN) model based on a two-dimensional convolutional neural network (2-D CNN) and the GRU for speaker identification. In the network design, the convolutional layer is used for voiceprint feature extraction and reduces dimensionality in both the time and frequency domains, allowing for faster GRU layer computation. In addition, the stacked GRU recurrent layers learn a speaker's acoustic features. During this research, we also evaluated various neural network structures, including 2-D CNN, deep RNN, and deep LSTM, on the Aishell-1 speech dataset. The experimental results showed that our proposed DNN model, which we call deep GRU, achieved a high recognition accuracy of 98.96%. The results also demonstrate the effectiveness of the proposed deep GRU network model versus other models for speaker identification. With further optimization, this method could be applied to other research similar to speaker identification.
(This article belongs to the Special Issue Deep Learning for Signal Processing Applications)
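The frame-level feature extractor described above stacks GRU layers over spectrogram frames. A pure-NumPy sketch of a GRU cell and a two-layer stack follows; layer sizes and weights are illustrative placeholders rather than the paper's configuration, and biases are omitted for brevity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Single GRU cell with hypothetical dimensions (not the paper's)."""
    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        n = input_dim + hidden_dim
        self.Wz = rng.normal(0, 0.1, (hidden_dim, n))  # update gate
        self.Wr = rng.normal(0, 0.1, (hidden_dim, n))  # reset gate
        self.Wh = rng.normal(0, 0.1, (hidden_dim, n))  # candidate state
        self.hidden_dim = hidden_dim

    def step(self, x, h):
        xh = np.concatenate([x, h])
        z = sigmoid(self.Wz @ xh)                       # how much to update
        r = sigmoid(self.Wr @ xh)                       # how much past to keep
        h_cand = np.tanh(self.Wh @ np.concatenate([x, r * h]))
        return (1 - z) * h + z * h_cand                 # gated state update

# Run two stacked GRU layers over a (frames, features) spectrogram-like
# input, as in the frame-level feature extractor described above.
frames = np.random.default_rng(1).normal(size=(20, 8))
layers = [GRUCell(8, 16), GRUCell(16, 16)]
h = [np.zeros(c.hidden_dim) for c in layers]
for x in frames:
    h[0] = layers[0].step(x, h[0])
    h[1] = layers[1].step(h[0], h[1])
print(h[1].shape)   # final hidden state: an utterance-level embedding
```

In the paper's architecture, the input to such a stack would be the time- and frequency-reduced feature maps produced by the 2-D convolutional front end, which is what makes the GRU layers cheap enough to stack.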

Article
Real-Time Physical Activity Recognition on Smart Mobile Devices Using Convolutional Neural Networks
Appl. Sci. 2020, 10(23), 8482; https://doi.org/10.3390/app10238482 - 27 Nov 2020
Abstract
Given the ubiquity of mobile devices, understanding the context of human activity with non-intrusive solutions is of great value. A novel deep neural network model is proposed, which combines feature extraction and convolutional layers and is able to recognize human physical activity in real time from tri-axial accelerometer data when run on a mobile device. It uses a two-layer convolutional neural network to extract local features, which are combined with 40 statistical features and fed to a fully connected layer. It improves classification performance while taking up 5–8 times less storage space and delivering more than double the throughput of the current state-of-the-art user-independent implementation on the Wireless Sensor Data Mining (WISDM) dataset. It achieves 94.18% classification accuracy on a 10-fold user-independent cross-validation of the WISDM dataset. The model is further tested on the Actitracker dataset, achieving 79.12% accuracy, while the size and throughput of the model are evaluated on a mobile device.
(This article belongs to the Special Issue Deep Learning for Signal Processing Applications)
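The hybrid feature idea in this abstract, learned convolutional features concatenated with hand-crafted statistics per accelerometer axis, can be sketched as follows. The filter weights here are random placeholders, and only 18 statistics are computed versus the 40 used in the paper.

```python
import numpy as np

def statistical_features(window):
    """Hand-crafted statistics per axis; window is (n_samples, 3)."""
    feats = []
    for axis in window.T:
        feats += [axis.mean(), axis.std(), axis.min(), axis.max(),
                  np.percentile(axis, 25), np.percentile(axis, 75)]
    return np.array(feats)            # 18 values here (the paper uses 40)

def conv1d_features(window, kernels):
    """Valid 1-D convolution per axis, global-max-pooled per kernel."""
    out = []
    for axis in window.T:
        for k in kernels:
            resp = np.convolve(axis, k, mode="valid")
            out.append(resp.max())    # global max pooling over time
    return np.array(out)

rng = np.random.default_rng(0)
window = rng.normal(size=(128, 3))    # one tri-axial accelerometer segment
kernels = rng.normal(size=(4, 9))     # 4 stand-in "learned" filters, width 9
features = np.concatenate([conv1d_features(window, kernels),
                           statistical_features(window)])
print(features.shape)                 # 12 conv + 18 statistical features
```

The concatenated vector is what a fully connected classification layer would consume; keeping the convolutional part to two layers is what makes the model small and fast enough for on-device inference.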

Article
Deep Learning Models Compression for Agricultural Plants
Appl. Sci. 2020, 10(19), 6866; https://doi.org/10.3390/app10196866 - 30 Sep 2020
Abstract
Deep learning has shown promising results in plant disease detection, fruit counting, and yield estimation, and is attracting increasing interest in agriculture. Deep learning models are generally based on several million parameters that generate exceptionally large weight matrices, which require large memory and computational power for training, testing, and deployment. Unfortunately, these requirements make it difficult to deploy the models on low-cost, resource-limited devices available in the field. In addition, poor or absent connectivity on farms often rules out remote computation. One approach used to save memory and speed up processing is to compress the models. In this work, we tackle the challenges related to resource limitation by compressing some state-of-the-art models often used in image classification. To this end, we apply model pruning and quantization to LeNet5, VGG16, and AlexNet. The original and compressed models were applied to the plant seedling classification benchmark (V2 Plant Seedlings Dataset) and the Flavia database. The results reveal that it is possible to compress these models by a factor of 38 and to reduce the FLOPs of VGG16 by a factor of 99 without considerable loss of accuracy.
(This article belongs to the Special Issue Deep Learning for Signal Processing Applications)
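The two compression steps named in the abstract, pruning and quantization, reduce to simple array operations at their core. The toy sketch below applies magnitude pruning followed by uniform 8-bit quantization to a stand-in weight matrix; real pipelines prune and fine-tune iteratively, so this shows only the arithmetic.

```python
import numpy as np

def prune(weights, sparsity):
    """Zero the smallest-magnitude weights until `sparsity` fraction is 0."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def quantize_uint8(weights):
    """Uniform affine quantization to 8-bit codes, plus a dequantized view."""
    lo, hi = weights.min(), weights.max()
    scale = (hi - lo) / 255.0
    q = np.round((weights - lo) / scale).astype(np.uint8)
    return q, q.astype(np.float64) * scale + lo   # codes, reconstruction

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))          # stand-in layer weight matrix
w_pruned = prune(w, sparsity=0.9)      # ~90% of weights removed
q, w_hat = quantize_uint8(w_pruned)

print(np.mean(w_pruned == 0))          # achieved sparsity, ~0.9
print(q.nbytes / w.nbytes)             # uint8 codes vs float64: 8x smaller
```

Stored sparsely, the pruned zeros need not be kept at all, which is how factor-38 size reductions become possible; quantization then shrinks each surviving weight from a float to a single byte at the cost of a bounded rounding error.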

Article
Incremental Dilations Using CNN for Brain Tumor Classification
Appl. Sci. 2020, 10(14), 4915; https://doi.org/10.3390/app10144915 - 17 Jul 2020
Abstract
Brain tumor classification is a challenging task in the field of medical image processing. Technology now provides medical doctors with additional aids for diagnosis. We aim to classify brain tumors using MRI images collected from anonymous patients and artificial brain simulators. In this article, we carry out a comparative study between simple artificial neural networks with dropout, basic convolutional neural networks (CNNs), and dilated convolutional neural networks. The experimental results shed light on the high classification performance (97% accuracy) of the dilated CNN. On the other hand, the dilated CNN suffers from the gridding phenomenon. An incremental, even-numbered dilation rate takes advantage of the reduced computational overhead and also overcomes the adverse effects of gridding. A comparative analysis between different combinations of dilation rates for the different convolution layers helps validate the results. The computational overhead, in terms of the efficiency of training the model to reach an acceptable threshold accuracy of 90%, is another parameter used to compare model performance.
(This article belongs to the Special Issue Deep Learning for Signal Processing Applications)
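The incremental dilation idea can be illustrated in one dimension: stacking convolutions with growing dilation rates widens the receptive field without adding parameters. This is a minimal 1-D sketch with illustrative even rates of 2, 4, and 6; the paper applies the same idea with 2-D convolutions on MRI images.

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """'Valid' 1-D convolution with gaps of `dilation` between taps."""
    k = len(kernel)
    span = (k - 1) * dilation + 1        # effective extent of the kernel
    out = np.empty(len(x) - span + 1)
    for i in range(len(out)):
        taps = x[i : i + span : dilation]  # sample every `dilation`-th input
        out[i] = np.dot(taps, kernel)
    return out

x = np.arange(32, dtype=float)           # toy 1-D signal
kernel = np.array([1.0, 1.0, 1.0])       # same 3-tap kernel at every layer
y = x
for rate in (2, 4, 6):                   # incremental even dilation rates
    y = dilated_conv1d(y, kernel, rate)
print(len(y))                            # output shrinks by (k-1)*rate per layer
```

Three 3-tap layers at rates 2, 4, and 6 give each output a receptive field spanning 25 inputs with only nine weights in total; varying the rate per layer, rather than repeating one rate, is what breaks up the gridding artifact the abstract mentions.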
