Deep Learning for Multiple-Level Visual Feature Extraction

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: closed (10 February 2021) | Viewed by 4383

Special Issue Editor


Prof. Dr. Gil-Jin Jang
Guest Editor
School of Electronics Engineering, Kyungpook National University, Daegu 41566, Korea
Interests: deep learning and machine learning theories; acoustic signal processing; speech recognition and enhancement; computer vision; multimedia data analysis; and biomedical signal engineering

Special Issue Information

Dear Colleagues, 

If trained with proper data and appropriate learning methods, the outputs of a deep neural network's layers represent the input images at different levels of abstraction. For example, in convolutional neural networks with small kernel sizes, the outputs of the first few layers are known to act as simple edge and corner detectors, the outputs of the middle layers often respond to simple parts of visual objects, and the outputs of the layers closest to the target labels model the most abstract information about the visual objects.
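As a concrete illustration of this hierarchy, the minimal PyTorch sketch below taps an early, a middle, and a late convolutional layer of a pretrained VGG-16 with forward hooks; the specific layer indices are illustrative choices of ours, not part of any particular method.

```python
# A minimal sketch of inspecting multi-level CNN features with hooks.
# The tapped layer indices are illustrative, not prescriptive.
import torch
from torchvision.models import vgg16, VGG16_Weights

model = vgg16(weights=VGG16_Weights.DEFAULT).eval()

# Indices into model.features for an early, a middle, and a late conv layer.
taps = {"early": 2, "middle": 12, "late": 28}
activations = {}

def make_hook(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

for name, idx in taps.items():
    model.features[idx].register_forward_hook(make_hook(name))

with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))  # stand-in for a preprocessed image

# Early layers keep fine spatial detail with few channels; later layers
# trade spatial resolution for many, more abstract channels.
for name, feat in activations.items():
    print(name, tuple(feat.shape))
```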

This makes it possible to use intermediate layer outputs as multi-level descriptors of the given visual objects, which may serve other purposes such as explaining a network's predictions, reconstructing the input images, and style transfer. Another use of intermediate layers is to reuse information from earlier layers so that input information is not lost; ResNet and U-Net are well-known deep learning models that reuse the outputs of previous layers.
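As a rough illustration of these two reuse patterns, the following sketch implements a ResNet-style additive shortcut and a U-Net-style concatenation skip in PyTorch; the channel counts and layer choices are illustrative assumptions rather than the canonical architectures.

```python
# Hedged sketches of the two layer-reuse patterns mentioned above.
# Channel counts and layer choices are illustrative assumptions.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """ResNet-style: y = F(x) + x, so the shortcut carries the
    earlier layer's output forward unchanged."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return torch.relu(self.body(x) + x)  # additive shortcut

class UNetSkip(nn.Module):
    """U-Net-style decoder step: upsample the decoder features and
    concatenate the matching encoder feature map, so fine spatial
    detail from early layers is not lost."""
    def __init__(self, channels):
        super().__init__()
        self.up = nn.ConvTranspose2d(channels, channels, kernel_size=2, stride=2)
        self.merge = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, decoder_x, encoder_skip):
        x = self.up(decoder_x)                    # double the spatial resolution
        x = torch.cat([x, encoder_skip], dim=1)   # concatenation skip
        return torch.relu(self.merge(x))

# Quick shape check with dummy tensors.
res = ResidualBlock(64)
print(res(torch.randn(1, 64, 32, 32)).shape)   # -> (1, 64, 32, 32)
dec = UNetSkip(64)
print(dec(torch.randn(1, 64, 16, 16),          # decoder features
          torch.randn(1, 64, 32, 32)).shape)   # encoder skip -> (1, 64, 32, 32)
```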

This Special Issue focuses on how multiple-level visual features are used. The topics of interest include, but are not limited to: 

  • Shortcut modification in residual networks and U-Net
  • Feature extraction for semantic segmentation
  • Nonlinear feature transformation using deep learning
  • Multiple-level output fusion in deep neural networks
  • Multiple-level feature transformation in deep neural networks
  • Weakly supervised learning for visual feature extraction
  • Pooling methods for feature abstraction in deep neural networks
  • Nonlinear dimensionality reduction by deep neural networks

Prof. Dr. Gil-Jin Jang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Shortcut modification in residual networks and U-Net
  • Feature extraction for semantic segmentation
  • Nonlinear feature transformation using deep learning
  • Multiple-level output fusion in deep neural networks
  • Multiple-level feature transformation in deep neural networks
  • Weakly supervised learning for visual feature extraction
  • Pooling methods for feature abstraction in deep neural networks
  • Nonlinear dimensionality reduction by deep neural networks

Published Papers (1 paper)


Research

17 pages, 633 KiB  
Article
Improving Mispronunciation Detection of Arabic Words for Non-Native Learners Using Deep Convolutional Neural Network Features
by Shamila Akhtar, Fawad Hussain, Fawad Riasat Raja, Muhammad Ehatisham-ul-haq, Naveed Khan Baloch, Farruh Ishmanov and Yousaf Bin Zikria
Electronics 2020, 9(6), 963; https://doi.org/10.3390/electronics9060963 - 09 Jun 2020
Cited by 26 | Viewed by 3938
Abstract
Computer-Aided Language Learning (CALL) is growing nowadays because learning new languages is essential for communication with people of different linguistic backgrounds. Mispronunciation detection is an integral part of CALL and is used to automatically point out errors to non-native speakers. In this paper, we investigated the mispronunciation detection of Arabic words using a deep Convolutional Neural Network (CNN). For automated pronunciation error detection, we proposed a CNN feature-based model, extracting features from different layers of AlexNet (layers 6, 7, and 8) to train three machine learning classifiers: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Random Forest (RF). We also used a transfer learning-based model in which feature extraction and classification are performed automatically. To evaluate the proposed methods, we provide a comprehensive comparison against a traditional machine learning-based method using Mel Frequency Cepstral Coefficient (MFCC) features, with the same three classifiers (KNN, SVM, and RF) serving as the baseline for mispronunciation detection. Experimental results show that the handcrafted-feature baseline, the transfer learning-based method, and the classification based on deep features extracted from AlexNet achieved average accuracies of 73.67%, 85%, and 93.20% on Arabic words, respectively. Moreover, these results reveal that the proposed method with feature selection achieved the best average accuracy of 93.20%, outperforming all other methods.
(This article belongs to the Special Issue Deep Learning for Multiple-Level Visual Feature Extraction)
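As a rough, non-authoritative illustration of the pipeline this abstract describes, the sketch below extracts fully connected activations (fc6/fc7/fc8) from torchvision's pretrained AlexNet and fits a linear SVM on them. The stand-in inputs, the layer slicing, and the classifier settings are our own assumptions, not the authors' implementation.

```python
# A minimal sketch: pretrained AlexNet as a fixed feature extractor,
# with a classical classifier trained on the extracted features.
# The data here is random stand-in input, not the Arabic speech corpus.
import torch
from torchvision.models import alexnet, AlexNet_Weights
from sklearn.svm import SVC

model = alexnet(weights=AlexNet_Weights.DEFAULT).eval()

def fc_features(images, layer="fc7"):
    """Activations of one fully connected AlexNet layer.
    In torchvision's AlexNet, the fc6/fc7/fc8 Linear layers sit at
    classifier indices 1, 4, and 6; we slice up to and including them."""
    end = {"fc6": 2, "fc7": 5, "fc8": 7}[layer]
    with torch.no_grad():
        x = model.features(images)
        x = model.avgpool(x)
        x = torch.flatten(x, 1)
        for module in model.classifier[:end]:
            x = module(x)
    return x.numpy()

# Illustrative stand-in data: 8 three-channel inputs at 224x224.
X = torch.randn(8, 3, 224, 224)
y = [0, 1, 0, 1, 0, 1, 0, 1]

clf = SVC(kernel="linear").fit(fc_features(X, "fc7"), y)
print(clf.predict(fc_features(X, "fc7")))
```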