Special Issue "Computer Vision and Pattern Recognition in the Era of Deep Learning"

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 15 December 2019.

Special Issue Editor

Guest Editor
Prof. Athanasios Nikolaidis

Department of Computer Engineering, Technological Educational Institute of Central Macedonia, Greece
Interests: image processing; computer vision; computer graphics; pattern recognition; virtual and augmented reality; multimedia systems and applications

Special Issue Information

Dear Colleagues,

Deep learning has become a highly popular trend in the machine learning community in recent years, although the term was coined several decades ago. The idea behind deep learning is to imitate the function of the human brain by constructing an artificial neural network with multiple hidden layers, in order to learn better features than a conventional shallow network can. More precisely, deep learning introduces a hierarchical learning architecture that resembles the layered learning process taking place in the primary sensory areas of the neocortex. It has been shown that, as the size of the input dataset grows beyond a certain point, the performance of deep networks increases at a much higher rate than that of shallow networks. This has enabled the practical use of deep neural networks in recent years, since a vast amount of unlabeled multimedia information is now available and the processing capability of modern computers has risen immensely.
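
To make the contrast concrete, the following sketch defines a conventional shallow network (one hidden layer) next to a deep network (several stacked hidden layers) in PyTorch; the framework choice and all layer sizes are illustrative assumptions, not part of the text above.

```python
import torch.nn as nn

# A conventional "shallow" network: a single hidden layer.
shallow = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 10),
)

# A "deep" network: same input/output sizes, but several stacked
# hidden layers, each building on features learned by the one below.
deep = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 10),
)
```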

Deep learning and related neural network architectures, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), have already been exploited in a wide variety of applications, including automatic text translation, spoken-language recognition, music composition, autonomous vehicles, robotics, medical diagnosis, and stock market prediction.

An especially popular field of deep learning applications has been that of computer vision and pattern recognition. Typical examples of areas where deep networks have been used are object detection, face detection and recognition, optical character recognition, and image classification. In this Special Issue, we welcome contributions from scholars in all related subjects, presenting either a deep learning solution to a novel application or a deep learning enhancement to a preexisting application.

Prof. Athanasios Nikolaidis
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1500 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Color restoration
  • Face detection
  • Pose estimation
  • Sentiment recognition
  • Behavior analysis
  • Text image translation
  • Automated lip reading
  • Image synthesis
  • Image classification
  • Handwriting recognition
  • Object detection
  • Object classification

Published Papers (7 papers)


Research


Open Access Article
1D Barcode Detection via Integrated Deep-Learning and Geometric Approach
Appl. Sci. 2019, 9(16), 3268; https://doi.org/10.3390/app9163268
Received: 29 June 2019 / Revised: 24 July 2019 / Accepted: 6 August 2019 / Published: 9 August 2019
Abstract
Vision-based 1D barcode reading has been the subject of extensive research in recent years due to the high demand for automation in various industrial settings. Existing approaches for detecting the image region of 1D barcodes are slow, imprecise, or both: deep-learning-based methods can locate the barcode region quickly but lack an adequate and accurate segmentation process, while simple geometric techniques localize weakly and incur unnecessary computational cost when processing high-resolution images. We propose integrating the deep-learning and geometric approaches, with the objective of robustly localizing barcodes against complicated backgrounds and accurately detecting the barcode within the localized region. Our integrated real-time solution combines the advantages of the two methods and requires no manual parameter tuning. Through extensive experimentation on standard benchmarks, we show that our integrated approach outperforms state-of-the-art methods by at least 5%.
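
The paper's exact pipeline is not reproduced here, but the division of labor it describes, a learned detector proposing a coarse region and cheap gradient-based geometry refining it, can be sketched as follows; the function, its thresholds, and the kernel size are all hypothetical, and OpenCV 4 is assumed.

```python
import cv2
import numpy as np

def refine_barcode_region(image, coarse_box):
    """Hypothetical geometric refinement inside a detector-proposed box.

    coarse_box = (x, y, w, h) from a deep detector (placeholder). Inside
    a 1D barcode, gradients across the bars dominate, so a directional
    gradient map separates the bars from the background.
    """
    x, y, w, h = coarse_box
    roi = cv2.cvtColor(image[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)

    # Strong horizontal gradients, weak vertical ones, mark the bars.
    gx = cv2.Sobel(roi, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(roi, cv2.CV_32F, 0, 1)
    grad = cv2.convertScaleAbs(np.abs(gx) - np.abs(gy))

    # Binarize, close the gaps between bars, keep the largest blob.
    _, mask = cv2.threshold(grad, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((3, 21), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return coarse_box  # fall back to the detector's box
    # Rotated rectangle tightly enclosing the barcode (ROI coordinates).
    return cv2.minAreaRect(max(contours, key=cv2.contourArea))
```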

Open Access Article
Periodic Surface Defect Detection in Steel Plates Based on Deep Learning
Appl. Sci. 2019, 9(15), 3127; https://doi.org/10.3390/app9153127
Received: 12 June 2019 / Revised: 15 July 2019 / Accepted: 27 July 2019 / Published: 1 August 2019
Abstract
Roll marks on hot-rolled steel plates are difficult to detect because they have low contrast in images. A periodic defect detection method based on a convolutional neural network (CNN) and long short-term memory (LSTM) is proposed to detect periodic defects, such as roll marks, by exploiting the strong time-sequenced characteristics of such defects. First, the features of the defect image are extracted by a CNN, and the extracted feature vectors are then fed into an LSTM network for defect recognition. Experiments show that the detection rate of this method is 81.9%, which is 10.2% higher than that of a CNN-only method. To make more accurate use of information from previous moments, the method is further improved with an attention mechanism, which quantifies the importance of the input at each previous moment and weights it accordingly. With this improvement, the detection rate increases to 86.2%.
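
As a rough illustration of the CNN-then-LSTM arrangement described above (not the authors' exact network; every layer size here is an assumption), a per-frame CNN encoder can feed its feature vectors into an LSTM whose final state drives the defect classifier:

```python
import torch.nn as nn

class CNNLSTMDetector(nn.Module):
    """Sketch: a CNN extracts per-image features, an LSTM models the
    time-sequenced (periodic) features, and the final LSTM state is
    classified as defect / no defect."""

    def __init__(self, feat_dim=128, hidden=64, num_classes=2):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, frames):            # frames: (batch, time, 1, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        out, _ = self.lstm(feats)         # out: (batch, time, hidden)
        return self.head(out[:, -1])      # classify from the last time step
```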

Open Access Article
An Integrated Wildlife Recognition Model Based on Multi-Branch Aggregation and Squeeze-And-Excitation Network
Appl. Sci. 2019, 9(14), 2794; https://doi.org/10.3390/app9142794
Received: 29 May 2019 / Revised: 25 June 2019 / Accepted: 9 July 2019 / Published: 12 July 2019
Abstract
Infrared camera trapping, which helps capture large volumes of wildlife images, is a widely used, non-intrusive monitoring method in wildlife surveillance. It can greatly reduce the workload of zoologists through automatic image identification. To achieve higher accuracy in wildlife recognition, an integrated model based on multi-branch aggregation and a Squeeze-and-Excitation network is introduced. This model adopts multi-branch aggregation transformations to extract features and uses Squeeze-and-Excitation blocks to adaptively recalibrate channel-wise feature responses by explicitly modeling interdependencies between channels. The efficacy of the integrated model is tested on two datasets: the Snapshot Serengeti dataset and our own dataset. On the Snapshot Serengeti dataset, the integrated model recognizes 26 wildlife species, with Top-1 (the correct class is the most probable class) and Top-5 (the correct class is among the five most probable classes) accuracies of 95.3% and 98.8%, respectively. On our own dataset, compared with the ROI-CNN algorithm and ResNet (Deep Residual Network), the integrated model shows a maximum improvement of 4.4% in recognition accuracy.
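
The Squeeze-and-Excitation block mentioned above is a published, well-documented component; a compact PyTorch rendition looks like the following (the reduction ratio of 16 is the common default, not a value taken from this paper):

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: global-average-pool each channel
    ("squeeze"), learn per-channel weights through a small bottleneck
    MLP ("excitation"), then rescale the feature map channel-wise."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                     # x: (batch, C, H, W)
        b, c = x.shape[:2]
        w = self.fc(self.pool(x).view(b, c))  # per-channel weights in (0, 1)
        return x * w.view(b, c, 1, 1)         # recalibrate channel responses
```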

Open Access Article
CP-SSD: Context Information Scene Perception Object Detection Based on SSD
Appl. Sci. 2019, 9(14), 2785; https://doi.org/10.3390/app9142785
Received: 20 June 2019 / Revised: 7 July 2019 / Accepted: 8 July 2019 / Published: 11 July 2019
Abstract
The Single Shot MultiBox Detector (SSD) has achieved good results in object detection, but problems remain, such as an insufficient grasp of context information and the loss of features in deep layers. To alleviate these problems, we propose a single-shot object detection network, Context Perception-SSD (CP-SSD). CP-SSD improves the network's understanding of context through scene perception modules that capture context information for objects of different scales. A semantic activation module is applied to the deep-layer feature maps; through self-supervised learning, it adjusts contextual feature information and channel interdependence to enhance useful semantic information. CP-SSD was validated on the benchmark dataset PASCAL VOC 2007. The experimental results show that the mean Average Precision (mAP) of CP-SSD reaches 77.8%, 0.6% higher than that of SSD, and that detection is significantly improved on images in which the object is difficult to distinguish from the background.
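
The context-perception and semantic-activation modules are this paper's own contribution and are not reconstructed here; the sketch below shows only a generic multi-dilation context block of the kind such designs commonly build on, with all channel counts and dilation rates assumed:

```python
import torch
import torch.nn as nn

class ContextBlock(nn.Module):
    """Generic context aggregation (not CP-SSD's module): parallel
    dilated convolutions view the same feature map at several
    receptive-field sizes, approximating scene-level context."""

    def __init__(self, channels):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in (1, 2, 4)                  # assumed dilation rates
        ])
        self.fuse = nn.Conv2d(3 * channels, channels, 1)

    def forward(self, x):
        ctx = torch.cat([branch(x) for branch in self.branches], dim=1)
        return x + self.fuse(ctx)               # residual fusion
```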

Open Access Article
Classification Method of Plug Seedlings Based on Transfer Learning
Appl. Sci. 2019, 9(13), 2725; https://doi.org/10.3390/app9132725
Received: 30 May 2019 / Revised: 28 June 2019 / Accepted: 30 June 2019 / Published: 5 July 2019
Abstract
The classification of plug seedlings is important work in the replanting process. This paper proposes a classification method for plug seedlings based on transfer learning. First, by extracting and graying the region of interest of the acquired image, a regional grayscale cumulative distribution curve is obtained, and the plug tray specification is identified by counting the peak points of the curve. Second, a transfer learning method based on convolutional neural networks is used to construct the classification model; in line with the growth characteristics of the seedlings, 2286 seedling samples at the two-leaf-and-one-heart stage were collected to train the model. Finally, the region-of-interest image is divided into cell images according to the tray specification, and each cell image is passed to the classification model, which labels it as a qualified seedling, an unqualified seedling, or an empty cell. In testing, the tray specification identification method achieved an average accuracy of 100% across the three specifications (50, 72, and 105 cells) for 20-day and 25-day pepper seedlings. Seedling classification models based on transfer learning with four different convolutional neural networks (AlexNet, Inception-v3, ResNet-18, VGG16) were constructed and tested. The VGG16-based model achieved the best classification accuracy, 95.50%, while the AlexNet-based model had the shortest training time, 6 min 8 s. This research offers a theoretical reference for intelligent replanting classification.
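
The first step, identifying the tray specification from peaks in a grayscale curve, can be sketched with a standard peak finder; the projection used for the curve, the peak parameters, and the mapping from peak count to tray size below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from scipy.signal import find_peaks

def identify_tray_spec(gray_roi):
    """Sketch: project the grayed tray region onto one axis and count
    the peaks; the peak count then indexes the tray specification."""
    profile = gray_roi.astype(np.float64).sum(axis=0)   # column-wise sums
    profile = (profile - profile.min()) / (np.ptp(profile) + 1e-9)
    peaks, _ = find_peaks(profile, prominence=0.1)      # tuned per setup
    # Hypothetical mapping from peaks per row to tray specification.
    spec_by_peak_count = {5: 50, 6: 72, 7: 105}
    return spec_by_peak_count.get(len(peaks))           # None if unknown
```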

Open Access Article
Multi-Task Learning Using Task Dependencies for Face Attributes Prediction
Appl. Sci. 2019, 9(12), 2535; https://doi.org/10.3390/app9122535
Received: 21 May 2019 / Revised: 14 June 2019 / Accepted: 19 June 2019 / Published: 21 June 2019
Abstract
Face attribute prediction has a growing number of applications in human–computer interaction, face verification, and video surveillance. Various studies show that dependencies exist among face attributes. A multi-task learning architecture can build synergy among correlated tasks through parameter sharing in the shared layers. However, most multi-task learning architectures ignore the dependencies between tasks in the task-specific layers, so further boosting the performance of individual tasks by exploiting task dependencies among face attributes is quite challenging. In this paper, we propose a multi-task learning architecture that uses task dependencies for face attribute prediction, and we evaluate its performance on the tasks of smile and gender prediction. Attention modules designed into the task-specific layers of the proposed architecture learn task-dependent disentangled representations. The experimental results demonstrate the effectiveness of the proposed network in comparison with a traditional multi-task learning architecture and state-of-the-art methods on the Faces of the World (FotW) and Labeled Faces in the Wild-a (LFWA) datasets.
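
The baseline this work improves on, hard parameter sharing with independent task heads, can be sketched as below; the attention modules that model task dependencies are the paper's contribution and are deliberately omitted, and all layer sizes are assumptions.

```python
import torch.nn as nn

class MultiTaskFaceNet(nn.Module):
    """Sketch of hard parameter sharing: one shared trunk builds a
    common face representation; separate heads predict each attribute."""

    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(                # shared layers
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.smile_head = nn.Linear(64, 2)         # task-specific layers
        self.gender_head = nn.Linear(64, 2)

    def forward(self, x):                          # x: (batch, 3, H, W)
        shared = self.trunk(x)
        return self.smile_head(shared), self.gender_head(shared)
```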

Review


Open Access Review
A Survey of Handwritten Character Recognition with MNIST and EMNIST
Appl. Sci. 2019, 9(15), 3169; https://doi.org/10.3390/app9153169
Received: 5 July 2019 / Revised: 27 July 2019 / Accepted: 2 August 2019 / Published: 4 August 2019
Abstract
This paper summarizes the top state-of-the-art contributions reported on the MNIST dataset for handwritten digit recognition. This dataset has been extensively used to validate novel techniques in computer vision, and in recent years many authors have explored the performance of convolutional neural networks (CNNs) and other deep learning techniques on it. To the best of our knowledge, this paper is the first exhaustive and updated review of this dataset; some online rankings exist, but they are outdated, and most published papers survey only closely related works, omitting most of the literature. This paper distinguishes between works that use some kind of data augmentation and works that use the original dataset out of the box. Works using CNNs are also reported separately, as they have become the state-of-the-art approach for solving this problem. Nowadays, a significant number of works have attained a test error rate below 1% on this dataset, which is thus becoming non-challenging. In mid-2017, a new dataset was introduced: EMNIST, which involves both digits and letters, with a larger amount of data acquired from a database different from MNIST's. In this paper, EMNIST is explained and some of its results are surveyed.
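
For orientation, the kind of small CNN that already approaches a sub-1% test error on MNIST takes only a few lines; the architecture below is a generic example, not any specific entry from the survey:

```python
import torch.nn as nn

# A generic small CNN for 28x28 grayscale MNIST digits; models of
# roughly this size, conventionally trained, reach about 99% accuracy.
mnist_cnn = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 28 -> 14
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 14 -> 7
    nn.Flatten(),
    nn.Linear(64 * 7 * 7, 128), nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(128, 10),
)
```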
