Deep Learning for Image and Video Understanding

A special issue of Algorithms (ISSN 1999-4893). This special issue belongs to the section "Evolutionary Algorithms and Machine Learning".

Deadline for manuscript submissions: closed (31 March 2019) | Viewed by 31516

Special Issue Editors

Dr. Adil Mehmood Khan

Guest Editor

Machine Learning & Knowledge Representation Lab, Innopolis University, Innopolis, Russia
Interests: machine learning; data mining; pattern recognition; context-aware computing; intelligent systems; data modeling and analysis

Prof. Adín Ramírez Rivera

Guest Editor

Institute of Computing, Universidade Estadual de Campinas, Brazil
Interests: Image Processing; Computer Vision; Machine Learning

Special Issue Information

Dear Colleagues,

Comically trivial tasks, such as recognizing a handwritten digit in an image, become dauntingly difficult when we try to automate them by writing a computer program. However, thanks to artificial neural networks, and deep learning in particular, we have found a solution to this problem. These methods learn to model complex problems as layered representations of simple concepts, directly from data, without requiring hand-crafted features or hard-coded expert knowledge.

Deep learning methods are therefore being employed on a large scale to solve computer vision problems. We invite you to submit your latest research in the area of deep learning and computer vision to this Special Issue, “Deep Learning for Image and Video Understanding.” We are looking for new and innovative deep learning approaches to problems such as object detection, segmentation, recognition, tracking, and action recognition.

High-quality papers are solicited to address both theoretical and practical issues of deep learning algorithms. Submissions are welcome for both traditional computer vision problems and new applications. Potential topics include, but are not limited to:

K-shot learning for image and video understanding
Open set recognition for image and video understanding
Small-data learning for image and video understanding
Hierarchical and ensemble learning for image and video understanding
Data augmentation and transfer learning for image and video understanding
Semantic segmentation
Representation learning, feature detection and description for image and video understanding
Scene modeling and reconstruction
Scene understanding
Object detection, recognition and classification
Object pose estimation and tracking for image and video understanding
Person detection, tracking and identification for image and video understanding
Action and activity recognition for image and video understanding
Video annotation

Prof. Adil Mehmood Khan
Prof. Adín Ramírez Rivera
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

deep learning
computer vision
image processing

Benefits of Publishing in a Special Issue

Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (5 papers)

Research

13 pages, 4317 KiB

Open Access Article

Refinement of Background-Subtraction Methods Based on Convolutional Neural Network Features for Dynamic Background

by Tianming Yu, Jianhua Yang and Wei Lu

Algorithms 2019, 12(7), 128; https://doi.org/10.3390/a12070128 - 27 Jun 2019

Cited by 5 | Viewed by 4468

Abstract

Advancing the background-subtraction method in dynamic scenes is an ongoing timely goal for many researchers. Recently, background subtraction methods have been developed with deep convolutional features, which have improved their performance. However, most of these deep methods are supervised, only available for a certain scene, and have high computational cost. In contrast, the traditional background subtraction methods have low computational costs and can be applied to general scenes. Therefore, in this paper, we propose an unsupervised and concise method based on the features learned from a deep convolutional neural network to refine the traditional background subtraction methods. For the proposed method, the low-level features of an input image are extracted from the lower layer of a pretrained convolutional neural network, and the main features are retained to further establish the dynamic background model. The evaluation of the experiments on dynamic scenes demonstrates that the proposed method significantly improves the performance of traditional background subtraction methods.

(This article belongs to the Special Issue Deep Learning for Image and Video Understanding)

16 pages, 3842 KiB

Open Access Feature Paper Article

Triplet Loss Network for Unsupervised Domain Adaptation

by Imad Eddine Ibrahim Bekkouch, Youssef Youssry, Rustam Gafarov, Adil Khan and Asad Masood Khattak

Algorithms 2019, 12(5), 96; https://doi.org/10.3390/a12050096 - 8 May 2019

Cited by 13 | Viewed by 8891

Abstract

Domain adaptation is a sub-field of transfer learning that aims at bridging the dissimilarity gap between different domains by transferring and re-using the knowledge obtained in the source domain to the target domain. Many methods have been proposed to resolve this problem, using techniques such as generative adversarial networks (GAN), but the complexity of such methods makes it hard to use them in different problems, as fine-tuning such networks is usually a time-consuming task. In this paper, we propose a method for unsupervised domain adaptation that is both simple and effective. Our model (referred to as TripNet) harnesses the idea of a discriminator and Linear Discriminant Analysis (LDA) to push the encoder to generate domain-invariant features that are category-informative. At the same time, pseudo-labelling is used for the target data to train the classifier and to bring the same classes from both domains together. We evaluate TripNet against several existing, state-of-the-art methods on three image classification tasks: digit classification (MNIST, SVHN, and USPS datasets), object recognition (Office31 dataset), and traffic sign recognition (GTSRB and Synthetic Signs datasets). Our experimental results demonstrate that (i) TripNet beats almost all existing methods of similar simplicity on all of these tasks; and (ii) it even beats, in some cases, models that are significantly more complex (or harder to train) than TripNet. Hence, the results confirm the effectiveness of using TripNet for unsupervised domain adaptation in image classification.

14 pages, 8047 KiB

Open Access Article

Learning an Efficient Convolution Neural Network for Pansharpening

by Yecai Guo, Fei Ye and Hao Gong

Algorithms 2019, 12(1), 16; https://doi.org/10.3390/a12010016 - 8 Jan 2019

Cited by 9 | Viewed by 5811

Abstract

Pansharpening is a domain-specific task of satellite imagery processing, which aims at fusing a multispectral image with a corresponding panchromatic one to enhance the spatial resolution of the multispectral image. Most existing traditional methods fuse multispectral and panchromatic images in a linear manner, which greatly restricts the fusion accuracy. In this paper, we propose a highly efficient inference network to cope with pansharpening, which breaks the linear limitation of traditional methods. In the network, we adopt a dilated multilevel block coupled with a skip connection to perform local and overall compensation. By using the dilated multilevel block, the proposed model can make full use of the extracted features and enlarge the receptive field without introducing extra computational burden. Experimental results reveal that our network tends to achieve competitive or even superior pansharpening performance compared with deeper models. As our network is shallow and trained with several techniques to prevent overfitting, our model is robust to the inconsistencies across different satellites.

16 pages, 5416 KiB

Open Access Article

A Robust Visual Tracking Algorithm Based on Spatial-Temporal Context Hierarchical Response Fusion

by Wancheng Zhang, Yanmin Luo, Zhi Chen, Yongzhao Du, Daxin Zhu and Peizhong Liu

Algorithms 2019, 12(1), 8; https://doi.org/10.3390/a12010008 - 26 Dec 2018

Cited by 6 | Viewed by 6208

Abstract

Discriminative correlation filters (DCFs) have been shown to perform very well in visual object tracking. However, visual tracking is still challenging when the target objects undergo complex scenarios such as occlusion, deformation, scale changes and illumination changes. In this paper, we utilize the hierarchical features of convolutional neural networks (CNNs) and learn a spatial-temporal context correlation filter on convolutional layers. Then, the translation is estimated by fusing the response score of the filters on the three convolutional layers. In terms of scale estimation, we learn a discriminative correlation filter to estimate scale from the best confidence results. Furthermore, we propose a re-detection activation discrimination method to improve the robustness of visual tracking in the case of tracking failure and an adaptive model update method to reduce tracking drift caused by noisy updates. We evaluate the proposed tracker with DCFs and deep features on OTB benchmark datasets. The tracking results demonstrate that the proposed algorithm is superior to several state-of-the-art DCF methods in terms of accuracy and robustness.

15 pages, 11769 KiB

Open Access Article

A Study on Faster R-CNN-Based Subway Pedestrian Detection with ACE Enhancement

by Hongquan Qu, Meihan Wang, Changnian Zhang and Yun Wei

Algorithms 2018, 11(12), 192; https://doi.org/10.3390/a11120192 - 26 Nov 2018

Cited by 6 | Viewed by 5173

Abstract

At present, the problem of pedestrian detection has attracted increasing attention in the field of computer vision. Faster regions with convolutional neural network features (Faster R-CNN) is regarded as one of the most important techniques for studying this problem. However, the detection capability of a model trained with Faster R-CNN is susceptible to the diversity of pedestrians’ appearance and the light intensity in specific scenarios, such as in a subway, which can lead to a decline in recognition rate and an offset of target selection for pedestrians. In this paper, we propose a modified Faster R-CNN method with automatic color enhancement (ACE), which can improve sample contrast by calculating the relative light and dark relationship to correct the final pixel value. In addition, a calibration method based on sample categories reduction is presented to accurately locate the target for detection. Then, we choose the Faster R-CNN target detection framework on the experimental dataset. Finally, the effectiveness of this method is verified with actual data samples collected from subway passenger monitoring video.
