MDPI - Publisher of Open Access Journals

8 pages, 1538 KiB

Open Access Article

Human Action Recognition Based on Foreground Trajectory and Motion Difference Descriptors

by Suge Dong, Daidi Hu, Ruijun Li and Mingtao Ge

Appl. Sci. 2019, 9(10), 2126; https://doi.org/10.3390/app9102126 - 24 May 2019

Cited by 7 | Viewed by 2365

Abstract

To address the high trajectory redundancy and susceptibility to background interference of traditional dense-trajectory behavior recognition methods, a human action recognition method based on foreground trajectories and motion difference descriptors is proposed. First, the motion magnitude of each frame is estimated by optical flow, and the foreground region is determined from the motion magnitude of each pixel; trajectories are extracted only from behavior-related foreground regions. Second, to better describe the relative temporal information between different actions, a motion difference descriptor is introduced to describe the foreground trajectory, and a direction histogram of the motion difference is constructed from the direction of the motion difference per unit time at each trajectory point. Finally, a Fisher vector (FV) is used to encode the histogram features into video-level action features, and a support vector machine (SVM) classifies the action category. Experimental results show that this method extracts action-related trajectories more effectively and improves recognition accuracy by 7% compared to the traditional dense trajectory method.

(This article belongs to the Section Applied Industrial Technologies)
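
The first two steps of the abstract above (foreground selection by motion magnitude, then a direction histogram of motion differences along a trajectory) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the relative threshold `ratio`, the number of bins, and the magnitude weighting are all assumptions made for illustration, since the abstract does not give these details.

```python
import numpy as np

def foreground_mask(flow, ratio=0.5):
    """Mark pixels whose optical-flow magnitude is large.

    flow: (H, W, 2) array of per-pixel (dx, dy) displacements.
    ratio: hypothetical threshold, relative to the frame's maximum magnitude.
    """
    mag = np.linalg.norm(flow, axis=2)
    return mag > ratio * mag.max()

def motion_difference_histogram(track, n_bins=8):
    """Direction histogram of motion differences along one trajectory.

    track: (T, 2) array of (x, y) points, T >= 3.
    """
    disp = np.diff(track, axis=0)    # displacement per frame, shape (T-1, 2)
    mdiff = np.diff(disp, axis=0)    # change in displacement, shape (T-2, 2)
    angle = np.arctan2(mdiff[:, 1], mdiff[:, 0])
    weight = np.linalg.norm(mdiff, axis=1)   # assumed magnitude weighting
    hist, _ = np.histogram(angle, bins=n_bins,
                           range=(-np.pi, np.pi), weights=weight)
    total = hist.sum()
    # Normalize to unit sum; a constant-velocity track yields all zeros.
    return hist / total if total > 0 else hist
```

A trajectory with constant velocity has zero motion difference everywhere, so its histogram is empty; only changes in motion contribute, which is the property the descriptor relies on.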

15 pages, 4042 KiB

Open Access Article

Leaf Counting with Multi-Scale Convolutional Neural Network Features and Fisher Vector Coding

by Boran Jiang, Ping Wang, Shuo Zhuang, Maosong Li, Zhenfa Li and Zhihong Gong

Symmetry 2019, 11(4), 516; https://doi.org/10.3390/sym11040516 - 10 Apr 2019

Cited by 13 | Viewed by 3266

Abstract

The number of leaves on a maize plant is one of the key traits describing its growth condition. It is directly related to plant development, and leaf counts also give insight into changing plant development stages. Compared with traditional solutions, which need excessive human intervention, computer vision and machine learning methods are more efficient. However, leaf counting with computer vision remains a challenging problem, and a growing number of researchers are trying to improve its accuracy. To this end, an automated, deep-learning-based approach for counting leaves in maize plants is developed in this paper. A Convolutional Neural Network (CNN) is used to extract leaf features. The CNN model is inspired by Google Inception Net V3, which uses multi-scale convolution kernels in one convolution layer. To compress the feature maps generated by some of the middle layers of the CNN, the Fisher Vector (FV) is used to reduce redundant information. Finally, these encoded feature maps are used to regress the leaf number with Random Forests. To boost related research, our team constructed a single-maize image dataset (2845 samples covering different growth stages, with 80% used for training and 20% for testing). On this single-maize dataset, the proposed algorithm achieves a Mean Square Error (MSE) of 0.32.

18 pages, 2461 KiB

Open Access Article

A Concurrent and Hierarchy Target Learning Architecture for Classification in SAR Application

by Mohamed Touafria and Qiang Yang

Sensors 2018, 18(10), 3218; https://doi.org/10.3390/s18103218 - 24 Sep 2018

Cited by 4 | Viewed by 3147

Abstract

This article addresses Automatic Target Recognition (ATR) on Synthetic Aperture Radar (SAR) images. By learning a hierarchy of features automatically from a massive amount of training data, learning networks such as Convolutional Neural Networks (CNNs) have recently achieved state-of-the-art results in many tasks. To extract better features of SAR targets and obtain better accuracy, a new framework is proposed. First, three CNN models based on different convolution and pooling kernel sizes are proposed. Second, they are applied simultaneously to the SAR images to generate image features by extracting CNN features from different layers in two scenarios. In the first scenario, the activation vectors obtained from the fully connected layers are taken as the final image features; in the second scenario, dense features are extracted from the last convolutional layer and then encoded into global image features with Fisher Vectors (FVs), a commonly used feature coding approach. Finally, different combination and fusion approaches between the two sets of experiments are considered to construct the final representation of the SAR images for classification. Extensive experiments are conducted on the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset. The experimental results demonstrate the capability of the proposed method compared to several state-of-the-art methods.

(This article belongs to the Special Issue Automatic Target Recognition of High Resolution SAR/ISAR Images)

28 pages, 3976 KiB

Open Access Article

Fisher Vector Coding for Covariance Matrix Descriptors Based on the Log-Euclidean and Affine Invariant Riemannian Metrics

by Ioana Ilea, Lionel Bombrun, Salem Said and Yannick Berthoumieu

J. Imaging 2018, 4(7), 85; https://doi.org/10.3390/jimaging4070085 - 22 Jun 2018

Cited by 9 | Viewed by 5260

Abstract

This paper presents an overview of coding methods used to encode a set of covariance matrices. Starting from a Gaussian mixture model (GMM) adapted to the Log-Euclidean (LE) or affine invariant Riemannian metric, we propose a Fisher Vector (FV) descriptor adapted to each of these metrics: the Log-Euclidean Fisher Vectors (LE FV) and the Riemannian Fisher Vectors (RFV). Experiments on texture and head pose image classification are conducted to compare these two metrics and to illustrate the potential of these FV-based descriptors compared to state-of-the-art BoW and VLAD-based descriptors. Particular attention is given to illustrating the advantage of using the Fisher information matrix during the derivation of the FV. Finally, some experiments are conducted to provide a fairer comparison between the different coding strategies. These include comparisons between anisotropic and isotropic models, and an estimation performance analysis of the GMM dispersion parameter for covariance matrices of large dimension.

12 pages, 3029 KiB

Open Access Article

Local Patch Vectors Encoded by Fisher Vectors for Image Classification

by Shuangshuang Chen, Huiyi Liu, Xiaoqin Zeng, Subin Qian, Wei Wei, Guomin Wu and Baobin Duan

Information 2018, 9(2), 38; https://doi.org/10.3390/info9020038 - 9 Feb 2018

Cited by 6 | Viewed by 5200

Abstract

The objective of this work is image classification, whose purpose is to group images into corresponding semantic categories. Four contributions are made: (i) for computational simplicity and efficiency, we directly adopt raw image patch vectors as local descriptors, which are subsequently encoded by Fisher vectors (FV); (ii) to obtain representative local features within the FV encoding framework, we compare and analyze three typical sampling strategies: random sampling, saliency-based sampling and dense sampling; (iii) to embed both global and local spatial information into local features, we construct an improved spatial geometry structure which shows good performance; (iv) to reduce the storage and CPU costs of high-dimensional vectors, we adopt a new feature selection method based on supervised mutual information (MI), which chooses features by an importance sorting algorithm. We report experimental results on the STL-10 dataset, where this simple and efficient framework shows very promising performance compared to conventional methods.

(This article belongs to the Section Artificial Intelligence)
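
Fisher vector encoding of a set of local descriptors, the step shared by all the articles in this listing, can be sketched as follows. This is a generic NumPy illustration of the widely used formulation (gradients with respect to the means and variances of a diagonal-covariance GMM, followed by signed square-root and L2 normalization), not the specific variant used in any one of these papers. The function name and argument layout are assumptions, and the GMM parameters are taken as already fitted.

```python
import numpy as np

def fisher_vector(X, weights, means, variances):
    """Encode descriptors X with a diagonal-covariance GMM.

    X: (N, D) local descriptors.
    weights: (K,), means: (K, D), variances: (K, D) GMM parameters.
    Returns a vector of length 2 * K * D.
    """
    N = X.shape[0]
    sigma = np.sqrt(variances)
    # Whitened residuals of every descriptor to every component: (N, K, D)
    diff = (X[:, None, :] - means[None, :, :]) / sigma[None, :, :]
    # Log of w_k * N(x | mu_k, diag(var_k)), computed stably
    log_prob = (np.log(weights)[None, :]
                - 0.5 * np.sum(diff ** 2, axis=2)
                - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)[None, :])
    log_prob -= log_prob.max(axis=1, keepdims=True)
    gamma = np.exp(log_prob)
    gamma /= gamma.sum(axis=1, keepdims=True)   # posteriors, (N, K)
    # Gradients with respect to means and standard deviations
    g_mu = np.einsum('nk,nkd->kd', gamma, diff)
    g_mu /= N * np.sqrt(weights)[:, None]
    g_sigma = np.einsum('nk,nkd->kd', gamma, diff ** 2 - 1)
    g_sigma /= N * np.sqrt(2 * weights)[:, None]
    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))      # signed square-root
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv        # L2 normalization
```

The output length grows as 2KD, which is why several of the abstracts above pair FV encoding with a feature selection or compression step.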

17 pages, 4703 KiB

Open Access Article

Remote Sensing Image Scene Classification Using Multi-Scale Completed Local Binary Patterns and Fisher Vectors

by Longhui Huang, Chen Chen, Wei Li and Qian Du

Remote Sens. 2016, 8(6), 483; https://doi.org/10.3390/rs8060483 - 8 Jun 2016

Cited by 154 | Viewed by 9737

Abstract

An effective remote sensing image scene classification approach using patch-based multi-scale completed local binary pattern (MS-CLBP) features and a Fisher vector (FV) is proposed. The approach extracts a set of local patch descriptors by partitioning an image and its multi-scale versions into dense patches and using the CLBP descriptor to characterize local rotation invariant texture information. Then, Fisher vector encoding is used to encode the local patch descriptors (i.e., patch-based CLBP features) into a discriminative representation. To improve the discriminative power of feature representation, multiple sets of parameters are used for CLBP to generate multiple FVs that are concatenated as the final representation for an image. A kernel-based extreme learning machine (KELM) is then employed for classification. The proposed method is extensively evaluated on two public benchmark remote sensing image datasets (i.e., the 21-class land-use dataset and the 19-class satellite scene dataset) and leads to superior classification performance (93.00% for the 21-class dataset with an improvement of approximately 3% when compared with the state-of-the-art MS-CLBP and 94.32% for the 19-class dataset with an improvement of approximately 1%).

17 pages, 4939 KiB

Open Access Article

Hierarchical Coding Vectors for Scene Level Land-Use Classification

by Hang Wu, Baozhen Liu, Weihua Su, Wenchang Zhang and Jinggong Sun

Remote Sens. 2016, 8(5), 436; https://doi.org/10.3390/rs8050436 - 23 May 2016

Cited by 48 | Viewed by 6504

Abstract

Land-use classification from remote sensing images has become an important but challenging task. This paper proposes Hierarchical Coding Vectors (HCV), a novel representation based on hierarchical coding structures, for scene level land-use classification. We stack multiple Bag of Visual Words (BOVW) coding layers and one Fisher coding layer to develop the hierarchical feature learning structure. In the BOVW coding layers, we extract local descriptors from a geographical image with densely sampled interest points, and encode them using soft assignment (SA). The Fisher coding layer encodes those semi-local features with Fisher vectors (FV) and aggregates them to develop a final global representation. The graphical semantic information is refined by feeding the output of one layer into the next computation layer. HCV describes the geographical images through a high-level representation of richer semantic information by using a hierarchical coding structure. The experimental results on the 21-Class Land Use (LU) and RSSCN7 image databases indicate the effectiveness of the proposed HCV. Combined with the standard FV, our method (FV + HCV) achieves superior performance compared to the state-of-the-art methods on the two databases, obtaining an average classification accuracy of 91.5% on the LU database and 86.4% on the RSSCN7 database.

Search Results (7)
