Open Access Article
Algorithms 2018, 11(8), 109; https://doi.org/10.3390/a11080109

Long Length Document Classification by Local Convolutional Feature Aggregation

1 School of Electronic and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China
2 State Grid Corporation of China, Beijing 100031, China
3 NARI Group Corporation of China/State Grid Electric Power Research Institute, Nanjing 211106, China
* Author to whom correspondence should be addressed.
Received: 7 June 2018 / Revised: 17 July 2018 / Accepted: 20 July 2018 / Published: 24 July 2018

Abstract

The exponential increase in online reviews and recommendations has made document classification and sentiment analysis an active topic in academic and industrial research. Traditional deep-learning-based document classification methods require the full text of a document to extract features. In this paper, to handle long documents, we propose three methods that use local convolutional feature aggregation for document classification. The first method randomly draws blocks of consecutive words from the full document. Each block is fed into a convolutional neural network to extract features, and the block features are then concatenated and passed to a classifier that outputs the class probabilities. The second model improves on the first by capturing the contextual order of the sampled blocks with a recurrent neural network. The third model is inspired by the recurrent attention model (RAM): a reinforcement learning module acts as a controller that selects the next block position based on the recurrent state. Experiments on our collected four-class arXiv paper dataset show that all three proposed models perform well, and that the RAM model achieves the best test accuracy while using the least information.
Keywords: document classification; deep learning; convolutional feature aggregation; recurrent neural network; recurrent attention model
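
The first method described in the abstract (random block sampling, per-block convolutional features, concatenation, classifier) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the single convolutional layer with ReLU and max-over-time pooling, the linear softmax classifier, and all sizes below are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

def sample_blocks(tokens, num_blocks, block_len, rng):
    """Draw num_blocks random windows of block_len consecutive tokens."""
    if len(tokens) < block_len:
        raise ValueError("document is shorter than one block")
    # rng.integers excludes the upper bound, so every window fits in the document.
    starts = rng.integers(0, len(tokens) - block_len + 1, size=num_blocks)
    return [tokens[s:s + block_len] for s in starts]

def conv_block_features(block_emb, filters):
    """1-D convolution over word positions, ReLU, then max-over-time pooling.

    block_emb: (block_len, emb_dim); filters: (num_filters, width, emb_dim).
    Returns a vector of shape (num_filters,).
    """
    width = filters.shape[1]
    positions = block_emb.shape[0] - width + 1
    windows = np.stack([block_emb[i:i + width] for i in range(positions)])
    activations = np.einsum("pwe,fwe->pf", windows, filters)
    return np.maximum(activations, 0.0).max(axis=0)

def classify(token_ids, embeddings, filters, W, b, num_blocks, block_len, rng):
    """Aggregate block features by concatenation and apply a softmax classifier."""
    blocks = sample_blocks(token_ids, num_blocks, block_len, rng)
    feats = np.concatenate(
        [conv_block_features(embeddings[blk], filters) for blk in blocks]
    )
    logits = feats @ W + b
    exp = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return exp / exp.sum()

# Hypothetical sizes; random, untrained weights for illustration only.
rng = np.random.default_rng(0)
vocab, emb_dim, num_filters, width = 50, 8, 4, 3
num_blocks, block_len, num_classes = 5, 20, 4
embeddings = rng.normal(size=(vocab, emb_dim))
filters = rng.normal(size=(num_filters, width, emb_dim))
W = rng.normal(size=(num_blocks * num_filters, num_classes))
b = np.zeros(num_classes)
doc = rng.integers(0, vocab, size=1000)  # a "long document" of 1000 token ids
probs = classify(doc, embeddings, filters, W, b, num_blocks, block_len, rng)
print(probs)
```

In this sketch the classifier sees only num_blocks × block_len words (100 of the 1000), which reflects the idea of classifying from local features instead of the full text. The second and third models would replace the concatenation with a recurrent network over the block features and replace the random start positions with positions chosen by a learned controller; neither is shown here.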
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
MDPI and ACS Style

Liu, L.; Liu, K.; Cong, Z.; Zhao, J.; Ji, Y.; He, J. Long Length Document Classification by Local Convolutional Feature Aggregation. Algorithms 2018, 11, 109.
