Open Access Article
Appl. Sci. 2019, 9(8), 1578; https://doi.org/10.3390/app9081578

Method of Feature Reduction in Short Text Classification Based on Feature Clustering

F. Li 1,†,‡, Y. Yin 1,‡, J. Shi 1,*, X. Mao 2,* and R. Shi 1
1 School of Computer Science and Engineering, Central South University, Changsha 410073, China
2 Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China
* Authors to whom correspondence should be addressed.
† Current address: Central South University, Changsha 410073, China.
‡ These authors contributed equally to this work.
Received: 15 February 2019 / Revised: 3 April 2019 / Accepted: 10 April 2019 / Published: 16 April 2019
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

A key problem in short text classification is the severe curse of dimensionality that arises when a statistics-based approach is used to construct the vector space. Here, a feature reduction method based on two-stage feature clustering (TSFC) is proposed and applied to short text classification. First, features are semi-loosely clustered by combining spectral clustering with a graph traversal algorithm. Next, intra-cluster feature screening rules are designed to remove outlier feature words, which improves the quality of the resulting similar feature clusters. Short texts are then classified using the corresponding similar feature clusters in place of the original feature words, so the dimension of the vector space is significantly reduced. Several classifiers are used to evaluate the effectiveness of the method. The results show that the method largely resolves the dimensionality problem and can significantly improve the accuracy of short text classification.
Keywords: feature reduction; feature clustering; short text classification; word embedding
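
The idea summarized in the abstract, replacing individual feature words with the clusters they belong to, can be illustrated with a small sketch. This is not the authors' TSFC implementation: spectral clustering is swapped here for a plain similarity threshold followed by breadth-first graph traversal, the screening rule (cosine similarity to the cluster centroid) is an assumed stand-in for the paper's intra-cluster rules, and all function names, thresholds, and the toy embeddings are hypothetical.

```python
import math
from collections import deque


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def cluster_features(embeddings, sim_threshold=0.8):
    """Stage 1 (simplified stand-in): link words whose similarity reaches
    the threshold, then collect connected components by breadth-first search."""
    words = sorted(embeddings)
    neighbors = {w: [] for w in words}
    for i, w in enumerate(words):
        for x in words[i + 1:]:
            if cosine(embeddings[w], embeddings[x]) >= sim_threshold:
                neighbors[w].append(x)
                neighbors[x].append(w)
    seen, clusters = set(), []
    for w in words:
        if w in seen:
            continue
        seen.add(w)
        queue, component = deque([w]), []
        while queue:
            current = queue.popleft()
            component.append(current)
            for n in neighbors[current]:
                if n not in seen:
                    seen.add(n)
                    queue.append(n)
        clusters.append(component)
    return clusters


def screen_cluster(cluster, embeddings, min_sim=0.7):
    """Stage 2 (assumed rule): drop words too dissimilar to the cluster centroid."""
    dim = len(embeddings[cluster[0]])
    centroid = [sum(embeddings[w][d] for w in cluster) / len(cluster)
                for d in range(dim)]
    return [w for w in cluster if cosine(embeddings[w], centroid) >= min_sim]


def build_word_to_cluster(embeddings, sim_threshold=0.8, min_sim=0.7):
    """Return (word -> cluster id, number of clusters). Screened-out words are unmapped."""
    clusters = cluster_features(embeddings, sim_threshold)
    mapping = {}
    for cid, cluster in enumerate(clusters):
        for w in screen_cluster(cluster, embeddings, min_sim):
            mapping[w] = cid
    return mapping, len(clusters)


def vectorize(tokens, word_to_cluster, n_clusters):
    """Represent a short text as counts over clusters instead of over words."""
    vec = [0] * n_clusters
    for t in tokens:
        cid = word_to_cluster.get(t)
        if cid is not None:
            vec[cid] += 1
    return vec


# Toy 2-dimensional "embeddings" (hypothetical values, for illustration only).
toy_embeddings = {
    "cat": [1.0, 0.1],
    "dog": [0.9, 0.2],
    "car": [0.1, 1.0],
    "bus": [0.2, 0.9],
}
word_to_cluster, n_clusters = build_word_to_cluster(toy_embeddings)
text_vector = vectorize(["cat", "dog", "bus"], word_to_cluster, n_clusters)
print(text_vector)  # [1, 2]: four feature words collapse into two cluster dimensions
```

In this toy case the vector space shrinks from four word dimensions to two cluster dimensions; with a real vocabulary the reduction would depend on the chosen thresholds and on the embedding quality.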
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

MDPI and ACS Style

Li, F.; Yin, Y.; Shi, J.; Mao, X.; Shi, R. Method of Feature Reduction in Short Text Classification Based on Feature Clustering. Appl. Sci. 2019, 9, 1578.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.
Appl. Sci. EISSN 2076-3417. Published by MDPI AG, Basel, Switzerland.