Open Access Article

CoFea: A Novel Approach to Spam Review Identification Based on Entropy and Co-Training

1 Center for Research on Big Data Sciences, Beijing University of Chemical Technology, Beijing 100029, China
2 School of Knowledge Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi City, Ishikawa 923-1292, Japan
3 Institute of Policy and Management, Chinese Academy of Sciences, Beijing 100190, China
* Author to whom correspondence should be addressed.
Entropy 2016, 18(12), 429; https://doi.org/10.3390/e18120429
Received: 20 October 2016 / Accepted: 28 November 2016 / Published: 30 November 2016
With the rapid development of electronic commerce, spam reviews intended to manipulate online customers’ opinions of the goods being sold are proliferating on the Internet. This paper proposes a novel approach, called CoFea (Co-training by Features), to identify spam reviews, based on entropy and the co-training algorithm. After sorting all lexical terms of the reviews by entropy, we produce two views of the reviews by dividing the lexical terms into two subsets: one contains the terms at odd-numbered positions in the sorted list, and the other contains the terms at even-numbered positions. Using SVM (support vector machine) as the base classifier, we further propose two strategies, CoFea-T and CoFea-S, embedded in the CoFea approach. The CoFea-T strategy uses all terms in the subsets for spam review identification by SVM. The CoFea-S strategy uses only a predefined number of terms with small entropy for spam review identification by SVM. The experimental results show that the CoFea-T strategy achieves better accuracy than the CoFea-S strategy, while the CoFea-S strategy requires less computing time than the CoFea-T strategy and still achieves acceptable accuracy in spam review identification.
Keywords: spam review; co-training; CoFea
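
The view construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the entropy of a term is the Shannon entropy of its occurrence counts across reviews (the abstract does not give the formula), that terms are whitespace-separated tokens, and that CoFea-S keeps the smallest-entropy terms before the odd/even split. The function names `term_entropies` and `split_views` and the parameter `top_k` are hypothetical. The SVM co-training loop itself is not shown.

```python
import math
from collections import Counter


def term_entropies(reviews):
    """Entropy of each term's occurrence distribution over the reviews.

    ASSUMPTION: entropy is computed from per-review term counts; the
    abstract does not state the exact definition used in the paper.
    """
    per_term = {}  # term -> {review index: count in that review}
    for i, review in enumerate(reviews):
        for term, n in Counter(review.lower().split()).items():
            per_term.setdefault(term, {})[i] = n

    entropies = {}
    for term, counts in per_term.items():
        total = sum(counts.values())
        entropies[term] = -sum(
            (n / total) * math.log2(n / total) for n in counts.values()
        )
    return entropies


def split_views(reviews, top_k=None):
    """Sort terms by ascending entropy and split them into two views.

    top_k=None keeps all terms (in the spirit of CoFea-T); an integer
    keeps only the top_k smallest-entropy terms (in the spirit of
    CoFea-S). Ties are broken alphabetically, which is an assumption.
    """
    ent = term_entropies(reviews)
    ranked = sorted(ent, key=lambda t: (ent[t], t))
    if top_k is not None:
        ranked = ranked[:top_k]
    view_odd = ranked[0::2]   # 1st, 3rd, 5th, ... term in the sorted list
    view_even = ranked[1::2]  # 2nd, 4th, 6th, ... term in the sorted list
    return view_odd, view_even
```

Each view would then serve as the feature set for one of the two SVM classifiers in co-training; calling `split_views(reviews, top_k=k)` restricts both views to the `k` smallest-entropy terms.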
MDPI and ACS Style

Zhang, W.; Bu, C.; Yoshida, T.; Zhang, S. CoFea: A Novel Approach to Spam Review Identification Based on Entropy and Co-Training. Entropy 2016, 18, 429.

