Open Access Article

Entropy-Based Approach for the Detection of Changes in Arabic Newspapers’ Content

1
Research Laboratory for Analysis and Modeling of Social Processes, Saint Petersburg State University, Universitetskaya nab. 7-9, Saint Petersburg 190000, Russia
2
Faculty of Mathematics and Mechanics, and Research Laboratory for Analysis and Modeling of Social Processes, Saint Petersburg State University, Universitetsky prospekt 28, Saint Petersburg 198504, Russia
3
Software Engineering Department, ORT Braude College of Engineering, Karmiel 21982, Israel
*
Author to whom correspondence should be addressed.
Entropy 2020, 22(4), 441; https://doi.org/10.3390/e22040441
Received: 8 March 2020 / Revised: 4 April 2020 / Accepted: 7 April 2020 / Published: 14 April 2020
(This article belongs to the Section Multidisciplinary Applications)
A new method is proposed for recognizing meaningful changes in social state from transformations of the linguistic content of Arabic newspapers, where detected alterations of the linguistic material serve as indicators. The approach operates in an “online” fashion and uses pre-trained vector representations of Arabic words. After a pre-processing stage, the words in each issue’s text are replaced by vectors obtained through a word embedding methodology. The approach characterizes a consistent linguistic template by the similarity of the embedded vectors, so a change in the distributions of the issue-based samples indicates a difference in the underlying newspaper template. The concept is implemented as a two-step procedure. The first step compares the similarity distribution of the current issue with that of the union of several of its predecessors; a repeated under-sampling approach combined with a two-sample test stabilizes the sampling and returns a collection of the resulting p-values. In the second step, the entropy of these collections is calculated sequentially, and the change points of the resulting time series indicate changes in the newspaper content. Numerical experiments on consecutive issues of several Arabic newspapers published during the Arab Spring period demonstrate the high reliability of the method.
Keywords: publishing model modeling; anomaly detection; word embedding
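
The two-step procedure summarized in the abstract can be illustrated with a rough sketch. The abstract does not name the two-sample test, the entropy estimator, or any parameter values, so everything specific here is an assumption made for illustration: a Kolmogorov-Smirnov test with an asymptotic p-value, Shannon entropy over a 10-bin histogram of the p-values, and the hypothetical names `ks_two_sample_pvalue`, `shannon_entropy`, `issue_entropy`, `repeats`, and `sample_size`. Each issue is assumed to be already reduced to a list of similarity values between embedded word vectors.

```python
import math
import random


def ks_two_sample_pvalue(x, y):
    """Asymptotic two-sample Kolmogorov-Smirnov p-value (assumed test choice)."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    # Walk both sorted samples and track the largest gap between the ECDFs.
    while i < n and j < m:
        v = min(xs[i], ys[j])
        while i < n and xs[i] <= v:
            i += 1
        while j < m and ys[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.3:
        # The Kolmogorov series converges poorly here; the p-value is ~1.
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * s))


def shannon_entropy(pvalues, bins=10):
    """Shannon entropy (in nats) of a histogram of p-values on [0, 1]."""
    counts = [0] * bins
    for p in pvalues:
        counts[min(int(p * bins), bins - 1)] += 1
    total = len(pvalues)
    return -sum(c / total * math.log(c / total) for c in counts if c)


def issue_entropy(current, previous, repeats=50, sample_size=100, rng=None):
    """Step 1: repeated under-sampling plus a two-sample test gives p-values.
    Step 2: the entropy of that collection of p-values is returned.

    current  -- similarity values of the current issue
    previous -- list of similarity-value lists for the preceding issues
    """
    rng = rng or random.Random(0)
    pooled = [s for issue in previous for s in issue]  # union of predecessors
    k = min(sample_size, len(current), len(pooled))
    pvals = [
        ks_two_sample_pvalue(rng.sample(current, k), rng.sample(pooled, k))
        for _ in range(repeats)
    ]
    return shannon_entropy(pvals)


if __name__ == "__main__":
    # Synthetic similarity values stand in for real newspaper data.
    gen = random.Random(1)
    history = [[gen.gauss(0.0, 1.0) for _ in range(300)] for _ in range(3)]
    stable_issue = [gen.gauss(0.0, 1.0) for _ in range(300)]
    changed_issue = [gen.gauss(2.0, 1.0) for _ in range(300)]
    print("stable issue entropy: ", issue_entropy(stable_issue, history))
    print("changed issue entropy:", issue_entropy(changed_issue, history))
```

Under these assumptions, an issue drawn from the same distribution as its predecessors yields p-values spread across the unit interval and therefore a comparatively high entropy, whereas a shifted issue concentrates the p-values near zero and lowers the entropy. Applying `issue_entropy` to each issue in turn gives a time series, and locating its change points would require a separate detector, which is not sketched here.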

MDPI and ACS Style

Bernikova, O.; Granichin, O.; Lemberg, D.; Redkin, O.; Volkovich, Z. Entropy-Based Approach for the Detection of Changes in Arabic Newspapers’ Content. Entropy 2020, 22, 441.

