Open Access Article
Literal Pattern Analysis of Texts Written with the Multiple Form of Characters: A Comparative Study of the Human and Machine Styles
by
Kazuya Hayata
Sapporo Gakuin University, Ebetsu 069-8555, Japan
Entropy 2026, 28(1), 36; https://doi.org/10.3390/e28010036 (registering DOI)
Submission received: 30 October 2025 / Revised: 22 December 2025 / Accepted: 23 December 2025 / Published: 27 December 2025
Abstract
Apart from languages that have no written form, texts in nearly every language are written in a single script. Japanese is a rare exception: its texts are written in a mixture of three kinds of characters. By contrast, no text in a European language is written in a mixture of the Latin, Cyrillic, and Greek alphabets. For several currently available Japanese texts, we quantitatively analyze how the three kinds of characters are mixed, using a binary pattern approach applied to a sequence generated from each text by a prescribed procedure. Specifically, we consider the texts of the former and present constitutions, as well as a famous American story that has been translated into Japanese at least 13 times. For the story, the human translations are compared with four machine translations produced by DeepL and Google Translate. The Hellinger distance, chi-square value, normalized Shannon entropy, and Simpson’s diversity index are employed as metrics of divergence and diversity. Numerical results suggest that, in terms of entropy, the 17 translations form three clusters and that, overall, the machine-translated texts exhibit higher entropy than the human translations. This finding suggests that the present method can serve as a useful tool for stylometry and authorship attribution. A comparison with the diversity index confirms the capabilities of the entropic measure. Finally, the applicability of the method to the Japanese version of the periodic table of elements is investigated.
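To make three of the metrics named in the abstract concrete, the following is a minimal Python sketch. It is not the author's procedure: the binary pattern method itself is not specified in the abstract, so the sketch only computes the share of each character kind in a text and then applies textbook definitions of the normalized Shannon entropy, Simpson's diversity index (in the 1 - sum of squares form, one of several variants in use), and the Hellinger distance. It assumes the three kinds of characters are kanji, hiragana, and katakana, classified by Unicode block; the function names and sample sentences are hypothetical.

```python
import math
from collections import Counter

# Assumption: the "three kinds of characters" are kanji, hiragana, and katakana.
SCRIPTS = ("kanji", "hiragana", "katakana")


def script_of(ch):
    """Classify one character by Unicode block (a deliberate simplification)."""
    cp = ord(ch)
    if 0x3041 <= cp <= 0x309F:
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:
        return "kanji"
    return None  # punctuation, digits, Latin letters, etc. are ignored


def script_distribution(text):
    """Relative frequency of each script, in the order given by SCRIPTS."""
    counts = Counter(script_of(ch) for ch in text)
    total = sum(counts[s] for s in SCRIPTS)
    if total == 0:
        raise ValueError("text contains no kanji, hiragana, or katakana")
    return [counts[s] / total for s in SCRIPTS]


def normalized_entropy(p):
    """Shannon entropy divided by its maximum, log(len(p)); lies in [0, 1]."""
    h = -sum(x * math.log(x) for x in p if x > 0)
    return h / math.log(len(p))


def simpson_diversity(p):
    """Simpson's diversity index in the 1 - sum(p_i^2) form."""
    return 1.0 - sum(x * x for x in p)


def hellinger(p, q):
    """Hellinger distance between two discrete distributions; lies in [0, 1]."""
    return math.sqrt(
        sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)) / 2
    )


if __name__ == "__main__":
    mixed = script_distribution("私はコーヒーを飲みます")      # hypothetical sample
    kana_only = script_distribution("わたしはおちゃをのみます")  # hypothetical sample
    print("entropy (mixed):", normalized_entropy(mixed))
    print("Simpson (mixed):", simpson_diversity(mixed))
    print("Hellinger:", hellinger(mixed, kana_only))
```

Under these definitions, a text that uses the three kinds of characters in equal proportion has normalized entropy 1, while a text written in only one kind has entropy 0, which is the sense in which a higher entropy indicates a more even mixture.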