Open Access Article
Entropy 2015, 17(8), 5903-5919; doi:10.3390/e17085903

Maximal Repetitions in Written Texts: Finite Energy Hypothesis vs. Strong Hilberg Conjecture

Institute of Computer Science, Polish Academy of Sciences, ul. Jana Kazimierza 5, 01-248 Warszawa, Poland
Academic Editor: J. A. Tenreiro Machado
Received: 22 May 2015 / Revised: 17 August 2015 / Accepted: 19 August 2015 / Published: 21 August 2015
(This article belongs to the Section Complexity)

Abstract

The article discusses two mutually incompatible hypotheses about the stochastic mechanism of the generation of texts in natural language, which could be related to entropy. The first hypothesis, the finite energy hypothesis, assumes that texts are generated by a process with exponentially decaying probabilities. This hypothesis implies a logarithmic upper bound for maximal repetition, as a function of the text length. The second hypothesis, the strong Hilberg conjecture, assumes that the topological entropy grows as a power law. This hypothesis leads to a hyperlogarithmic lower bound for maximal repetition. By a study of 35 written texts in German, English and French, it is found that the hyperlogarithmic growth of maximal repetition holds for natural language. In this way, the finite energy hypothesis is rejected, and the strong Hilberg conjecture is partly corroborated.
Keywords: finite energy processes; Hilberg’s conjecture; entropy rate; maximal repetition; natural language
This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
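
The quantity studied in the abstract, maximal repetition as a function of text length, can be illustrated with a short sketch. It assumes that the maximal repetition of a text is the length of the longest substring occurring at least twice, with overlapping occurrences allowed; this definition is not spelled out on this page, so it should be checked against the full text. The function names `has_repeat` and `maximal_repetition` are hypothetical, and the quadratic-memory search is chosen for readability, not as the method used in the article.

```python
def has_repeat(text: str, k: int) -> bool:
    """Return True if some substring of length k occurs at least twice in text."""
    seen = set()
    for i in range(len(text) - k + 1):
        chunk = text[i:i + k]
        if chunk in seen:
            return True
        seen.add(chunk)
    return False


def maximal_repetition(text: str) -> int:
    """Length of the longest substring occurring at least twice (overlaps allowed).

    Assumed definition, see the lead-in. If a substring of length k repeats,
    so does its prefix of length k - 1, so the answer can be found by
    binary search over k.
    """
    lo, hi = 0, max(len(text) - 1, 0)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if has_repeat(text, mid):
            lo = mid       # a repeat of length mid exists; try longer
        else:
            hi = mid - 1   # no repeat of length mid; try shorter
    return lo


if __name__ == "__main__":
    for sample in ["abcd", "banana", "abcabc", "aaaa"]:
        print(sample, maximal_repetition(sample))
```

To compare the two hypotheses in this spirit, one could evaluate `maximal_repetition` on prefixes of increasing length and check whether the values stay within a constant multiple of the logarithm of the prefix length (the finite energy bound) or grow faster.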

MDPI and ACS Style

Dębowski, Ł. Maximal Repetitions in Written Texts: Finite Energy Hypothesis vs. Strong Hilberg Conjecture. Entropy 2015, 17, 5903-5919.

Entropy, EISSN 1099-4300, published by MDPI AG, Basel, Switzerland