Open Access Article
Entropy 2015, 17(11), 7798-7810; doi:10.3390/e17117798

Word-Length Correlations and Memory in Large Texts: A Visibility Network Analysis

1 Unidad Interdisciplinaria en Ingeniería y Tecnologías Avanzadas, Instituto Politécnico Nacional, Av. IPN No. 2580, L. Ticomán, México D.F., 07340, Mexico
2 Facultad de Ciencias, Universidad Nacional Autónoma de México, Ciudad Universitaria, México D.F., 04510, Mexico
3 Departamento de Física, Escuela Superior de Física y Matemáticas, Instituto Politécnico Nacional, Edif. No. 9 U.P. Zacatenco, México D.F., 07738, Mexico
4 Departments of Physics and Psychology, Queens College, City University of New York, 65-30 Kissena Boulevard, SB B322, Flushing, NY 11367, USA
5 Adjunct Senior Research Scholar, Advanced Consortium on Cooperation, Conflict, and Complexity (AC4), Earth Institute, Columbia University, New York, NY 10027, USA
6 Physics Program, The Graduate Center, City University of New York, New York, NY 10016, USA
* Author to whom correspondence should be addressed.
Academic Editor: J. A. Tenreiro Machado
Received: 27 August 2015 / Revised: 12 November 2015 / Accepted: 13 November 2015 / Published: 20 November 2015
(This article belongs to the Section Complexity)

Abstract

We study the correlation properties of word lengths in large texts, using 30 English-language ebooks from the Gutenberg Project (www.gutenberg.org) and the natural visibility graph (NVG) method. The NVG method converts a time series into a graph whose topological properties can then be analyzed. First, the original sequence of words is transformed into a sequence of word lengths, which is then integrated. Next, we apply the NVG to the integrated word-length series and construct the network. We show that the degree distribution of that network follows a power law, P(k) ∼ k^{-γ}, with two regimes characterized by the exponents γ_s ≈ 1.7 (at small degree scales) and γ_l ≈ 1.3 (at large degree scales). This suggests that word lengths are much more strongly correlated at large distances between words than at short distances. This finding is also supported by detrended fluctuation analysis (DFA) and by the recurrence time distribution. These results provide new information about the universal characteristics of the structure of written texts, beyond that given by word frequencies.
Keywords: words frequency; words recurrence; syllables; texts
This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
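The abstract outlines a concrete pipeline: extract the length of each word, integrate that series, build the natural visibility graph (Lacasa et al., 2008) over the integrated series, and examine the degree distribution. The following Python sketch (not the authors' code) illustrates that pipeline on a toy text. The integration convention (cumulative sum of mean-subtracted lengths), the tokenization regex, and the sample sentence are illustrative assumptions, and the naive O(n²) visibility test shown here would be replaced by a faster divide-and-conquer construction for full ebooks.

```python
# Minimal sketch of the pipeline described in the abstract:
# word lengths -> integrated profile -> natural visibility graph (NVG)
# -> empirical degree distribution P(k).
import re
from collections import Counter

import numpy as np


def word_lengths(text: str) -> np.ndarray:
    """Length (in characters) of each word, in order of appearance."""
    return np.array([len(w) for w in re.findall(r"[A-Za-z']+", text)], dtype=float)


def integrate(series: np.ndarray) -> np.ndarray:
    """Profile of the series: cumulative sum of deviations from the mean
    (one common convention; the paper's exact integration step may differ)."""
    return np.cumsum(series - series.mean())


def natural_visibility_degrees(y: np.ndarray) -> np.ndarray:
    """Degree of each node in the natural visibility graph of y.

    Nodes i < j are linked when every intermediate point k lies strictly
    below the straight line joining (i, y[i]) and (j, y[j]).
    Naive quadratic version, adequate for a short demo series.
    """
    n = len(y)
    degree = np.zeros(n, dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            ks = np.arange(i + 1, j)
            # Visibility criterion: y[k] < y[j] + (y[i] - y[j]) * (j - k) / (j - i)
            if ks.size == 0 or np.all(y[ks] < y[j] + (y[i] - y[j]) * (j - ks) / (j - i)):
                degree[i] += 1
                degree[j] += 1
    return degree


if __name__ == "__main__":
    # Tiny illustrative text; the study itself uses 30 Project Gutenberg ebooks.
    sample = ("It was the best of times, it was the worst of times, "
              "it was the age of wisdom, it was the age of foolishness.")
    profile = integrate(word_lengths(sample))
    degrees = natural_visibility_degrees(profile)
    # Empirical degree distribution; the paper reports P(k) ~ k^{-gamma}
    # with two scaling regimes on much larger corpora.
    counts = Counter(degrees)
    total = sum(counts.values())
    for k in sorted(counts):
        print(f"k = {k:2d}   P(k) = {counts[k] / total:.3f}")
```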

Share & Cite This Article

MDPI and ACS Style

Guzmán-Vargas, L.; Obregón-Quintana, B.; Aguilar-Velázquez, D.; Hernández-Pérez, R.; Liebovitch, L.S. Word-Length Correlations and Memory in Large Texts: A Visibility Network Analysis. Entropy 2015, 17, 7798-7810.

