Entropy 2010, 12(7), 1743-1764; doi:10.3390/e12071743

Fitting Ranked Linguistic Data with Two-Parameter Functions

Wentian Li, Pedro Miramontes and Germinal Cocho
Received: 4 April 2010; in revised form: 18 May 2010 / Accepted: 1 July 2010 / Published: 7 July 2010
(This article belongs to the Special Issue Complexity of Human Language and Cognition)
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract: It is well known that many ranked linguistic data are fitted well by one-parameter models, such as Zipf’s law for ranked word frequencies. However, where the data deviate from the one-parameter model (typically at the two extremes of the rank range), it is natural to add one more parameter to the fitting model. In this paper, we compare several two-parameter models, including the Beta function, the Yule function, and the Weibull function, all of which can be framed as a multiple regression in the logarithmic scale, in terms of how well they fit several ranked linguistic data sets, such as letter frequencies, word-spacings, and word frequencies. We observe that the Beta function fits the ranked letter frequencies best, the Yule function fits the ranked word-spacing distribution best, and the Altmann, Beta, and Yule functions all slightly outperform Zipf’s power-law function on the ranked word-frequency distribution.
Keywords: Zipf’s law; regression; model selection; Beta function; letter frequency distribution; word-spacing distribution; word frequency distribution; weighting
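
The abstract's remark that two-parameter models can be framed as a multiple regression in the logarithmic scale can be illustrated with a minimal sketch. It assumes the Beta rank function has the form f(r) = C (n + 1 - r)^b / r^a, which the abstract itself does not state, so that log f is linear in log r and log(n + 1 - r). The function name `fit_rank_model`, the synthetic data, and the parameter values are hypothetical and chosen only for illustration; this is not the authors' code, and it uses unweighted least squares.

```python
import numpy as np


def fit_rank_model(freqs, two_parameter=True):
    """Least-squares fit of a rank-frequency model in log scale.

    two_parameter=False: log f = log C - a*log(r)                     (Zipf-type power law)
    two_parameter=True:  log f = log C - a*log(r) + b*log(n + 1 - r)  (assumed Beta form)
    Returns (C, a, b, sse); b is 0.0 for the one-parameter fit.
    """
    f = np.sort(np.asarray(freqs, dtype=float))[::-1]  # rank 1 = largest value
    n = f.size
    r = np.arange(1, n + 1)
    columns = [np.ones(n), np.log(r)]
    if two_parameter:
        columns.append(np.log(n + 1 - r))
    X = np.column_stack(columns)  # regression design matrix
    y = np.log(f)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    sse = float(np.sum((y - X @ coef) ** 2))  # sum of squared log-scale residuals
    a = -coef[1]
    b = coef[2] if two_parameter else 0.0
    return float(np.exp(coef[0])), float(a), float(b), sse


# Synthetic, noise-free data generated from the assumed Beta form
# (illustrative values only, not real letter frequencies).
n = 26
ranks = np.arange(1, n + 1)
synthetic = 0.5 * (n + 1 - ranks) ** 0.8 / ranks ** 0.3

c_hat, a_hat, b_hat, sse_beta = fit_rank_model(synthetic, two_parameter=True)
_, a_zipf, _, sse_zipf = fit_rank_model(synthetic, two_parameter=False)
print(f"Beta fit: C={c_hat:.3f}, a={a_hat:.3f}, b={b_hat:.3f}, SSE={sse_beta:.2e}")
print(f"Zipf fit: a={a_zipf:.3f}, SSE={sse_zipf:.2e}")
```

Because the one-parameter model is nested in the two-parameter one, the raw sum of squared residuals can never favor the smaller model; a fair comparison would need a penalty for the extra parameter, such as an information criterion.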

MDPI and ACS Style

Li, W.; Miramontes, P.; Cocho, G. Fitting Ranked Linguistic Data with Two-Parameter Functions. Entropy 2010, 12, 1743-1764.

AMA Style

Li W, Miramontes P, Cocho G. Fitting Ranked Linguistic Data with Two-Parameter Functions. Entropy. 2010; 12(7):1743-1764.

Chicago/Turabian Style

Li, Wentian; Miramontes, Pedro; Cocho, Germinal. 2010. "Fitting Ranked Linguistic Data with Two-Parameter Functions." Entropy 12, no. 7: 1743-1764.

Entropy EISSN 1099-4300 Published by MDPI AG, Basel, Switzerland