Open Access Article
Entropy 2010, 12(7), 1743-1764; doi:10.3390/e12071743

Fitting Ranked Linguistic Data with Two-Parameter Functions

1 Feinstein Institute for Medical Research, North Shore LIJ Health Systems, 350 Community Drive, Manhasset, NY 11030, USA
2 Departamento de Matemáticas, Facultad de Ciencias, Universidad Nacional Autónoma de México, Circuito Exterior, Ciudad Universitaria, México 04510 DF, Mexico
3 Departamento de Sistemas Complejos, Instituto de Física, Universidad Nacional Autónoma de México, Apartado Postal 20-364, México 01000 DF, Mexico
4 Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México, Circuito Escolar, Ciudad Universitaria, México 04510 DF, Mexico
* Author to whom correspondence should be addressed.
Received: 4 April 2010 / Revised: 18 May 2010 / Accepted: 1 July 2010 / Published: 7 July 2010
(This article belongs to the Special Issue Complexity of Human Language and Cognition)

Abstract

It is well known that many ranked linguistic data sets are fit well by one-parameter models, such as Zipf’s law for ranked word frequencies. However, where the data deviate from the one-parameter model (these deviations occur at the two extremes of the rank range), it is natural to add one more parameter to the fitting model. In this paper, we compare several two-parameter models, including the Beta, Yule, and Weibull functions, all of which can be framed as multiple regressions on the logarithmic scale, in how well they fit several ranked linguistic data sets, such as letter frequencies, word spacings, and word frequencies. We observe that the Beta function fits the ranked letter frequencies best, the Yule function fits the ranked word-spacing distribution best, and the Altmann, Beta, and Yule functions all slightly outperform Zipf’s power-law function for the ranked word-frequency distribution.
Keywords: Zipf’s law; regression; model selection; Beta function; letter frequency distribution; word-spacing distribution; word frequency distribution; weighting
This is an open access article distributed under the Creative Commons Attribution License (CC BY 3.0).
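
The abstract's remark that these models can be framed as multiple regressions on the logarithmic scale can be illustrated with a minimal sketch. It assumes the functional forms commonly given for these models, which the abstract itself does not spell out: Zipf f = C / r^a, Beta f = C (N + 1 - r)^b / r^a, and Yule f = C b^r / r^a, where r is the rank and N the number of ranked items. Taking logarithms makes each model linear in its coefficients, so ordinary least squares applies. The helper names (`least_squares`, `fit_zipf`, `fit_beta`, `fit_yule`) are hypothetical, the fit is unweighted, and the demonstration data are synthetic.

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares(columns, y):
    """Unweighted least squares via the normal equations.

    Returns [intercept, coef_1, ..., coef_k] for the k predictor columns.
    """
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve_linear(xtx, xty)


def _logs(freqs):
    """Return ranks 1..N, log(rank), and log(frequency) for positive freqs."""
    ranks = list(range(1, len(freqs) + 1))
    return ranks, [math.log(r) for r in ranks], [math.log(f) for f in freqs]


def fit_zipf(freqs):
    """Assumed form: log f = log C - a log r (one shape parameter)."""
    _, log_r, log_f = _logs(freqs)
    c0, c1 = least_squares([log_r], log_f)
    return {"C": math.exp(c0), "a": -c1}


def fit_beta(freqs):
    """Assumed form: log f = log C - a log r + b log(N + 1 - r)."""
    ranks, log_r, log_f = _logs(freqs)
    n = len(freqs)
    log_tail = [math.log(n + 1 - r) for r in ranks]
    c0, c1, c2 = least_squares([log_r, log_tail], log_f)
    return {"C": math.exp(c0), "a": -c1, "b": c2}


def fit_yule(freqs):
    """Assumed form: log f = log C - a log r + r log b."""
    ranks, log_r, log_f = _logs(freqs)
    c0, c1, c2 = least_squares([log_r, [float(r) for r in ranks]], log_f)
    return {"C": math.exp(c0), "a": -c1, "b": math.exp(c2)}


if __name__ == "__main__":
    # Synthetic, noise-free data generated from the assumed Beta form (N = 26).
    synthetic = [0.2 * (27 - r) ** 0.8 / r ** 0.4 for r in range(1, 27)]
    print(fit_beta(synthetic))
    print(fit_zipf(synthetic))
```

Fitting in log space keeps the problem linear, but it implicitly gives equal weight to every rank on the logarithmic scale; a weighted regression would change the normal equations and is not shown here.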


MDPI and ACS Style

Li, W.; Miramontes, P.; Cocho, G. Fitting Ranked Linguistic Data with Two-Parameter Functions. Entropy 2010, 12, 1743-1764.


Entropy, EISSN 1099-4300. Published by MDPI AG, Basel, Switzerland.