Reprint

Information Theory and Language

Edited by
August 2020
244 pages
  • ISBN 978-3-03936-026-0 (Hardback)
  • ISBN 978-3-03936-027-7 (PDF)

This book is a reprint of the Special Issue Information Theory and Language that was published in Entropy.

Chemistry & Materials Science
Computer Science & Mathematics
Physical Sciences
Summary
“Information Theory and Language” is a collection of 12 articles that appeared recently in Entropy as part of a Special Issue of the same title. These contributions represent state-of-the-art interdisciplinary research at the interface of information theory and language studies. They concern in particular:
  • Applications of information-theoretic concepts such as Shannon and Rényi entropies, mutual information, and rate–distortion curves to the study of natural languages;
  • Mathematical work in information theory inspired by natural language phenomena, such as deriving moments of subword complexity or proving continuity of mutual information;
  • Empirical and theoretical investigation of quantitative laws of natural language such as Zipf’s law, Herdan’s law, and Menzerath–Altmann’s law;
  • Empirical and theoretical investigations of statistical language models, including recently developed neural language models, their entropies, and other parameters;
  • Standardizing language resources for statistical investigation of natural language;
  • Other topics concerning semantics, syntax, and critical phenomena.
The traditional divide between probabilistic and formal approaches to human language, cultivated in the disjoint scholarships of natural sciences and humanities, has been blurred in recent years, and this book can help point out potential areas for future cross-fertilization in research.
Format
  • Hardback
License
© 2020 by the authors; CC BY-NC-ND license
Keywords
generalized entropy; generalized divergence; Jensen–Shannon divergence; sample size; text length; Zipf’s law; Predictive Rate–Distortion; natural language; information bottleneck; neural variational inference; Brevity law; Menzerath–Altmann’s law; Herdan’s law; lognormal distribution; size-rank law; quantitative linguistics; Glissando corpus; scaling; speech; entropy; neural networks; entropy rate; crowd source; Amazon Mechanical Turk; Shannon entropy; language complexity; morphology; TTR; language model; Shannon information measures; fields; invariance of completion; chain rule; speech variance; communicative efficiency; sampling invariance; power laws; communicative distributions; Project Gutenberg; reproducibility; natural language processing; syntax; Pareto-optimality; bottleneck method; phase transitions; statistical mechanics; subword complexity; asymptotics; generating functions; saddle point method; probability; the Mellin transform; moments; abbreviation law; mutual information; statistical language models; statistical language laws; semantics; complexity; criticality; language resources