Readability Indices Do Not Say It All on a Text Readability
Abstract
1. Introduction
2. A Readability Formula for Alphabetical Languages
3. Word Interval and Short-Term Memory
4. A Universal Readability Formula
5. A “Footprint” of Humans
6. Conclusions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
Appendix B
Language | | | Correlation Coefficient |
---|---|---|---|
Greek | 8.62 | 113.66 | −0.9477 |
Latin | 10.59 | 120.82 | −0.8666 |
Esperanto | 9.87 | 114.20 | −0.8803 |
French | 7.46 | 107.51 | −0.9311 |
Italian | 7.80 | 108.54 | −0.9065 |
Portuguese | 8.34 | 112.33 | −0.8261 |
Romanian | 8.08 | 111.11 | −0.8163 |
Spanish | 8.46 | 112.60 | −0.9061 |
Danish | 9.46 | 120.71 | −0.9182 |
English | 7.88 | 110.23 | −0.9129 |
Finnish | 10.06 | 118.22 | −0.8057 |
German | 8.68 | 113.79 | −0.8563 |
Icelandic | 8.68 | 114.98 | −0.8848 |
Norwegian | 7.32 | 110.28 | −0.9426 |
Swedish | 7.32 | 109.98 | −0.9546 |
Bulgarian | 9.00 | 117.63 | −0.8697 |
Czech | 10.41 | 125.50 | −0.8269 |
Croatian | 9.86 | 122.33 | −0.8868 |
Polish | 9.98 | 123.60 | −0.7160 |
Russian | 10.70 | 118.04 | −0.7326 |
Serbian | 8.71 | 117.24 | −0.8312 |
Slovak | 10.03 | 124.83 | −0.8417 |
Ukrainian | 8.34 | 113.42 | −0.7092 |
Estonian | 9.97 | 120.11 | −0.8643 |
Hungarian | 10.83 | 118.91 | −0.8034 |
Albanian | 8.01 | 107.04 | −0.8776 |
Armenian | 12.11 | 133.87 | −0.7805 |
Welsh | 7.74 | 103.12 | −0.7828 |
Basque | 9.99 | 117.48 | −0.8361 |
Hebrew | 10.27 | 129.58 | −0.8163 |
Cebuano | 6.97 | 107.50 | −0.9683 |
Tagalog | 7.78 | 112.54 | −0.9188 |
Chichewa | 8.40 | 118.76 | −0.9325 |
Luganda | 8.69 | 118.42 | −0.8713 |
Somali | 8.65 | 113.41 | −0.9492 |
Haitian | 8.25 | 115.41 | −0.9132 |
Nahuatl | 7.55 | 113.02 | −0.9420 |
Overall | 8.94 ± 1.22 | 116.00 ± 6.49 | −0.8681 ± 0.0661 |
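The "Overall" row can be checked directly against the per-language rows. The sketch below does this for the first numeric column only, whose label did not survive in this copy. It assumes the ± figure is a sample standard deviation (n − 1 in the denominator), which the table does not state; the names `first_column` and `overall_row` are illustrative.

```python
from statistics import mean, stdev

# First numeric column of the table above, in row order (Greek ... Nahuatl).
first_column = [
    8.62, 10.59, 9.87, 7.46, 7.80, 8.34, 8.08, 8.46, 9.46, 7.88,
    10.06, 8.68, 8.68, 7.32, 7.32, 9.00, 10.41, 9.86, 9.98, 10.70,
    8.71, 10.03, 8.34, 9.97, 10.83, 8.01, 12.11, 7.74, 9.99, 10.27,
    6.97, 7.78, 8.40, 8.69, 8.65, 8.25, 7.55,
]


def overall_row(values):
    """Return (mean, sample standard deviation) of one table column."""
    # Assumption: the published spread is a sample standard deviation.
    return mean(values), stdev(values)


col_mean, col_std = overall_row(first_column)
print(f"{col_mean:.2f} ± {col_std:.2f}")
```

Under that assumption the result agrees with the published 8.94 ± 1.22 to two decimals; a population standard deviation (`statistics.pstdev`) gives a slightly smaller spread.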
Language | Language Family | ||
---|---|---|---|
Greek | Hellenic | 4.86 | 0.92 |
Latin | Italic | 5.16 | 0.87 |
Esperanto | Constructed | 4.43 | 1.01 |
French | Romance | 4.20 | 1.07 |
Italian | Romance | 4.48 | 1.00 |
Portuguese | Romance | 4.43 | 1.01 |
Romanian | Romance | 4.34 | 1.03 |
Spanish | Romance | 4.30 | 1.04 |
Danish | Germanic | 4.14 | 1.08 |
English | Germanic | 4.24 | 1.06 |
Finnish | Uralic | 5.90 | 0.76 |
German | Germanic | 4.68 | 0.96 |
Icelandic | Germanic | 4.34 | 1.03 |
Norwegian | Germanic | 4.08 | 1.10 |
Swedish | Germanic | 4.23 | 1.06 |
Bulgarian | Balto-Slavic | 4.41 | 1.02 |
Czech | Balto-Slavic | 4.51 | 0.99 |
Croatian | Balto-Slavic | 4.39 | 1.02 |
Polish | Balto-Slavic | 5.10 | 0.88 |
Russian | Balto-Slavic | 4.67 | 0.96 |
Serbian | Balto-Slavic | 4.24 | 1.06 |
Slovak | Balto-Slavic | 4.65 | 0.96 |
Ukrainian | Balto-Slavic | 4.56 | 0.98 |
Estonian | Uralic | 4.89 | 0.92 |
Hungarian | Uralic | 5.31 | 0.84 |
Albanian | Albanian | 4.07 | 1.10 |
Armenian | Armenian | 4.75 | 0.94 |
Welsh | Celtic | 4.04 | 1.11 |
Basque | Isolate | 6.22 | 0.72 |
Hebrew | Semitic | 4.22 | 1.06 |
Cebuano | Austronesian | 4.65 | 0.96 |
Tagalog | Austronesian | 4.83 | 0.93 |
Chichewa | Niger-Congo | 6.08 | 0.74 |
Luganda | Niger-Congo | 6.23 | 0.72 |
Somali | Afro-Asiatic | 5.32 | 0.84 |
Haitian | French Creole | 3.37 | 1.33 |
Nahuatl | Uto-Aztecan | 6.71 | 0.67 |
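The labels of the two numeric columns are missing from this copy of the table, but the numbers themselves suggest a simple relationship: each value in the last column matches the Italian entry of the preceding column (4.48) divided by that language's entry, rounded to two decimals. The sketch below checks this reading on a few rows. It is an observation about the published figures, not a definition taken from the text, and the names `ITALIAN_VALUE` and `ratio_to_italian` are illustrative.

```python
# (language, third column, fourth column) for a sample of rows from the table.
rows = [
    ("Greek", 4.86, 0.92),
    ("Latin", 5.16, 0.87),
    ("English", 4.24, 1.06),
    ("Finnish", 5.90, 0.76),
    ("Basque", 6.22, 0.72),
    ("Haitian", 3.37, 1.33),
    ("Nahuatl", 6.71, 0.67),
]

# Italian entry of the third column, used as the reference value (assumption).
ITALIAN_VALUE = 4.48


def ratio_to_italian(value):
    """Ratio of the Italian reference value to a language's own value."""
    return ITALIAN_VALUE / value


# A published figure rounded to two decimals can differ by at most 0.005.
mismatches = [
    name
    for name, value, published in rows
    if abs(ratio_to_italian(value) - published) > 0.005
]
print(mismatches)
```

An empty list means every sampled row is consistent with the ratio reading; Italian itself gives exactly 1.00.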
Literary Work | |||||
---|---|---|---|---|---|
Matthew King James translation (1611) | 4.27 | 23.51 | 5.91 | 55.14 | 55.86 |
Robinson Crusoe (D. Defoe, 1719) | 3.94 | 57.75 | 7.12 | 50.84 | 42.22 |
Pride and Prejudice (J. Austen, 1813) | 4.40 | 24.86 | 7.16 | 52.79 | 43.89 |
Wuthering Heights (E. Brontë, 1845–1846) | 4.27 | 25.82 | 5.97 | 53.65 | 53.89 |
Vanity Fair (W. Thackeray, 1847–1848) | 4.63 | 25.74 | 6.73 | 49.75 | 44.10 |
David Copperfield (C. Dickens, 1849–1850) | 4.04 | 24.40 | 5.61 | 56.68 | 59.66 |
Moby Dick (H. Melville, 1851) | 4.52 | 31.18 | 6.45 | 49.11 | 45.66 |
The Mill on the Floss (G. Eliot, 1860) | 4.29 | 28.03 | 7.09 | 52.70 | 44.32 |
Alice’s Adventures in Wonderland (L. Carroll, 1865) | 3.96 | 30.92 | 5.79 | 56.14 | 57.76 |
Little Women (L.M. Alcott, 1868–1869) | 4.18 | 21.08 | 6.30 | 57.31 | 54.99 |
Treasure Island (R. L. Stevenson, 1881–1882) | 4.02 | 21.89 | 6.05 | 58.78 | 58.39 |
Adventures of Huckleberry Finn (M. Twain, 1884) | 3.85 | 24.89 | 6.63 | 59.01 | 54.14 |
Three Men in a Boat (J.K. Jerome, 1889) | 4.25 | 13.71 | 6.14 | 64.19 | 63.13 |
The Picture of Dorian Gray (O. Wilde, 1890) | 4.19 | 16.56 | 6.29 | 62.83 | 60.58 |
The Jungle Book (R. Kipling, 1894) | 4.11 | 21.52 | 7.15 | 57.95 | 49.14 |
The War of the Worlds (H.G. Wells, 1897) | 4.38 | 20.85 | 7.67 | 55.31 | 42.48 |
The Wonderful Wizard of Oz (L.F. Baum, 1900) | 4.02 | 20.55 | 7.63 | 59.38 | 46.85 |
The Hound of the Baskervilles (A.C. Doyle, 1901–1902) | 4.15 | 17.79 | 7.83 | 60.27 | 46.16 |
Peter Pan (J.M. Barrie, 1902) | 4.12 | 18.20 | 6.35 | 60.53 | 57.85 |
A Little Princess (F.H. Burnett, 1902–1905) | 4.18 | 16.38 | 6.80 | 61.57 | 55.45 |
Martin Eden (J. London, 1908–1909) | 4.32 | 16.94 | 6.76 | 59.38 | 53.50 |
Women in Love (D.H. Lawrence, 1920) | 4.26 | 13.71 | 5.22 | 63.98 | 70.02 |
The Secret Adversary (A. Christie, 1922) | 4.28 | 11.02 | 5.52 | 69.08 | 72.76 |
The Sun Also Rises (E. Hemingway, 1926) | 3.92 | 10.70 | 6.02 | 72.58 | 72.45 |
A Farewell to Arms (E. Hemingway, 1929) | 3.94 | 10.12 | 6.80 | 73.17 | 66.99 |
Of Mice and Men (J. Steinbeck, 1937) | 4.02 | 9.67 | 5.61 | 74.20 | 77.24 |
Novel | |||||
---|---|---|---|---|---|
Anonymous (I Fioretti di San Francesco, XIV Century) | 4.65 | 37.70 | 8.24 | 50.70 | 37.26 |
Boccaccio Giovanni (Decameron, XIV) | 4.48 | 44.27 | 7.79 | 51.18 | 40.44 |
Buzzati Dino (Il deserto dei tartari, XX) | 5.10 | 17.75 | 6.63 | 55.27 | 51.49 |
Calvino Italo (Marcovaldo, XX) | 4.74 | 17.60 | 6.59 | 59.19 | 55.65 |
Cassola Carlo (La ragazza di Bube, XX) | 4.48 | 11.93 | 5.64 | 69.84 | 72.00 |
Collodi Carlo (Pinocchio, XIX) | 4.60 | 16.92 | 6.19 | 61.57 | 60.43 |
Deledda Grazia (Canne al vento, XX) | 4.51 | 15.08 | 6.06 | 64.39 | 64.03 |
D’Annunzio Gabriele (Le novelle della Pescara, XX) | 4.91 | 17.99 | 6.38 | 58.16 | 55.88 |
Eco Umberto (Il nome della rosa, XX) | 4.81 | 21.08 | 7.46 | 55.78 | 47.02 |
Fogazzaro (Piccolo mondo antico, XIX-XX) | 4.79 | 16.08 | 6.10 | 61.46 | 60.86 |
Gadda (Quer pasticciaccio brutto… XX) | 4.76 | 18.43 | 4.98 | 58.24 | 64.36 |
Machiavelli Niccolò (Il principe, XV-XVI) | 4.71 | 40.17 | 6.45 | 49.54 | 46.84 |
Manzoni Alessandro (Fermo e Lucia, XIX) | 4.75 | 30.98 | 7.17 | 51.72 | 44.70 |
Manzoni Alessandro (I promessi sposi, XIX) | 4.60 | 24.83 | 5.30 | 56.00 | 60.20 |
Moravia Alberto (La ciociara, XX) | 4.56 | 29.93 | 7.28 | 53.52 | 45.84 |
Pavese Cesare (La luna e i falò, XX) | 4.47 | 17.83 | 6.83 | 61.90 | 56.92 |
Pirandello Luigi (Il fu Mattia Pascal, XX) | 4.63 | 14.57 | 4.94 | 63.94 | 70.30 |
Svevo Italo (Senilità, XX) | 4.86 | 16.04 | 7.75 | 59.39 | 48.89 |
Tomasi di Lampedusa (Il gattopardo, XX) | 4.99 | 26.42 | 7.90 | 50.72 | 39.32 |
Verga (I Malavoglia, XIX-XX) | 4.46 | 20.45 | 6.82 | 59.34 | 54.42 |
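The column labels of the Italian novels table are also missing here, yet its last three numeric columns are linked by a linear correction: in every row, the fifth value equals the fourth minus 6 times the amount by which the third exceeds 6. The sketch below verifies this on a sample of rows. The function name `adjusted` and the reading of the coefficients are inferred from the figures alone and should be treated as an assumption rather than as the formula stated in the article.

```python
# (novel, third numeric column, fourth, fifth) for a sample of rows.
rows = [
    ("I Fioretti di San Francesco", 8.24, 50.70, 37.26),
    ("Decameron", 7.79, 51.18, 40.44),
    ("La ragazza di Bube", 5.64, 69.84, 72.00),
    ("Pinocchio", 6.19, 61.57, 60.43),
    ("Il nome della rosa", 7.46, 55.78, 47.02),
    ("I promessi sposi", 5.30, 56.00, 60.20),
    ("Il fu Mattia Pascal", 4.94, 63.94, 70.30),
    ("Il gattopardo", 7.90, 50.72, 39.32),
]


def adjusted(fourth, third):
    """Fourth-column value corrected by 6 points per unit of (third - 6)."""
    # The constants 6 and 6 are inferred from the table (assumption).
    return fourth - 6.0 * (third - 6.0)


# Allow for two-decimal rounding in the published figures.
mismatches = [
    name
    for name, third, fourth, fifth in rows
    if abs(adjusted(fourth, third) - fifth) > 0.005
]
print(mismatches)
```

Rows whose third value is below 6 gain points in the last column, and rows above 6 lose them, which is why the last two columns can rank the same novels differently.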
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Matricciani, E. Readability Indices Do Not Say It All on a Text Readability. Analytics 2023, 2, 296-314. https://doi.org/10.3390/analytics2020016