Open Access Article
Soc. Sci. 2015, 4(3), 758-799; doi:10.3390/socsci4030758

Hierarchical and Non-Hierarchical Linear and Non-Linear Clustering Methods to “Shakespeare Authorship Question”

School of English Literature, Language and Linguistics, University of Newcastle, Newcastle upon Tyne, Tyne and Wear NE1 7RU, UK
Academic Editor: Martin J. Bull
Received: 1 July 2015 / Revised: 12 August 2015 / Accepted: 6 September 2015 / Published: 17 September 2015

Abstract

A few literary scholars have long claimed that Shakespeare did not write some of his best plays (history plays and tragedies) and have proposed, at one time or another, various candidate authors. Most modern-day Shakespeare scholars reject this claim, arguing that there is strong evidence that Shakespeare wrote the plays and poems, since his name appears on them as the author. This has led to a long-running scholarly debate. Stylometry is a fast-growing field often used to attribute authorship to anonymous or disputed texts. Stylometric attempts to resolve this literary puzzle have raised interesting questions over the past few years. This paper contributes to “the Shakespeare authorship question” by using a mathematically based methodology to examine the hypothesis that Shakespeare wrote all the disputed plays traditionally attributed to him. More specifically, the methodology is based on Mean Proximity, as a linear hierarchical clustering method, and on Principal Components Analysis, as a non-hierarchical linear clustering method. It is also based, for the first time in the domain, on Self-Organizing Map U-Matrix and Voronoi Map, as non-linear clustering methods, to cover the possibility that the data contain significant non-linearities. The Vector Space Model (VSM) is used to convert texts into vectors in a high-dimensional space. The aim is to compare the degrees of similarity within and between limited samples of text (the disputed plays): the various works and plays assumed to have been written by Shakespeare and those of possible authors, notably Sir Francis Bacon, Christopher Marlowe, John Fletcher, and Thomas Kyd. Here “similarity” is defined in terms of a correlation/distance coefficient measure based on the frequency-of-usage profiles of function words, word bi-grams, and character triple-grams.
According to the stylistic criteria and analytic methodology used, the claim that Shakespeare authored all the disputed plays traditionally attributed to him is falsified in favor of the alternative authors. The result of this validated analysis is empirically based, objective, and supported by replicable evidence, which can be used in conjunction with existing arguments to resolve the question of whether or not Shakespeare of Stratford-upon-Avon wrote all the disputed plays traditionally attributed to him.
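
As a rough illustration of the vector-space step described in the abstract, the following Python sketch builds text-length-normalized function-word profiles and character triple-gram counts, then compares two profiles with a correlation-based distance. The word list, the function names, and the choice of Pearson correlation are assumptions made for illustration only; the abstract does not specify them, and this is not the author's implementation.

```python
from collections import Counter
import math
import re

# Hypothetical, very short function-word list used only for illustration;
# the article's actual feature set is not given in the abstract.
FUNCTION_WORDS = ["the", "and", "of", "to", "in", "that", "not", "with"]


def function_word_profile(text, vocabulary=FUNCTION_WORDS):
    """Return relative frequencies of the vocabulary words in `text`.

    Dividing by the token count is one simple form of text-length
    normalization, so texts of different sizes become comparable.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in vocabulary]


def char_trigrams(text):
    """Count overlapping character triple-grams after collapsing whitespace."""
    cleaned = re.sub(r"\s+", " ", text.lower())
    return Counter(cleaned[i:i + 3] for i in range(len(cleaned) - 2))


def pearson(u, v):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return cov / (norm_u * norm_v)


def correlation_distance(u, v):
    """Distance in [0, 2]: 0 for perfectly correlated profiles, 2 for opposite ones."""
    return 1.0 - pearson(u, v)
```

A pairwise matrix of such distances between texts is the kind of input that a hierarchical clustering method or a dendrogram would then operate on.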
Keywords: stylometry; text-length normalization; dimensionality-reduction; dendrogram; word bi-grams; character triple-grams; correlation matrix; centroid analysis; clustering tendency test; vector space
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

Aljumily, R. Hierarchical and Non-Hierarchical Linear and Non-Linear Clustering Methods to “Shakespeare Authorship Question”. Soc. Sci. 2015, 4, 758-799.

Soc. Sci. EISSN 2076-0760. Published by MDPI AG, Basel, Switzerland.