Multiple regression techniques for modeling dates of first performances of Shakespeare-era plays

by Pablo Moscato, et al.

The date of the first performance of a play of Shakespeare's time must usually be guessed with reference to multiple indirect external sources, or to some aspect of the content or style of the play. Identifying these dates is important to literary history and to accounts of developing authorial styles, such as Shakespeare's. In this study, we took a set of Shakespeare-era plays (181 plays from the period 1585–1610), added the best-guess dates for them from a standard reference work as metadata, and calculated a set of probabilities of individual words in these samples. We applied 11 regression methods to predict the dates of the plays using an 80/20 training/test split. We withdrew one play at a time, used the best-guess date metadata with the probabilities and weightings to infer its date, and thus built a model of the interaction between dates and word probabilities. We introduced a memetic algorithm-based Continued Fraction Regression (CFR), which delivered models using a small number of variables, leading to an interpretable model and reduced dimensionality. An in-depth analysis of the 20 words that occur most commonly in the CFR models across 100 independent runs helps explain the trends in linguistic and stylistic terms. The analysis with this subset of words revealed a correlation between signature words and the genre of a Shakespeare-era play.
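To make the idea of a continued fraction model with few variables more concrete, the following is a minimal sketch of how a truncated continued fraction could be evaluated for one play. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the functional form, so the nested form with linear terms, the function name `eval_continued_fraction`, and the coefficient layout are all hypothetical, and the memetic search that would fit the coefficients is not shown.

```python
import numpy as np


def eval_continued_fraction(x, g_coefs, h_coefs):
    """Evaluate f(x) = g0(x) + h0(x) / (g1(x) + h1(x) / (... + g_d(x))).

    Assumed form (not taken from the source): every g_i and h_i is linear,
    with coefficients stored as [intercept, w_1, ..., w_n].
    x holds the word probabilities of a single play.
    Expects len(h_coefs) == len(g_coefs) - 1.
    """
    x = np.asarray(x, dtype=float)

    def linear(coefs):
        c = np.asarray(coefs, dtype=float)
        return c[0] + c[1:] @ x

    # Start at the deepest term and fold outwards.
    value = linear(g_coefs[-1])
    for g, h in zip(reversed(g_coefs[:-1]), reversed(h_coefs)):
        # No guard against a zero denominator; a real fit would need one.
        value = linear(g) + linear(h) / value
    return value


# Toy example with a single hypothetical word feature (made-up numbers).
x = [2.0]
g_coefs = [[1600.0, 0.0], [1.0, 1.0]]  # g0(x) = 1600, g1(x) = 1 + x
h_coefs = [[0.0, 3.0]]                 # h0(x) = 3x
predicted_year = eval_continued_fraction(x, g_coefs, h_coefs)
```

In a form like this, a weight of zero removes a word from the model entirely, which is one way a model could end up depending on only a small number of variables.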




