Contrastive Attention Networks for Attribution of Early Modern Print

06/12/2023
by Nikolai Vogler, et al.

In this paper, we develop machine learning techniques to identify unknown printers in early modern (c. 1500–1800) English printed books. Specifically, we focus on matching uniquely damaged character type-imprints in anonymously printed books to works with known printers in order to provide evidence of their origins. Until now, this work has been limited to manual investigations by analytical bibliographers. We present a Contrastive Attention-based Metric Learning approach to identify similar damage across character image pairs, which is sensitive to very subtle differences in glyph shapes, yet robust to various confounding sources of noise associated with digitized historical books. To overcome the scarcity of supervised data, we design a random data synthesis procedure that aims to simulate bends, fractures, and inking variations induced by the early printing process. Our method successfully improves downstream damaged type-imprint matching among printed works from this period, as validated by human domain experts. The results of our approach on two important philosophical works from the early modern period demonstrate its potential to extend the extant historical research about the origins and content of these books.
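As a rough illustration of the metric-learning idea in the abstract, the sketch below computes the classic pairwise contrastive loss over embeddings of character image pairs. It is not the paper's model or loss: the attention-based encoder is omitted, and the function name `contrastive_pair_loss`, the `margin` parameter, and the embedding shapes are all assumptions made for this example.

```python
import numpy as np

def contrastive_pair_loss(emb_a, emb_b, same_damage, margin=1.0):
    """Classic pairwise contrastive loss (illustrative, not the paper's loss).

    emb_a, emb_b: arrays of shape (n_pairs, dim), hypothetical embeddings
                  of two character images per pair.
    same_damage:  array of shape (n_pairs,), 1 if both images are assumed
                  to show the same damaged type piece, else 0.
    """
    emb_a = np.asarray(emb_a, dtype=float)
    emb_b = np.asarray(emb_b, dtype=float)
    same = np.asarray(same_damage, dtype=float)

    # Euclidean distance between the two embeddings of each pair.
    dist = np.linalg.norm(emb_a - emb_b, axis=1)

    # Matching pairs are pulled together; non-matching pairs are pushed
    # apart until they are at least `margin` away from each other.
    pull = same * dist ** 2
    push = (1.0 - same) * np.maximum(0.0, margin - dist) ** 2
    return float(np.mean(pull + push))
```

Under this formulation, a matching pair with identical embeddings and a non-matching pair farther apart than the margin both contribute zero loss, so only pairs that violate the intended geometry drive training.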


