Automatic Dating of Medieval Charters from Denmark

Sidsel Boldsen, Patrizia Paggio

1 citation (Scopus)

Abstract

Dating medieval text sources is a central task in manuscript studies. It is a difficult process that requires expert philological and historical knowledge. We investigate the automatic dating of a collection of about 300 charters from medieval Denmark, in particular how n-gram models based on different transcription levels of the charters can be used to assign the manuscripts to a specific temporal interval. We frame the problem as a classification task by dividing the period into bins of 50 years and using these as classes in a supervised learning setting to develop SVM classifiers. We show that the more detailed facsimile transcription, which captures palaeographic characteristics of a text, provides better results than the diplomatic level, where such distinctions are normalised. Furthermore, both character and word n-grams show promising results, with the highest accuracy reaching 74.96%. This level of classification accuracy corresponds to being able to date almost 75% of the charters with a 25-year error margin, which philologists use as a standard for the precision with which medieval texts can be dated manually.
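The link between 50-year bins and a 25-year error margin, and the two feature types named in the abstract, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the alignment of bins to multiples of 50 is an assumption, and the SVM training step is left out. It only shows that a year placed in the correct 50-year bin lies at most 25 years from the bin midpoint, and how character and word n-grams might be counted.

```python
from collections import Counter

BIN_WIDTH = 50  # years per class, as stated in the abstract


def year_to_bin(year, width=BIN_WIDTH):
    """Return the half-open interval [start, start + width) containing `year`.

    Assumption: bins are aligned to multiples of `width` (e.g. 1400-1450).
    The abstract does not say where the bin boundaries fall.
    """
    start = (year // width) * width
    return (start, start + width)


def bin_midpoint(interval):
    """Midpoint of a bin, usable as a point estimate for the date."""
    start, end = interval
    return (start + end) / 2


def char_ngrams(text, n):
    """Count overlapping character n-grams in `text`."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_ngrams(text, n):
    """Count overlapping word n-grams, splitting on whitespace."""
    tokens = text.split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
```

For example, `year_to_bin(1425)` gives `(1400, 1450)`, and any year in that bin is within 25 years of the midpoint 1425. In a full pipeline, the n-gram counts would be turned into feature vectors and passed to an SVM classifier with the bins as class labels.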

Original language: English
Journal: CEUR Workshop Proceedings
Pages (from-to): 58-72
ISSN: 1613-0073
Status: Published - 17 May 2019
Event: Digital Humanities in the Nordic Countries - University of Copenhagen, Copenhagen, Denmark
Duration: 5 Mar 2019 to 8 Mar 2019
https://cst.dk/DHN2019/DHN2019.html

Conference

Conference: Digital Humanities in the Nordic Countries
Location: University of Copenhagen
Country/Territory: Denmark
City: Copenhagen
Period: 05/03/2019 to 08/03/2019
