Automatic Dating of Medieval Charters from Denmark

Sidsel Boldsen; Patrizia Paggio

Institut for Nordiske Studier og Sprogvidenskab

1 citation (Scopus)

Abstract

Dating of medieval text sources is a central task common to the field of manuscript studies. It is a difficult process requiring expert philological and historical knowledge. We investigate the issue of automatic dating of a collection of about 300 charters from medieval Denmark, in particular how n-gram models based on different transcription levels of the charters can be used to assign the manuscripts to a specific temporal interval. We frame the problem as a classification task by dividing the period into bins of 50 years and using these as classes in a supervised learning setting to develop SVM classifiers. We show that the more detailed facsimile transcription, which captures palaeographic characteristics of a text, provides better results than the diplomatic level, where such distinctions are normalised. Furthermore, both character and word n-grams show promising results, the highest accuracy reaching 74.96 %. This level of classification accuracy corresponds to being able to date almost 75 % of the charters with a 25-year error margin, which philologists use as a standard of the precision with which medieval texts can be dated manually.
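The link between 50-year classes and a 25-year error margin can be made concrete with a minimal sketch. It assumes the midpoint of the predicted bin is taken as the date estimate, so a correctly classified charter is off by at most half the bin width. The start year of 1200, the function names, and the trigram default are illustrative assumptions, not values from the paper; the SVM classifier itself is omitted.

```python
from collections import Counter

BIN_WIDTH = 50  # years per class, as stated in the abstract


def year_to_bin(year, start=1200, width=BIN_WIDTH):
    """Return (lower, upper) bounds of the bin containing `year`.

    `start` is a hypothetical origin for the binning; the abstract
    does not state where the first bin begins.
    """
    index = (year - start) // width
    lower = start + index * width
    return lower, lower + width


def bin_midpoint(bounds):
    """Midpoint of a bin, used here as the point estimate for the date."""
    lower, upper = bounds
    return (lower + upper) / 2


def char_ngrams(text, n=3):
    """Count overlapping character n-grams, a simple stand-in for the
    n-gram features mentioned in the abstract."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


if __name__ == "__main__":
    bounds = year_to_bin(1437)
    estimate = bin_midpoint(bounds)
    # The error of a midpoint estimate never exceeds half the bin width.
    print(bounds, estimate, abs(1437 - estimate) <= BIN_WIDTH / 2)
```

Under this reading, every charter placed in the correct bin is dated to within 25 years of its true date, which is why the reported accuracy can be compared with the precision of manual dating.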

Original language	English
Journal	CEUR Workshop Proceedings
Pages (from-to)	58-72
ISSN	1613-0073
Status	Published - 17 May 2019
Event	Digital Humanities in the Nordic Countries - University of Copenhagen, Copenhagen, Denmark. Duration: 5 Mar 2019 → 8 Mar 2019, https://cst.dk/DHN2019/DHN2019.html

Conference

Conference	Digital Humanities in the Nordic Countries
Location	University of Copenhagen
Country/Territory	Denmark
City	Copenhagen
Period	05/03/2019 → 08/03/2019
Internet address	https://cst.dk/DHN2019/DHN2019.html

Access to document

http://ceur-ws.org/Vol-2364/5_paper.pdf
License: CC BY-NC-ND

Citation formats

@inproceedings{6d6da84f79e24c87a3e15eaa28b68174,

title = "Automatic Dating of Medieval Charters from Denmark",

abstract = "Dating of medieval text sources is a central task common to the field of manuscript studies. It is a difficult process requiring expert philological and historical knowledge. We investigate the issue of automatic dating of a collection of about 300 charters from medieval Denmark, in particular how n-gram models based on different transcription levels of the charters can be used to assign the manuscripts to a specific temporal interval. We frame the problem as a classification task by dividing the period into bins of 50 years and using these as classes in a supervised learning setting to develop SVM classifiers. We show that the more detailed facsimile transcription, which captures palaeographic characteristics of a text, provides better results than the diplomatic level, where such distinctions are normalised. Furthermore, both character and word n-grams show promising results, the highest accuracy reaching 74.96 \%. This level of classification accuracy corresponds to being able to date almost 75 \% of the charters with a 25-year error margin, which philologists use as a standard of the precision with which medieval texts can be dated manually.",

author = "Sidsel Boldsen and Patrizia Paggio",

year = "2019",

month = may,

day = "17",

language = "English",

pages = "58--72",

journal = "CEUR Workshop Proceedings",

issn = "1613-0073",

publisher = "CEUR Workshop Proceedings",

note = "Digital Humanities in the Nordic Countries, DHN 2019 ; Conference date: 05-03-2019 Through 08-03-2019",

url = "https://cst.dk/DHN2019/DHN2019.html",

}

TY - GEN

T1 - Automatic Dating of Medieval Charters from Denmark

AU - Boldsen, Sidsel

AU - Paggio, Patrizia

PY - 2019/5/17

Y1 - 2019/5/17

N2 - Dating of medieval text sources is a central task common to the field of manuscript studies. It is a difficult process requiring expert philological and historical knowledge. We investigate the issue of automatic dating of a collection of about 300 charters from medieval Denmark, in particular how n-gram models based on different transcription levels of the charters can be used to assign the manuscripts to a specific temporal interval. We frame the problem as a classification task by dividing the period into bins of 50 years and using these as classes in a supervised learning setting to develop SVM classifiers. We show that the more detailed facsimile transcription, which captures palaeographic characteristics of a text, provides better results than the diplomatic level, where such distinctions are normalised. Furthermore, both character and word n-grams show promising results, the highest accuracy reaching 74.96 %. This level of classification accuracy corresponds to being able to date almost 75 % of the charters with a 25-year error margin, which philologists use as a standard of the precision with which medieval texts can be dated manually.

AB - Dating of medieval text sources is a central task common to the field of manuscript studies. It is a difficult process requiring expert philological and historical knowledge. We investigate the issue of automatic dating of a collection of about 300 charters from medieval Denmark, in particular how n-gram models based on different transcription levels of the charters can be used to assign the manuscripts to a specific temporal interval. We frame the problem as a classification task by dividing the period into bins of 50 years and using these as classes in a supervised learning setting to develop SVM classifiers. We show that the more detailed facsimile transcription, which captures palaeographic characteristics of a text, provides better results than the diplomatic level, where such distinctions are normalised. Furthermore, both character and word n-grams show promising results, the highest accuracy reaching 74.96 %. This level of classification accuracy corresponds to being able to date almost 75 % of the charters with a 25-year error margin, which philologists use as a standard of the precision with which medieval texts can be dated manually.

M3 - Conference article

SN - 1613-0073

SP - 58

EP - 72

JO - CEUR Workshop Proceedings

JF - CEUR Workshop Proceedings

T2 - Digital Humanities in the Nordic Countries

Y2 - 5 March 2019 through 8 March 2019

ER -
