LAIR: A Language for Automated Semantics-Aware Text Sanitization based on Frame Semantics

Steffen Hedegaard, Søren Houen, Jakob Grue Simonsen

Abstract

We present LAIR: a domain-specific language that enables users to specify actions to be taken upon encountering specific semantic frames in a text, in particular to rephrase and redact the textual content. While LAIR presupposes superficial knowledge of frames and frame semantics, it requires only limited prior programming experience. It contains neither scripting nor I/O primitives, has no general loop constructs, and is not Turing-complete. We have implemented a LAIR compiler and integrated it into a pipeline for automated redaction of web pages. We detail our experience with automated redaction of web pages for subjectively undesirable content; initial experiments suggest that using a small language based on semantic recognition of undesirable terms can be highly useful as a supplement to traditional methods of text sanitization.
Original language: English
Title of host publication: Proceedings of the 3rd IEEE International Conference on Semantic Computing (ICSC 2009)
Number of pages: 6
Publisher: IEEE Computer Society Press
Publication date: 2009
Pages: 47-52
ISBN (Print): 978-0-7695-3800-6
Publication status: Published - 2009
Event: IEEE International Conference on Semantic Computing - Berkeley, United States
Duration: 14 Sept 2009 to 16 Sept 2009
Conference number: 3

Conference

Conference: IEEE International Conference on Semantic Computing
Number: 3
Country/Territory: United States
City: Berkeley
Period: 14/09/2009 to 16/09/2009
