Any-language frame semantic parsing

Anders Trærup Johannsen; Hector Martinez Alonso; Anders Søgaard

CLOSED: Centre for Language Technology

8 Citations (Scopus)

Abstract

We present a multilingual corpus of Wikipedia and Twitter texts annotated with FRAMENET 1.5 semantic frames in nine different languages, as well as a novel technique for weakly supervised cross-lingual frame-semantic parsing. Our approach only assumes the existence of linked, comparable source and target language corpora (e.g., Wikipedia) and a bilingual dictionary (e.g., Wiktionary or BABELNET). Our approach uses a truly interlingual representation, enabling us to use the same model across all nine languages. We present average error reductions over running a state-of-the-art parser on word-to-word translations of 46% for target identification, 37% for frame identification, and 14% for argument identification.
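The abstract reports results as error reductions relative to a baseline that runs a parser on word-to-word translations. As a rough illustration only, the sketch below shows one way such a baseline input could be built from a bilingual dictionary, and how a relative error reduction is commonly computed. It assumes the standard definition (error = 1 - score); the function names, the toy Danish-English dictionary, and the scores are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def word_to_word_translate(tokens: List[str], dictionary: Dict[str, List[str]]) -> List[str]:
    """Replace each token with its first dictionary translation.

    Tokens missing from the dictionary are kept unchanged (an assumption
    made for this sketch, not a detail confirmed by the paper).
    """
    return [dictionary[t][0] if dictionary.get(t) else t for t in tokens]


def relative_error_reduction(baseline_score: float, system_score: float) -> float:
    """Fraction of the baseline's error (1 - score) removed by the system."""
    baseline_error = 1.0 - baseline_score
    system_error = 1.0 - system_score
    if baseline_error <= 0.0:
        raise ValueError("baseline has no error left to reduce")
    return (baseline_error - system_error) / baseline_error


# Toy bilingual dictionary (hypothetical entries for illustration).
toy_dictionary = {"hun": ["she"], "køber": ["buys"], "en": ["a"], "bog": ["book"]}
translated = word_to_word_translate(["hun", "køber", "en", "bog", "idag"], toy_dictionary)

# Hypothetical scores: the baseline scores 0.50 and the system scores 0.75.
reduction = relative_error_reduction(0.50, 0.75)
```

Under this definition, a system that halves the baseline's remaining error has a relative error reduction of 0.5, whatever the absolute scores are.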

Original language	English
Title	Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
Number of pages	5
Place of publication	Lisbon, Portugal
Publisher	Association for Computational Linguistics
Publication date	2015
Pages	2062-2066
ISBN (Print)	978-1-941643-32-7
Status	Published - 2015

Access to document

https://aclweb.org/anthology/D/D15/D15-1245.pdf
License: Unspecified

Citation formats

Any-language frame semantic parsing. / Johannsen, Anders Trærup; Martinez Alonso, Hector; Søgaard, Anders.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Lisbon, Portugal: Association for Computational Linguistics, 2015. pp. 2062-2066.

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed

@inproceedings{f8abe8b6bd0643428e6fbcc895d3326a,

title = "Any-language frame semantic parsing",

abstract = "We present a multilingual corpus of Wikipedia and Twitter texts annotated with FRAMENET 1.5 semantic frames in nine different languages, as well as a novel technique for weakly supervised cross-lingual frame-semantic parsing. Our approach only assumes the existence of linked, comparable source and target language corpora (e.g., Wikipedia) and a bilingual dictionary (e.g., Wiktionary or BABELNET). Our approach uses a truly interlingual representation, enabling us to use the same model across all nine languages. We present average error reductions over running a state-of-the-art parser on word-to-word translations of 46% for target identification, 37% for frame identification, and 14% for argument identification.",

author = "Johannsen, {Anders Tr{\ae}rup} and {Martinez Alonso}, Hector and S{\o}gaard, Anders",

year = "2015",

language = "English",

isbn = "978-1-941643-32-7",

pages = "2062--2066",

booktitle = "Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing",

publisher = "Association for Computational Linguistics",

}

TY - GEN

T1 - Any-language frame semantic parsing

AU - Johannsen, Anders Trærup

AU - Martinez Alonso, Hector

AU - Søgaard, Anders

PY - 2015

Y1 - 2015

AB - We present a multilingual corpus of Wikipedia and Twitter texts annotated with FRAMENET 1.5 semantic frames in nine different languages, as well as a novel technique for weakly supervised cross-lingual frame-semantic parsing. Our approach only assumes the existence of linked, comparable source and target language corpora (e.g., Wikipedia) and a bilingual dictionary (e.g., Wiktionary or BABELNET). Our approach uses a truly interlingual representation, enabling us to use the same model across all nine languages. We present average error reductions over running a state-of-the-art parser on word-to-word translations of 46% for target identification, 37% for frame identification, and 14% for argument identification.

M3 - Article in proceedings

SN - 978-1-941643-32-7

SP - 2062

EP - 2066

BT - Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

PB - Association for Computational Linguistics

CY - Lisbon, Portugal

ER -
