Abstract
Recent work has shown that visual context improves cross-lingual sense disambiguation for nouns. We extend this line of work to the more challenging task of cross-lingual verb sense disambiguation, introducing the MultiSense dataset of 9,504 images annotated with English, German, and Spanish verbs. Each image in MultiSense is annotated with an English verb and its translation in German or Spanish. We show that cross-lingual verb sense disambiguation models benefit from visual context, compared to unimodal baselines. We also show that the verb sense predicted by our best disambiguation model can improve the results of a text-only machine translation system when used for a multimodal translation task.
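As a rough illustration of the annotation structure described in the abstract, the sketch below shows one way a single MultiSense record could be represented in Python. The class and field names (`MultiSenseRecord`, `image_path`, `english_verb`, `german_verb`, `spanish_verb`) and the example values are illustrative assumptions, not the dataset's actual schema or contents.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MultiSenseRecord:
    """Hypothetical container for one MultiSense annotation:
    an image paired with an English verb and its translation
    in German or Spanish (field names are assumptions)."""
    image_path: str                      # path to the annotated image
    english_verb: str                    # English verb label for the image
    german_verb: Optional[str] = None    # German translation, if annotated
    spanish_verb: Optional[str] = None   # Spanish translation, if annotated

# Illustrative example only: English "play" translated as German "spielen".
record = MultiSenseRecord(
    image_path="images/000001.jpg",
    english_verb="play",
    german_verb="spielen",
)
print(record)
```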
| Original language | Undefined/Unknown |
| --- | --- |
| Title | Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) |
| Number of pages | 7 |
| Place of publication | Minneapolis, Minnesota |
| Publisher | Association for Computational Linguistics (ACL) |
| Publication date | 1 Jun 2019 |
| Pages | 1998-2004 |
| DOI | |
| Status | Published - 1 Jun 2019 |