A Discriminative Latent-Variable Model for Bilingual Lexicon Induction

Sebastian Ruder; Ryan Cotterell; Yova Radoslavova Kementchedjhieva; Anders Søgaard

Abstract

We introduce a novel discriminative latent-variable model for the task of bilingual lexicon induction. Our model combines the bipartite matching dictionary prior of Haghighi et al. (2008) with a state-of-the-art embedding-based approach. To train the model, we derive an efficient Viterbi EM algorithm. We provide empirical improvements on six language pairs under two metrics and show that the prior theoretically and empirically helps to mitigate the hubness problem. We also demonstrate how previous work may be viewed as a similarly fashioned latent-variable model, albeit with a different prior.

Original language	English
Title	Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Publisher	Association for Computational Linguistics
Publication date	2018
Pages	458–468
Status	Published - 2018
Event	2018 Conference on Empirical Methods in Natural Language Processing - Brussels, Belgium. Duration: 31 Oct 2018 → 4 Nov 2018

Conference

Conference	2018 Conference on Empirical Methods in Natural Language Processing
Country/Territory	Belgium
City	Brussels
Period	31/10/2018 → 04/11/2018

Citation formats

A Discriminative Latent-Variable Model for Bilingual Lexicon Induction. / Ruder, Sebastian ; Cotterell, Ryan ; Kementchedjhieva, Yova Radoslavova et al.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2018. pp. 458–468.

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed

Ruder, S, Cotterell, R, Kementchedjhieva, YR & Søgaard, A 2018, A Discriminative Latent-Variable Model for Bilingual Lexicon Induction. in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, pp. 458–468, 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31/10/2018.

@inproceedings{85a6379607ad4c6f8f0ac621555a72a4,

title = "A Discriminative Latent-Variable Model for Bilingual Lexicon Induction",

abstract = "We introduce a novel discriminative latent-variable model for the task of bilingual lexicon induction. Our model combines the bipartite matching dictionary prior of Haghighi et al. (2008) with a state-of-the-art embedding-based approach. To train the model, we derive an efficient Viterbi EM algorithm. We provide empirical improvements on six language pairs under two metrics and show that the prior theoretically and empirically helps to mitigate the hubness problem. We also demonstrate how previous work may be viewed as a similarly fashioned latent-variable model, albeit with a different prior.",

author = "Sebastian Ruder and Ryan Cotterell and Kementchedjhieva, {Yova Radoslavova} and Anders S{\o}gaard",

year = "2018",

language = "English",

pages = "458–468",

booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",

publisher = "Association for Computational Linguistics",

note = "2018 Conference on Empirical Methods in Natural Language Processing ; Conference date: 31-10-2018 Through 04-11-2018",

}

TY - GEN

T1 - A Discriminative Latent-Variable Model for Bilingual Lexicon Induction

AU - Ruder, Sebastian

AU - Cotterell, Ryan

AU - Kementchedjhieva, Yova Radoslavova

AU - Søgaard, Anders

PY - 2018

Y1 - 2018

N2 - We introduce a novel discriminative latent-variable model for the task of bilingual lexicon induction. Our model combines the bipartite matching dictionary prior of Haghighi et al. (2008) with a state-of-the-art embedding-based approach. To train the model, we derive an efficient Viterbi EM algorithm. We provide empirical improvements on six language pairs under two metrics and show that the prior theoretically and empirically helps to mitigate the hubness problem. We also demonstrate how previous work may be viewed as a similarly fashioned latent-variable model, albeit with a different prior.

AB - We introduce a novel discriminative latent-variable model for the task of bilingual lexicon induction. Our model combines the bipartite matching dictionary prior of Haghighi et al. (2008) with a state-of-the-art embedding-based approach. To train the model, we derive an efficient Viterbi EM algorithm. We provide empirical improvements on six language pairs under two metrics and show that the prior theoretically and empirically helps to mitigate the hubness problem. We also demonstrate how previous work may be viewed as a similarly fashioned latent-variable model, albeit with a different prior.

M3 - Article in proceedings

SP - 458

EP - 468

BT - Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

PB - Association for Computational Linguistics

T2 - 2018 Conference on Empirical Methods in Natural Language Processing

Y2 - 31 October 2018 through 4 November 2018

ER -
