A Discriminative Latent-Variable Model for Bilingual Lexicon Induction

Sebastian Ruder, Ryan Cotterell, Yova Radoslavova Kementchedjhieva, Anders Søgaard

    Abstract

    We introduce a novel discriminative latent-variable model for the task of bilingual lexicon induction. Our model combines the bipartite matching dictionary prior of Haghighi et al. (2008) with a state-of-the-art embedding-based approach. To train the model, we derive an efficient Viterbi EM algorithm. We provide empirical improvements on six language pairs under two metrics and show that the prior theoretically and empirically helps to mitigate the hubness problem. We also demonstrate how previous work may be viewed as a similarly fashioned latent-variable model, albeit with a different prior.
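    To illustrate the kind of training procedure the abstract describes, below is a minimal, hypothetical sketch of Viterbi (hard) EM with a one-to-one bipartite matching step over embedding similarities. It is not the authors' implementation: the cosine similarity, the orthogonal Procrustes refit of the mapping, and all function names (e.g. viterbi_em_bli) are assumptions chosen for illustration, and the matching is solved with the Hungarian algorithm via SciPy.

```python
# Illustrative sketch only: hard (Viterbi) EM for bilingual lexicon induction
# with a bipartite matching constraint. Assumes length-normalized embeddings,
# cosine similarity, and an orthogonal map refit by Procrustes; these are
# common choices in embedding-based BLI, not necessarily the paper's exact ones.
import numpy as np
from scipy.optimize import linear_sum_assignment


def normalize(X):
    """Length-normalize rows so dot products equal cosine similarities."""
    return X / np.linalg.norm(X, axis=1, keepdims=True)


def viterbi_em_bli(src_emb, trg_emb, n_iters=10):
    """Alternate between (E) a one-to-one matching of source and target words
    under the current mapping and (M) refitting the mapping on matched pairs."""
    X, Y = normalize(src_emb), normalize(trg_emb)
    W = np.eye(X.shape[1])                        # initial linear mapping
    for _ in range(n_iters):
        # E-step (Viterbi): hard bipartite matching of source to target words.
        sim = X @ W @ Y.T                         # (n_src, n_trg) similarities
        rows, cols = linear_sum_assignment(-sim)  # maximize total similarity
        # M-step: orthogonal Procrustes fit on the matched word pairs.
        U, _, Vt = np.linalg.svd(X[rows].T @ Y[cols])
        W = U @ Vt
    return W, list(zip(rows, cols))               # mapping and induced lexicon
```

    The one-to-one matching step is where a bipartite prior can help with hubness: because each target word may be matched to at most one source word, no single target embedding can be retrieved as the translation of many different sources.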

    Original language: English
    Title of host publication: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
    Publisher: Association for Computational Linguistics
    Publication date: 2018
    Pages: 458–468
    Publication status: Published - 2018
    Event: 2018 Conference on Empirical Methods in Natural Language Processing - Brussels, Belgium
    Duration: 31 Oct 2018 – 4 Nov 2018

    Conference

    Conference: 2018 Conference on Empirical Methods in Natural Language Processing
    Country/Territory: Belgium
    City: Brussels
    Period: 31/10/2018 – 04/11/2018
