Inverted indexing for cross-lingual NLP

Anders Søgaard, Zeljko Agic, Hector Martinez Alonso, Barbara Plank, Bernd Bohnet

48 Citations (Scopus)

Abstract

We present a novel, count-based approach to obtaining inter-lingual word representations based on inverted indexing of Wikipedia. We present experiments applying these representations to 17 datasets in document classification, POS tagging, dependency parsing, and word alignment. Our approach is simple, computationally efficient, and almost parameter-free and, more importantly, it enables multi-source cross-lingual learning. In 14 of the 17 cases, we improve over state-of-the-art bilingual embeddings.
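The idea of a count-based representation built from an inverted index can be illustrated with a small sketch. It assumes that articles in different languages are already grouped under shared, language-independent concept IDs, and it describes each word by how often it occurs under each concept. The toy corpus, the concept IDs, and the function names below are hypothetical and only illustrate the general technique; this is not the authors' implementation, and any further processing steps used in the paper are not shown.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy data: each concept ID groups the texts of articles
# about the same topic in several languages.
concept_docs = {
    "C1": {"en": "the dog barks", "de": "der hund bellt"},
    "C2": {"en": "the cat sleeps", "de": "die katze schläft"},
    "C3": {"en": "the dog and the cat", "de": "der hund und die katze"},
}


def build_inverted_index(docs):
    """Map each (language, word) pair to counts over concept IDs."""
    index = defaultdict(Counter)
    for concept_id, texts_by_lang in docs.items():
        for lang, text in texts_by_lang.items():
            for token in text.lower().split():
                index[(lang, token)][concept_id] += 1
    return index


def to_vector(index, key, concept_ids):
    """Return the count vector of a word over a fixed concept order."""
    counts = index.get(key, Counter())
    return [counts[c] for c in concept_ids]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


index = build_inverted_index(concept_docs)
concept_ids = sorted(concept_docs)

dog = to_vector(index, ("en", "dog"), concept_ids)
hund = to_vector(index, ("de", "hund"), concept_ids)
katze = to_vector(index, ("de", "katze"), concept_ids)

# Words from different languages share one vector space because the
# dimensions are concepts, not language-specific contexts.
print(cosine(dog, hund), cosine(dog, katze))
```

Because the dimensions are shared concepts, words from any number of languages land in the same space, which is what makes a representation of this kind usable with several source languages at once.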

Original language: English
Title of host publication: The 53rd Annual Meeting of the Association for Computational Linguistics (ACL)
Number of pages: 10
Volume: 1
Publisher: Association for Computational Linguistics
Publication date: 2015
Pages: 1713-1722
ISBN (Print): 978-1-941643-72-3
Publication status: Published - 2015
