Abstract
We present a novel, count-based approach to obtaining inter-lingual word representations based on inverted indexing of Wikipedia. We apply these representations to 17 datasets in document classification, POS tagging, dependency parsing, and word alignment. Our approach is simple, computationally efficient, and almost parameter-free; more importantly, it enables multi-source cross-lingual learning. In 14 of the 17 cases, we improve over state-of-the-art bilingual embeddings.
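To make the idea of a count-based inverted index more concrete, here is a minimal sketch. It is not the authors' implementation. It assumes that Wikipedia articles on the same concept are linked across languages and that a word is represented by how often it occurs in the articles for each concept. The toy data in `ARTICLES` and the helper names `build_inverted_index` and `cosine` are hypothetical, and any further processing the paper applies to the counts is left out.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy data: concept id -> {language code: article text}.
# Real input would be Wikipedia articles connected by inter-language links.
ARTICLES = {
    "dog": {"en": "the dog barks at the cat", "de": "der hund bellt die katze an"},
    "cat": {"en": "the cat sleeps", "de": "die katze schläft"},
    "house": {"en": "the house is big", "de": "das haus ist groß"},
}


def build_inverted_index(articles):
    """Map each (language, word) pair to a Counter of concept id -> count."""
    index = defaultdict(Counter)
    for concept, versions in articles.items():
        for lang, text in versions.items():
            for token in text.lower().split():
                index[(lang, token)][concept] += 1
    return index


def cosine(u, v):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


index = build_inverted_index(ARTICLES)
sim_related = cosine(index[("en", "cat")], index[("de", "katze")])
sim_unrelated = cosine(index[("en", "house")], index[("de", "katze")])
```

Because the vector dimensions are concepts rather than language-specific contexts, words from any number of languages end up in the same space. In this toy example, `sim_related` is higher than `sim_unrelated`, since "cat" and "katze" occur in articles on the same concepts.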
Original language | English |
---|---|
Title of host publication | The 53rd Annual Meeting of the Association for Computational Linguistics (ACL) |
Number of pages | 10 |
Volume | 1 |
Publisher | Association for Computational Linguistics |
Publication date | 2015 |
Pages | 1713-1722 |
ISBN (Print) | 978-1-941643-72-3 |
Publication status | Published - 2015 |