Abstract
The best systems at the SemEval-16 and SemEval-17 community question answering shared tasks (a task that amounts to question relevancy ranking) involve complex pipelines and manual feature engineering. Despite this, many of them still fail to beat the IR baseline, i.e., the rankings provided by Google's search engine. We present a strong baseline for question relevancy ranking by training a simple multi-task feed-forward network on a bag of 14 distance measures for the input question pair. This baseline model, which is fast to train and uses only language-independent features, outperforms the best shared task systems on the task of retrieving relevant previously asked questions.
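To make the idea of a "bag of distance measures for the input question pair" concrete, the following is a minimal sketch that turns a question pair into a small vector of distances. The abstract does not name the 14 measures, so the three used here (Jaccard distance, cosine distance over token counts, and a normalized length difference) are illustrative assumptions, as are the whitespace tokenizer and all function names. In a setup like the one described, a vector of this kind would serve as the input to a feed-forward network.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive, assumed tokenizer)."""
    return text.lower().split()


def jaccard_distance(a, b):
    """1 minus the ratio of shared token types to all token types."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


def cosine_distance(a, b):
    """1 minus the cosine similarity of the two token-count vectors."""
    count_a, count_b = Counter(a), Counter(b)
    dot = sum(count_a[t] * count_b[t] for t in count_a)
    norm_a = math.sqrt(sum(v * v for v in count_a.values()))
    norm_b = math.sqrt(sum(v * v for v in count_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def length_difference(a, b):
    """Absolute difference in token count, normalized by the longer question."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return abs(len(a) - len(b)) / longest


def distance_features(question_1, question_2):
    """Return a feature vector of distances for one question pair.

    Hypothetical helper: these three measures stand in for the paper's 14,
    which the abstract does not enumerate.
    """
    a, b = tokenize(question_1), tokenize(question_2)
    return [jaccard_distance(a, b), cosine_distance(a, b), length_difference(a, b)]


features = distance_features(
    "where can i buy a bike in doha",
    "where can i rent a bike in doha",
)
```

None of these measures depends on language-specific resources beyond splitting on whitespace, which is in keeping with the abstract's emphasis on language-independent features.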
Original language | Danish |
---|---|
Title of host publication | Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing |
Publisher | Association for Computational Linguistics |
Publication date | 2018 |
Pages | 4810–4815 |
Publication status | Published - 2018 |
Event | 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium. Duration: 31 Oct 2018 → 4 Nov 2018 |
Conference
Conference | 2018 Conference on Empirical Methods in Natural Language Processing |
---|---|
Country/Territory | Belgium |
City | Brussels |
Period | 31/10/2018 → 04/11/2018 |