Deep multi-task learning with low level tasks supervised at lower layers

Anders Søgaard, Yoav Goldberg


Abstract

In all previous work on deep multi-task learning that we are aware of, all task supervision is on the same (outermost) layer. We present a multi-task learning architecture with deep bi-directional RNNs, where supervision for different tasks can happen at different layers. We present experiments in syntactic chunking and CCG supertagging, coupled with the additional task of POS tagging. We show that it is consistently better to have POS supervision at the innermost rather than the outermost layer. We argue that this is because "low-level" tasks are better kept at the lower layers, enabling the higher-level tasks to make use of the shared representation of the lower-level tasks. Finally, we also show how this architecture can be used for domain adaptation.
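
To make the architecture in the abstract concrete, the following is a minimal sketch of the idea: a stacked bi-directional RNN in which a low-level task (POS tagging) is supervised at the inner layer and a higher-level task (chunking) at the outer layer. This is written in PyTorch purely for illustration; the framework, LSTM cells, layer sizes, and all names are assumptions, not the authors' implementation.

```python
# Sketch: multi-task bi-LSTM with POS supervised at the inner layer
# and chunking supervised at the outer layer. Illustrative only.
import torch
import torch.nn as nn


class MultiTaskBiLSTM(nn.Module):
    def __init__(self, vocab_size, emb_dim, hidden_dim, n_pos, n_chunk):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Inner bi-directional layer: shared, and supervised for POS.
        self.inner = nn.LSTM(emb_dim, hidden_dim,
                             bidirectional=True, batch_first=True)
        # Outer bi-directional layer: builds on the inner states.
        self.outer = nn.LSTM(2 * hidden_dim, hidden_dim,
                             bidirectional=True, batch_first=True)
        self.pos_head = nn.Linear(2 * hidden_dim, n_pos)      # inner layer
        self.chunk_head = nn.Linear(2 * hidden_dim, n_chunk)  # outer layer

    def forward(self, tokens):
        h_inner, _ = self.inner(self.embed(tokens))
        h_outer, _ = self.outer(h_inner)
        # POS is predicted from the inner layer, chunking from the outer,
        # so the chunker can use the shared low-level representation.
        return self.pos_head(h_inner), self.chunk_head(h_outer)


# Dummy usage with made-up sizes and random data.
model = MultiTaskBiLSTM(vocab_size=10000, emb_dim=64,
                        hidden_dim=100, n_pos=17, n_chunk=23)
loss_fn = nn.CrossEntropyLoss()
tokens = torch.randint(0, 10000, (8, 20))
pos_gold = torch.randint(0, 17, (8, 20))
chunk_gold = torch.randint(0, 23, (8, 20))
pos_logits, chunk_logits = model(tokens)
# One simple training option is to sum the per-task losses.
loss = (loss_fn(pos_logits.reshape(-1, 17), pos_gold.reshape(-1)) +
        loss_fn(chunk_logits.reshape(-1, 23), chunk_gold.reshape(-1)))
```

The key design point is which layer each prediction head reads from: attaching the POS head to the inner layer's states, rather than the outermost ones, is what the abstract reports as consistently better.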

Original language: English
Title of host publication: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics
Number of pages: 5
Volume: 2
Publisher: Association for Computational Linguistics
Publication date: 2016
Pages: 231-235
ISBN (Electronic): 978-1-945626-01-2
Publication status: Published - 2016
Event: 54th Annual Meeting of the Association for Computational Linguistics - Berlin, Germany
Duration: 7 Aug 2016 - 12 Aug 2016
Conference number: 54

Conference

Conference: 54th Annual Meeting of the Association for Computational Linguistics
Number: 54
Country/Territory: Germany
City: Berlin
Period: 07/08/2016 - 12/08/2016
