Knowledge Distillation for Semi-supervised Domain Adaptation

Mauricio Orbes-Arteaga; Jorge Cardoso; Lauge Sørensen; Christian Igel; Sebastien Ourselin; Marc Modat; Mads Nielsen; Akshay Pai

doi:10.1007/978-3-030-32695-1_8

Corresponding author: Mauricio Orbes-Arteaga

6 Citations (Scopus)

Abstract

In the absence of sufficient data variation (e.g., scanner and protocol variability) in annotated data, deep neural networks (DNNs) tend to overfit during training. As a result, their performance is significantly lower on data from unseen sources than on data from the same source as the training data. Semi-supervised domain adaptation methods can alleviate this problem by tuning networks to new target domains without the need for annotated data from these domains. Adversarial domain adaptation (ADA) methods are a popular choice; they aim to train networks so that the generated features are domain agnostic. However, these methods require careful dataset-specific selection of hyperparameters, such as the complexity of the discriminator, to achieve reasonable performance. We propose to use knowledge distillation (KD), an efficient way of transferring knowledge between different DNNs, for semi-supervised domain adaptation of DNNs. It does not require dataset-specific hyperparameter tuning, making it generally applicable. The proposed method is compared to ADA for segmentation of white matter hyperintensities (WMH) in magnetic resonance imaging (MRI) scans generated by scanners that are not part of the training set. Compared with both the baseline DNN (trained on the source domain only, without any adaptation to the target domain) and ADA for semi-supervised domain adaptation, the proposed method achieves significantly higher WMH Dice scores.
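
The abstract does not spell out the training objective, so the following is only a minimal sketch of a generic, temperature-softened distillation loss, which may differ from the authors' formulation. It assumes a teacher network trained on the annotated source domain and a student network that is fitted to the teacher's soft predictions on unannotated target-domain data. The function names, the temperature value, and the example logits are all hypothetical and chosen for illustration.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean cross-entropy between teacher soft labels and student predictions.

    Assumption: both networks output per-voxel class logits of the same shape,
    and the teacher is kept fixed while the student is trained.
    """
    soft_targets = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    eps = 1e-12  # guards against log(0)
    per_voxel = -(soft_targets * np.log(student_probs + eps)).sum(axis=-1)
    return per_voxel.mean()


# Hypothetical logits for four unannotated target-domain voxels,
# two classes (background, WMH). No ground-truth labels are needed.
teacher = np.array([[2.0, -1.0], [0.5, 0.3], [-1.5, 2.5], [0.0, 0.0]])
student_far = -teacher  # a student that disagrees with the teacher

loss_match = distillation_loss(teacher, teacher)
loss_far = distillation_loss(student_far, teacher)
print(loss_match < loss_far)
```

The loss is smallest when the student reproduces the teacher's soft labels, which is what lets unannotated target scans contribute a training signal. In practice the same quantity would be computed inside a deep learning framework so that gradients flow to the student.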

Original language	English
Title of host publication	OR 2.0 Context-Aware Operating Theaters and Machine Learning in Clinical Neuroimaging - 2nd International Workshop, OR 2.0 2019, and 2nd International Workshop, MLCN 2019, Held in Conjunction with MICCAI 2019, Proceedings
Editors	Luping Zhou, Duygu Sarikaya, Seyed Mostafa Kia, Stefanie Speidel, Anand Malpani, Daniel Hashimoto, Mohamad Habes, Tommy Löfstedt, Kerstin Ritter, Hongzhi Wang
Number of pages	9
Publisher	Springer VS
Publication date	1 Jan 2019
Pages	68-76
ISBN (Print)	9783030326944
DOIs	https://doi.org/10.1007/978-3-030-32695-1_8
Publication status	Published - 1 Jan 2019
Event	2nd International Workshop on Context-Aware Surgical Theaters, OR 2.0 2019, and the 2nd International Workshop on Machine Learning in Clinical Neuroimaging, MLCN 2019, held in conjunction with the 22nd International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2019 - Shenzhen, China; Duration: 17 Oct 2019 → 17 Oct 2019

Conference

Conference	2nd International Workshop on Context-Aware Surgical Theaters, OR 2.0 2019, and the 2nd International Workshop on Machine Learning in Clinical Neuroimaging, MLCN 2019, held in conjunction with the 22nd International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2019
Country/Territory	China
City	Shenzhen
Period	17/10/2019 → 17/10/2019

Series	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	11796 LNCS
ISSN	0302-9743

Keywords

Domain adaptation
Knowledge distillation
Semi-supervised learning
White matter hyperintensities

Access to Document

10.1007/978-3-030-32695-1_8

Cite this

Orbes-Arteaga, M., Cardoso, J., Sørensen, L., Igel, C., Ourselin, S., Modat, M., Nielsen, M., & Pai, A. (2019). Knowledge Distillation for Semi-supervised Domain Adaptation. In L. Zhou, D. Sarikaya, S. M. Kia, S. Speidel, A. Malpani, D. Hashimoto, M. Habes, T. Löfstedt, K. Ritter, & H. Wang (Eds.), OR 2.0 Context-Aware Operating Theaters and Machine Learning in Clinical Neuroimaging - 2nd International Workshop, OR 2.0 2019, and 2nd International Workshop, MLCN 2019, Held in Conjunction with MICCAI 2019, Proceedings (pp. 68-76). Springer VS. https://doi.org/10.1007/978-3-030-32695-1_8

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

@inproceedings{d877b988a7c3436cb25ebfbf6c44e4c3,

title = "Knowledge Distillation for Semi-supervised Domain Adaptation",

abstract = "In the absence of sufficient data variation (e.g., scanner and protocol variability) in annotated data, deep neural networks (DNNs) tend to overfit during training. As a result, their performance is significantly lower on data from unseen sources than on data from the same source as the training data. Semi-supervised domain adaptation methods can alleviate this problem by tuning networks to new target domains without the need for annotated data from these domains. Adversarial domain adaptation (ADA) methods are a popular choice; they aim to train networks so that the generated features are domain agnostic. However, these methods require careful dataset-specific selection of hyperparameters, such as the complexity of the discriminator, to achieve reasonable performance. We propose to use knowledge distillation (KD), an efficient way of transferring knowledge between different DNNs, for semi-supervised domain adaptation of DNNs. It does not require dataset-specific hyperparameter tuning, making it generally applicable. The proposed method is compared to ADA for segmentation of white matter hyperintensities (WMH) in magnetic resonance imaging (MRI) scans generated by scanners that are not part of the training set. Compared with both the baseline DNN (trained on the source domain only, without any adaptation to the target domain) and ADA for semi-supervised domain adaptation, the proposed method achieves significantly higher WMH Dice scores.",

keywords = "Domain adaptation, Knowledge distillation, Semi-supervised learning, White matter hyperintensities",

author = "Mauricio Orbes-Arteaga and Jorge Cardoso and Lauge S{\o}rensen and Christian Igel and Sebastien Ourselin and Marc Modat and Mads Nielsen and Akshay Pai",

year = "2019",

month = jan,

day = "1",

doi = "10.1007/978-3-030-32695-1_8",

language = "English",

isbn = "9783030326944",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer VS",

pages = "68--76",

editor = "Luping Zhou and Duygu Sarikaya and Kia, {Seyed Mostafa} and Stefanie Speidel and Anand Malpani and Daniel Hashimoto and Mohamad Habes and Tommy L{\"o}fstedt and Kerstin Ritter and Hongzhi Wang",

booktitle = "OR 2.0 Context-Aware Operating Theaters and Machine Learning in Clinical Neuroimaging - 2nd International Workshop, OR 2.0 2019, and 2nd International Workshop, MLCN 2019, Held in Conjunction with MICCAI 2019, Proceedings",

note = "2nd International Workshop on Context-Aware Surgical Theaters, OR 2.0 2019, and the 2nd International Workshop on Machine Learning in Clinical Neuroimaging, MLCN 2019, held in conjunction with the 22nd International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2019 ; Conference date: 17-10-2019 Through 17-10-2019",

}

TY - GEN

T1 - Knowledge Distillation for Semi-supervised Domain Adaptation

AU - Orbes-Arteaga, Mauricio

AU - Cardoso, Jorge

AU - Sørensen, Lauge

AU - Igel, Christian

AU - Ourselin, Sebastien

AU - Modat, Marc

AU - Nielsen, Mads

AU - Pai, Akshay

PY - 2019/1/1

Y1 - 2019/1/1

AB - In the absence of sufficient data variation (e.g., scanner and protocol variability) in annotated data, deep neural networks (DNNs) tend to overfit during training. As a result, their performance is significantly lower on data from unseen sources than on data from the same source as the training data. Semi-supervised domain adaptation methods can alleviate this problem by tuning networks to new target domains without the need for annotated data from these domains. Adversarial domain adaptation (ADA) methods are a popular choice; they aim to train networks so that the generated features are domain agnostic. However, these methods require careful dataset-specific selection of hyperparameters, such as the complexity of the discriminator, to achieve reasonable performance. We propose to use knowledge distillation (KD), an efficient way of transferring knowledge between different DNNs, for semi-supervised domain adaptation of DNNs. It does not require dataset-specific hyperparameter tuning, making it generally applicable. The proposed method is compared to ADA for segmentation of white matter hyperintensities (WMH) in magnetic resonance imaging (MRI) scans generated by scanners that are not part of the training set. Compared with both the baseline DNN (trained on the source domain only, without any adaptation to the target domain) and ADA for semi-supervised domain adaptation, the proposed method achieves significantly higher WMH Dice scores.

KW - Domain adaptation

KW - Knowledge distillation

KW - Semi-supervised learning

KW - White matter hyperintensities

UR - http://www.scopus.com/inward/record.url?scp=85075551780&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-32695-1_8

DO - 10.1007/978-3-030-32695-1_8

M3 - Article in proceedings

AN - SCOPUS:85075551780

SN - 9783030326944

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 68

EP - 76

BT - OR 2.0 Context-Aware Operating Theaters and Machine Learning in Clinical Neuroimaging - 2nd International Workshop, OR 2.0 2019, and 2nd International Workshop, MLCN 2019, Held in Conjunction with MICCAI 2019, Proceedings

A2 - Zhou, Luping

A2 - Sarikaya, Duygu

A2 - Kia, Seyed Mostafa

A2 - Speidel, Stefanie

A2 - Malpani, Anand

A2 - Hashimoto, Daniel

A2 - Habes, Mohamad

A2 - Löfstedt, Tommy

A2 - Ritter, Kerstin

A2 - Wang, Hongzhi

PB - Springer VS

T2 - 2nd International Workshop on Context-Aware Surgical Theaters, OR 2.0 2019, and the 2nd International Workshop on Machine Learning in Clinical Neuroimaging, MLCN 2019, held in conjunction with the 22nd International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2019

Y2 - 17 October 2019 through 17 October 2019

ER -
