TY - CONF
T1 - Robust Active Label Correction
AU - Kremer, Jan
AU - Sha, Fei
AU - Igel, Christian
PY - 2018
Y1 - 2018
AB - Active label correction addresses the problem of learning from input data for which noisy labels are available (e.g., from imprecise measurements or crowd-sourcing) and each true label can be obtained at a significant cost (e.g., through additional measurements or human experts). To minimize these costs, we are interested in identifying training patterns for which knowing the true labels maximally improves the learning performance. We approximate the true label noise by a model that learns the aspects of the noise that are class-conditional (i.e., independent of the input given the observed label). To select labels for correction, we adopt the active learning strategy of maximizing the expected model change. We consider the change in regularized empirical risk functionals that use different pointwise loss functions for patterns with noisy and true labels, respectively. Different loss functions for the noisy data lead to different active label correction algorithms. If loss functions consider the label noise rates, these rates are estimated during learning, where importance weighting compensates for the sampling bias. We show empirically that viewing the true label as a latent variable and computing the maximum likelihood estimate of the model parameters performs well across all considered problems. A maximum a posteriori estimate of the model parameters was beneficial in most test cases. An image classification experiment using convolutional neural networks demonstrates that the class-conditional noise model, which can be learned efficiently, can guide re-labeling in real-world applications.
M3 - Article in proceedings
VL - 84
T3 - Proceedings of Machine Learning Research
SP - 308
EP - 316
BT - Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics
PB - PMLR
T2 - 21st International Conference on Artificial Intelligence and Statistics
Y2 - 9 April 2018 through 11 April 2018
ER -