Autoencoding beyond pixels using a learned similarity metric

Anders Boesen Lindbo Larsen, Søren Kaae Sønderby, Hugo Larochelle, Ole Winther

183 Citations (Scopus)

Abstract

We present an autoencoder that leverages learned representations to better measure similarities in data space. By combining a variational autoencoder (VAE) with a generative adversarial network (GAN), we can use learned feature representations in the GAN discriminator as the basis for the VAE reconstruction objective. We thereby replace element-wise errors with feature-wise errors, which better capture the data distribution while offering invariance to transformations such as translation. We apply our method to images of faces and show that it outperforms VAEs with element-wise similarity measures in terms of visual fidelity. Moreover, we show that the method learns an embedding in which high-level abstract visual features (e.g. wearing glasses) can be modified using simple arithmetic.
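The contrast between element-wise and feature-wise errors can be illustrated with a small NumPy sketch. This is not the authors' implementation: the paper uses features learned by a GAN discriminator, whereas the hypothetical `toy_features` function below uses fixed random filters followed by global max pooling, chosen only because it makes the translation invariance easy to see.

```python
import numpy as np

def elementwise_error(x, y):
    """Mean squared error computed pixel by pixel."""
    return float(np.mean((x - y) ** 2))

def toy_features(img, filters):
    """Hypothetical stand-in for a discriminator hidden layer (assumption):
    valid cross-correlation with each filter, ReLU, then global max pooling."""
    k = filters.shape[1]
    # All k-by-k patches of the image, shape (H-k+1, W-k+1, k, k).
    windows = np.lib.stride_tricks.sliding_window_view(img, (k, k))
    # One response map per filter, shape (num_filters, H-k+1, W-k+1).
    responses = np.einsum("ijkl,fkl->fij", windows, filters)
    responses = np.maximum(responses, 0.0)
    # Global max pooling discards where in the image each response occurred.
    return responses.max(axis=(1, 2))

def featurewise_error(x, y, filters):
    """Mean squared error computed between feature vectors, not pixels."""
    fx = toy_features(x, filters)
    fy = toy_features(y, filters)
    return float(np.mean((fx - fy) ** 2))

rng = np.random.default_rng(0)
filters = rng.standard_normal((8, 3, 3))  # random filters, not learned

# A 16x16 image containing a 4x4 bright square, and a copy shifted 2 pixels right.
img = np.zeros((16, 16))
img[5:9, 5:9] = 1.0
shifted = np.roll(img, shift=2, axis=1)

pixel_err = elementwise_error(img, shifted)
feature_err = featurewise_error(img, shifted, filters)
print(f"element-wise error: {pixel_err:.4f}")
print(f"feature-wise error: {feature_err:.4f}")
```

In this toy setting the element-wise error is nonzero (16 of 256 pixels differ by 1, giving 0.0625), while the feature-wise error is zero up to floating-point precision, because the pooled features do not depend on where the square sits. Learned discriminator features would be only approximately invariant, not exactly so.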

Original language: English
Title of host publication: Proceedings of The 33rd International Conference on Machine Learning
Editors: Maria Florina Balcan, Kilian Q. Weinberger
Number of pages: 9
Publication date: 2016
Pages: 1558–1566
ISBN (Electronic): 978-151082900-8
Publication status: Published - 2016
Event: 33rd International Conference on Machine Learning - New York, United States
Duration: 19 Jun 2016 to 24 Jun 2016
Conference number: 33

Conference

Conference: 33rd International Conference on Machine Learning
Number: 33
Country/Territory: United States
City: New York
Period: 19/06/2016 to 24/06/2016
Series: JMLR: Workshop and Conference Proceedings
Volume: 48
