Approximation properties of DBNs with binary hidden units and real-valued visible units

Oswin Krause, Asja Fischer, Tobias Glasmachers, Christian Igel

Abstract

Deep belief networks (DBNs) can approximate any distribution over fixed-length binary vectors. However, DBNs are frequently applied to model real-valued data, and so far little is known about their representational power in this case. We analyze the approximation properties of DBNs with two layers of binary hidden units and visible units with conditional distributions from the exponential family. It is shown that these DBNs can, under mild assumptions, model any additive mixture of distributions from the exponential family with independent variables. An arbitrarily good approximation in terms of Kullback-Leibler divergence of an m-dimensional mixture distribution with n components can be achieved by a DBN with m visible variables and n and n + 1 hidden variables in the first and second hidden layer, respectively. Furthermore, relevant infinite mixtures can be approximated arbitrarily well by a DBN with a finite number of neurons. This includes the important special case of an infinite mixture of Gaussian distributions with fixed variance restricted to a compact domain, which in turn can approximate any strictly positive density over this domain.
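To make the sizes in the abstract concrete, the following minimal Python sketch records the stated architecture (m visible units, n and n + 1 hidden units) and evaluates the kind of target it refers to: an additive mixture of Gaussians with independent variables and a shared, fixed variance. The function names `dbn_layer_sizes` and `gaussian_mixture_density`, as well as the example weights and means, are hypothetical and chosen only for illustration. The sketch does not construct or train a DBN and is not the authors' method.

```python
import math


def dbn_layer_sizes(m, n):
    """Layer sizes stated in the abstract for an m-dimensional mixture with n components."""
    if m < 1 or n < 1:
        raise ValueError("m and n must be positive integers")
    return {"visible": m, "hidden_1": n, "hidden_2": n + 1}


def gaussian_mixture_density(x, weights, means, sigma):
    """Density of an additive mixture of Gaussians with independent variables.

    x:       sequence of m floats (the point to evaluate)
    weights: n mixing weights that sum to 1
    means:   n sequences of m floats (one mean vector per component)
    sigma:   standard deviation shared by all components and dimensions
    """
    m = len(x)
    # Normalizer of an m-dimensional Gaussian with independent coordinates.
    norm = (sigma * math.sqrt(2.0 * math.pi)) ** m
    total = 0.0
    for w, mu in zip(weights, means):
        sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
        total += w * math.exp(-sq_dist / (2.0 * sigma ** 2)) / norm
    return total


# Illustrative values (assumptions, not taken from the paper).
sizes = dbn_layer_sizes(m=2, n=3)
weights = [0.5, 0.3, 0.2]
means = [(0.0, 0.0), (1.0, 1.0), (-1.0, 2.0)]
density_at_origin = gaussian_mixture_density((0.0, 0.0), weights, means, sigma=0.5)
```

For a two-dimensional mixture with three components, `sizes` holds 2 visible units, 3 units in the first hidden layer, and 4 in the second, matching the counts given in the abstract.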

Original language: English
Title of host publication: Proceedings of the 30th International Conference on Machine Learning
Editors: Sanjoy Dasgupta, David McAllester
Number of pages: 8
Publication date: 2013
Pages: 419-426
Publication status: Published - 2013
Event: 30th International Conference on Machine Learning - Atlanta, United States
Duration: 16 Jun 2013 to 21 Jun 2013
Conference number: 30

Conference

Conference: 30th International Conference on Machine Learning
Number: 30
Country/Territory: United States
City: Atlanta
Period: 16/06/2013 to 21/06/2013
Series: JMLR: Workshop and Conference Proceedings
Volume: 28
