The flip-the-state transition operator for restricted Boltzmann machines

Kai Brügge; Asja Fischer; Christian Igel

doi:10.1007/s10994-013-5390-3


Department of Computer Science

14 citations (Scopus)

Abstract

Most learning and sampling algorithms for restricted Boltzmann machines (RBMs) rely on Markov chain Monte Carlo (MCMC) methods using Gibbs sampling. The most prominent examples are Contrastive Divergence learning (CD) and its variants as well as Parallel Tempering (PT). The performance of these methods strongly depends on the mixing properties of the Gibbs chain. We propose a Metropolis-type MCMC algorithm relying on a transition operator maximizing the probability of state changes. It is shown that the operator induces an irreducible, aperiodic, and hence properly converging Markov chain, also for the typically used periodic update schemes. The transition operator can replace Gibbs sampling in RBM learning algorithms without producing computational overhead. It is shown empirically that this leads to faster mixing and in turn to more accurate learning.

Original language	English
Journal	Machine Learning
Volume	93
Issue number	1
Pages (from-to)	53-69
Number of pages	17
ISSN	0885-6125
DOI	https://doi.org/10.1007/s10994-013-5390-3
Status	Published - Oct 2013


Citation formats

@article{39be653c57774db9aafdb7e15c36a101,

title = "The flip-the-state transition operator for restricted Boltzmann machines",

abstract = "Most learning and sampling algorithms for restricted Boltzmann machines (RBMs) rely on Markov chain Monte Carlo (MCMC) methods using Gibbs sampling. The most prominent examples are Contrastive Divergence learning (CD) and its variants as well as Parallel Tempering (PT). The performance of these methods strongly depends on the mixing properties of the Gibbs chain. We propose a Metropolis-type MCMC algorithm relying on a transition operator maximizing the probability of state changes. It is shown that the operator induces an irreducible, aperiodic, and hence properly converging Markov chain, also for the typically used periodic update schemes. The transition operator can replace Gibbs sampling in RBM learning algorithms without producing computational overhead. It is shown empirically that this leads to faster mixing and in turn to more accurate learning.",

author = "Kai Br{\"u}gge and Asja Fischer and Christian Igel",

year = "2013",

month = oct,

doi = "10.1007/s10994-013-5390-3",

language = "English",

volume = "93",

pages = "53--69",

journal = "Machine Learning",

issn = "0885-6125",

publisher = "Springer",

number = "1",

}

TY - JOUR

T1 - The flip-the-state transition operator for restricted Boltzmann machines

AU - Brügge, Kai

AU - Fischer, Asja

AU - Igel, Christian

PY - 2013/10

Y1 - 2013/10

N2 - Most learning and sampling algorithms for restricted Boltzmann machines (RBMs) rely on Markov chain Monte Carlo (MCMC) methods using Gibbs sampling. The most prominent examples are Contrastive Divergence learning (CD) and its variants as well as Parallel Tempering (PT). The performance of these methods strongly depends on the mixing properties of the Gibbs chain. We propose a Metropolis-type MCMC algorithm relying on a transition operator maximizing the probability of state changes. It is shown that the operator induces an irreducible, aperiodic, and hence properly converging Markov chain, also for the typically used periodic update schemes. The transition operator can replace Gibbs sampling in RBM learning algorithms without producing computational overhead. It is shown empirically that this leads to faster mixing and in turn to more accurate learning.

AB - Most learning and sampling algorithms for restricted Boltzmann machines (RBMs) rely on Markov chain Monte Carlo (MCMC) methods using Gibbs sampling. The most prominent examples are Contrastive Divergence learning (CD) and its variants as well as Parallel Tempering (PT). The performance of these methods strongly depends on the mixing properties of the Gibbs chain. We propose a Metropolis-type MCMC algorithm relying on a transition operator maximizing the probability of state changes. It is shown that the operator induces an irreducible, aperiodic, and hence properly converging Markov chain, also for the typically used periodic update schemes. The transition operator can replace Gibbs sampling in RBM learning algorithms without producing computational overhead. It is shown empirically that this leads to faster mixing and in turn to more accurate learning.

U2 - 10.1007/s10994-013-5390-3

DO - 10.1007/s10994-013-5390-3

M3 - Journal article

SN - 0885-6125

VL - 93

SP - 53

EP - 69

JO - Machine Learning

JF - Machine Learning

IS - 1

ER -
