Training restricted Boltzmann machines: an introduction

Asja Fischer; Christian Igel

doi:10.1016/j.patcog.2013.05.025

Department of Computer Science

286 Citations (Scopus)

Abstract

Restricted Boltzmann machines (RBMs) are probabilistic graphical models that can be interpreted as stochastic neural networks. They have attracted much attention as building blocks for the multi-layer learning systems called deep belief networks, and variants and extensions of RBMs have found application in a wide range of pattern recognition tasks. This tutorial introduces RBMs from the viewpoint of Markov random fields, starting with the required concepts of undirected graphical models. Different learning algorithms for RBMs, including contrastive divergence learning and parallel tempering, are discussed. As sampling from RBMs, and therefore also most of their learning algorithms, are based on Markov chain Monte Carlo (MCMC) methods, an introduction to Markov chains and MCMC techniques is provided. Experiments demonstrate relevant aspects of RBM training.

Original language	Danish
Journal	Pattern Recognition
Volume	47
Issue number	1
Pages (from-to)	25-39
Number of pages	15
ISSN	0031-3203
DOIs	https://doi.org/10.1016/j.patcog.2013.05.025
Publication status	Published - Jan 2014

Cite this

@article{a90d6fb021a645ee908ad3f7905a9271,
  title = "Training restricted Boltzmann machines: an introduction",
  abstract = "Restricted Boltzmann machines (RBMs) are probabilistic graphical models that can be interpreted as stochastic neural networks. They have attracted much attention as building blocks for the multi-layer learning systems called deep belief networks, and variants and extensions of RBMs have found application in a wide range of pattern recognition tasks. This tutorial introduces RBMs from the viewpoint of Markov random fields, starting with the required concepts of undirected graphical models. Different learning algorithms for RBMs, including contrastive divergence learning and parallel tempering, are discussed. As sampling from RBMs, and therefore also most of their learning algorithms, are based on Markov chain Monte Carlo (MCMC) methods, an introduction to Markov chains and MCMC techniques is provided. Experiments demonstrate relevant aspects of RBM training.",
  author = "Asja Fischer and Christian Igel",
  year = "2014",
  month = jan,
  doi = "10.1016/j.patcog.2013.05.025",
  language = "Danish",
  volume = "47",
  pages = "25--39",
  journal = "Pattern Recognition",
  issn = "0031-3203",
  publisher = "Elsevier",
  number = "1",
}

TY  - JOUR
T1  - Training restricted Boltzmann machines
T2  - an introduction
AU  - Fischer, Asja
AU  - Igel, Christian
PY  - 2014/1
Y1  - 2014/1
AB  - Restricted Boltzmann machines (RBMs) are probabilistic graphical models that can be interpreted as stochastic neural networks. They have attracted much attention as building blocks for the multi-layer learning systems called deep belief networks, and variants and extensions of RBMs have found application in a wide range of pattern recognition tasks. This tutorial introduces RBMs from the viewpoint of Markov random fields, starting with the required concepts of undirected graphical models. Different learning algorithms for RBMs, including contrastive divergence learning and parallel tempering, are discussed. As sampling from RBMs, and therefore also most of their learning algorithms, are based on Markov chain Monte Carlo (MCMC) methods, an introduction to Markov chains and MCMC techniques is provided. Experiments demonstrate relevant aspects of RBM training.
DO  - 10.1016/j.patcog.2013.05.025
M3  - Journal article
SN  - 0031-3203
VL  - 47
SP  - 25
EP  - 39
JO  - Pattern Recognition
JF  - Pattern Recognition
IS  - 1
ER  - 