Deep Neural Networks - Notes for Lesson 12

Restricted Boltzmann machines (RBMs)

This module deals with Boltzmann machine learning.

Categories: deep learning, neural networks, notes, RBM, restricted Boltzmann machine, coursera
Author: Oren Bochman

Published: Wednesday, November 1, 2017

{{< pdf lec12.pdf width="100%" class="ppSlide" >}}

Lecture 12a: Boltzmann machine learning

Clarification: The energy is linear in the weights, but quadratic in the states. What matters for this argument is just that it’s linear in the weights.
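To make this concrete, here is a minimal sketch of the energy function, assuming the standard notation of binary states $s_i$, biases $b_i$, and weights $w_{ij}$ used elsewhere in the course:

$$
E(\mathbf{s}) = -\sum_i s_i b_i - \sum_{i<j} s_i s_j w_{ij}
$$

Each weight $w_{ij}$ appears once and to the first power, so $\partial E / \partial w_{ij} = -s_i s_j$ does not depend on the weights themselves; the states enter only through the quadratic products $s_i s_j$.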

Lecture 12b: More efficient ways to get the statistics

Lecture 12c: Restricted Boltzmann Machines

Here, a “particle” is a configuration. These particles are moving around the configuration space, which, when considered with the energy function, is our mountain landscape.

It’s called a reconstruction because it’s based on the visible vector at t=0 (via the hidden vector at t=0). It will, typically, be quite similar to the visible vector at t=0.

A “fantasy” configuration is one drawn from the model distribution by running a Markov Chain for a long time.

The word “fantasy” is chosen as part of the analogy between a Boltzmann Machine and a brain that has learned several memories.
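As a rough illustration (this is not code from the lecture; the toy weights and layer sizes are arbitrary), here is a minimal NumPy sketch of how a reconstruction is produced by one step of alternating Gibbs sampling, and how a fantasy configuration is produced by running the chain for a long time:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_hidden(v, W, b_h):
    """p(h_j = 1 | v) = sigmoid(b_j + sum_i v_i w_ij), then sample a binary state."""
    p_h = sigmoid(v @ W + b_h)
    return (rng.random(p_h.shape) < p_h).astype(float), p_h

def sample_visible(h, W, b_v):
    """p(v_i = 1 | h) = sigmoid(a_i + sum_j h_j w_ij), then sample a binary state."""
    p_v = sigmoid(h @ W.T + b_v)
    return (rng.random(p_v.shape) < p_v).astype(float), p_v

# Toy RBM: 6 visible units, 3 hidden units (weights are arbitrary here).
W = rng.normal(0, 0.1, size=(6, 3))
b_v = np.zeros(6)
b_h = np.zeros(3)

v0 = rng.integers(0, 2, size=6).astype(float)  # data (visible vector at t=0)
h0, _ = sample_hidden(v0, W, b_h)              # hidden vector at t=0
v1, _ = sample_visible(h0, W, b_v)             # the "reconstruction" at t=1

# A "fantasy" configuration: keep alternating for many steps so the chain
# (approximately) reaches the model's equilibrium distribution.
v, h = v0, h0
for _ in range(1000):
    v, _ = sample_visible(h, W, b_v)
    h, _ = sample_hidden(v, W, b_h)
fantasy_v, fantasy_h = v, h
```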

Lecture 12d: An example of RBM learning

This is not an easy video. It assumes a fairly thorough understanding of what an RBM does, and in particular of the “reconstruction” concept from the previous video, so be sure you understand video 12c quite well before proceeding with 12d.

The first slide is about an RBM, but uses many of the same phrases that we previously used to talk about deterministic feedforward networks.

The hidden units are described as feature detectors, or “features” for short.

The weights are shown as arrows, even though a Boltzmann Machine has undirected connections.

That’s because calculating the probability of the hidden units turning on, given the state of the visible units, is exactly like calculating the real-valued state of a logistic hidden unit, in a deterministic feedforward network.

However, in a Boltzmann Machine, that number is then treated as a probability of turning on, and an actual state of 1 or 0 is chosen, randomly, based on that probability. We’ll make further use of that similarity next week.
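Concretely, in standard RBM notation (not taken verbatim from the slide), the turn-on probability of hidden unit $j$ is the familiar logistic expression

$$
p(h_j = 1 \mid \mathbf{v}) = \sigma\Big(b_j + \sum_i v_i w_{ij}\Big), \qquad \sigma(x) = \frac{1}{1 + e^{-x}},
$$

and the RBM then samples $h_j \in \{0, 1\}$ with that probability, whereas a deterministic feedforward network would simply output the real value $\sigma(\cdot)$ as the unit's state.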

At 2:30: the procedure for changing energies that was just explained is a restatement (in different words) of the Contrastive Divergence story from the previous video. If you didn’t fully realize that, review it.
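For reference, the CD-1 weight update that this energy-change story restates is, in the usual notation,

$$
\Delta w_{ij} = \varepsilon \left( \langle v_i h_j \rangle_{t=0} - \langle v_i h_j \rangle_{t=1} \right),
$$

i.e. lower the energy of the (data, hidden) configuration at $t=0$ and raise the energy of the reconstruction-based configuration at $t=1$.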

Lecture 12e: RBMs for collaborative filtering

Reuse

CC SA BY-NC-ND

Citation

BibTeX citation:
@online{bochman2017,
  author = {Bochman, Oren},
  title = {Deep {Neural} {Networks} - {Notes} for {Lesson} 12},
  date = {2017-11-01},
  url = {https://orenbochman.github.io/notes/dnn/dnn-12/l_12.html},
  langid = {en}
}
For attribution, please cite this work as:
Bochman, Oren. 2017. “Deep Neural Networks - Notes for Lesson 12.” November 1, 2017. https://orenbochman.github.io/notes/dnn/dnn-12/l_12.html.